Proceedings of the Annual Conference of JSAI
Online ISSN : 2758-7347
34th (2020)
Session ID : 1Q3-GS-11-03
Conference information

Multimodal Learning by Interaction between Probabilistic and Deep Generative Models
*Ryo KUNIYASU, Tomoaki NAKAMURA, Takayuki NAGAI, Tadahiro TANIGUCHI
Abstract

To artificially realize human-like intelligence, robots require large-scale models that allow them to understand their environment using the multimodal information obtained from the various sensors installed on them. To this end, we have proposed models that enable robots to acquire language and concepts by classifying such multimodal information. These models learn the relationships between the features extracted from each modality in an unsupervised manner, based on multimodal latent Dirichlet allocation (MLDA). However, this learning is not fully unsupervised, because the feature extraction involves supervised learning; moreover, the observations themselves cannot be generated, because the feature extraction is irreversible. In this study, we therefore propose the multinomial variational autoencoder (MNVAE) and construct a model that integrates the MNVAE with MLDA using the Symbol Emergence in Robotics tool KIT (Serket). Using the integrated model, we classify the multimodal information of images and words obtained by a robot, and demonstrate that a latent space suitable for classification can be learned and that images can be generated from words.
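The key idea above is that a VAE with a multinomial latent yields discrete, count-like representations that can feed directly into a bag-of-words model such as MLDA. The abstract does not give the architecture, so the following is only a minimal forward-pass sketch under assumed dimensions and linear encoder/decoder weights (`W_enc`, `W_dec`, `n_draws` are all hypothetical names and choices, not the authors' implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    # Numerically stable softmax over the last axis
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

class MNVAESketch:
    """Illustrative multinomial-latent VAE forward pass (not the paper's model).

    encode : observation -> multinomial parameters over latent categories
    sample : draw count vector z ~ Multinomial(n_draws, theta); z is a
             discrete "bag-of-features" usable as MLDA input
    decode : count vector -> reconstructed observation distribution
    """
    def __init__(self, obs_dim, latent_dim, n_draws=100):
        self.W_enc = rng.normal(0.0, 0.1, (obs_dim, latent_dim))
        self.W_dec = rng.normal(0.0, 0.1, (latent_dim, obs_dim))
        self.n_draws = n_draws  # total count of the multinomial latent

    def encode(self, x):
        return softmax(x @ self.W_enc)

    def sample_latent(self, theta):
        return rng.multinomial(self.n_draws, theta)

    def decode(self, z):
        # Normalize counts back to proportions before decoding
        return softmax((z / self.n_draws) @ self.W_dec)

# One pass: image-like observation -> discrete latent counts -> reconstruction
model = MNVAESketch(obs_dim=64, latent_dim=10)
x = rng.random(64)
theta = model.encode(x)     # multinomial parameters, sums to 1
z = model.sample_latent(theta)  # integer counts, sums to n_draws
x_rec = model.decode(z)
```

In a Serket-style integration, the count vector `z` would be exchanged with an MLDA module as if it were a word-count observation, which is what makes the two generative models composable.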

© 2020 The Japanese Society for Artificial Intelligence