Proceedings of the Annual Conference of JSAI
Online ISSN : 2758-7347
34th (2020)
Session ID : 1Q3-GS-11-03
Conference information

Multimodal Learning by Interaction between Probabilistic and Deep Generative Models
*Ryo KUNIYASU, Tomoaki NAKAMURA, Takayuki NAGAI, Tadahiro TANIGUCHI
Abstract

To artificially realize human-like intelligence, robots require large-scale models that allow them to understand their environment using the multimodal information obtained from the various sensors installed on them. To this end, we have proposed models that enable robots to acquire language and concepts by classifying such multimodal information. These models learn the relationships between the features extracted from each modality in an unsupervised manner, based on multimodal latent Dirichlet allocation (MLDA). However, this learning is not fully unsupervised, because the feature extraction involves supervised learning; moreover, the observations themselves cannot be generated, because the feature extraction is irreversible. In this study, we therefore propose the multinomial variational autoencoder (MNVAE) and construct a model that integrates the MNVAE with MLDA using the Symbol Emergence in Robotics tool KIT (Serket). Using the integrated model, we classify the multimodal information of images and words obtained by a robot, and demonstrate that a latent space suitable for classification can be learned and that images can be generated from words.
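The key idea above is that a VAE with a multinomial latent yields discrete, count-like representations that can feed directly into a bag-of-words model such as MLDA. The abstract does not give the architecture, so the following is only a minimal forward-pass sketch under assumed dimensions and linear encoder/decoder weights (`W_enc`, `W_dec`, `n_draws` are all hypothetical names and choices, not the authors' implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    # Numerically stable softmax over the last axis
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

class MNVAESketch:
    """Illustrative multinomial-latent VAE forward pass (not the paper's model).

    encode : observation -> multinomial parameters over latent categories
    sample : draw count vector z ~ Multinomial(n_draws, theta); z is a
             discrete "bag-of-features" usable as MLDA input
    decode : count vector -> reconstructed observation distribution
    """
    def __init__(self, obs_dim, latent_dim, n_draws=100):
        self.W_enc = rng.normal(0.0, 0.1, (obs_dim, latent_dim))
        self.W_dec = rng.normal(0.0, 0.1, (latent_dim, obs_dim))
        self.n_draws = n_draws  # total count of the multinomial latent

    def encode(self, x):
        return softmax(x @ self.W_enc)

    def sample_latent(self, theta):
        return rng.multinomial(self.n_draws, theta)

    def decode(self, z):
        # Normalize counts back to proportions before decoding
        return softmax((z / self.n_draws) @ self.W_dec)

# One pass: image-like observation -> discrete latent counts -> reconstruction
model = MNVAESketch(obs_dim=64, latent_dim=10)
x = rng.random(64)
theta = model.encode(x)     # multinomial parameters, sums to 1
z = model.sample_latent(theta)  # integer counts, sums to n_draws
x_rec = model.decode(z)
```

In a Serket-style integration, the count vector `z` would be exchanged with an MLDA module as if it were a word-count observation, which is what makes the two generative models composable.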

© 2020 The Japanese Society for Artificial Intelligence