Cross-Situational Learning of Word Meanings through Understanding Attributes via Disentanglement

Yuta Matsui; Akira Taniguchi; Yoshinobu Hagiwara; Tadahiro Taniguchi

doi:10.11517/pjsai.JSAI2023.0_4H2OS6a04

Abstract

This paper presents a computational model that mimics human word learning through cross-situational learning. Humans acquire word meanings by forming categories from observed attributes such as color and shape. The proposed model learns to understand attributes in images and to associate attribute categories with words. To achieve this, we combine CSL-PGM, which performs cross-situational learning, with β-VAE, which enables unsupervised disentanglement of attributes. In our experiments, we trained the model on a dataset of images with five attributes paired with word sequences. The model achieved an attribute comprehension rate of 99.9% for each word. It also outperformed existing multimodal generative models, achieving an 87.0% correct response rate when inferring images from word sequences.
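
For readers unfamiliar with β-VAE, the following is a minimal sketch of the standard β-VAE training loss, not the authors' implementation. It assumes a diagonal Gaussian posterior, a standard normal prior, and a squared-error reconstruction term; the function names and the default `beta=4.0` are illustrative choices and do not come from the paper.

```python
import numpy as np


def gaussian_kl(mu, log_var):
    """KL(N(mu, diag(exp(log_var))) || N(0, I)), summed over latent dimensions."""
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var, axis=-1)


def beta_vae_loss(x, x_recon, mu, log_var, beta=4.0):
    """Per-sample loss: squared-error reconstruction plus beta-weighted KL.

    beta > 1 penalizes the KL term more strongly than a plain VAE,
    which is the pressure that encourages disentangled latent dimensions.
    The value 4.0 is an assumed placeholder, not taken from the paper.
    """
    recon = np.sum((x - x_recon) ** 2, axis=-1)
    return recon + beta * gaussian_kl(mu, log_var)


if __name__ == "__main__":
    x = np.array([[0.2, 0.8, 0.5]])
    x_recon = np.array([[0.2, 0.8, 0.5]])  # perfect reconstruction
    mu = np.array([[1.0, 0.0]])
    log_var = np.zeros((1, 2))
    print(beta_vae_loss(x, x_recon, mu, log_var))
```

With `beta=1.0` this reduces to the ordinary VAE loss under the same assumptions.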
