Proceedings of the Annual Conference of JSAI
Online ISSN : 2758-7347
32nd (2018)
Session ID : 3L2-05

Embedding and retrieval of images and text data using probability distribution
*Kenta HAMA, Takashi MATSUBARA, Kuniaki UEHARA
CONFERENCE PROCEEDINGS FREE ACCESS

Abstract

Multimodal data, including images, sound, and text, is accumulating on the Internet. A general-purpose data representation is expected to support tasks such as discrimination, generation, and retrieval across datasets of various modalities. The key idea for acquiring such a representation is to embed each data point from the data space of each modality as a point in a common space. However, when data is embedded as a point, it becomes difficult to represent the ambiguity of its meaning and the inclusion relations among data. The representation of a data point does not necessarily need to be a point. In this study, we embed images and text as normal distributions in a common space. This improves the performance of image retrieval.
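
As a rough illustration of why a distribution can express inclusion where a point cannot, the sketch below scores pairs of diagonal-covariance normal distributions with the Kullback-Leibler divergence. The abstract does not state which similarity measure, covariance structure, or encoder the authors use, so the KL divergence, the diagonal covariance, the function names (kl_diag_gaussians, rank_by_kl), and the 2-D embeddings are all assumptions made for illustration only.

```python
import math


def kl_diag_gaussians(mu_p, var_p, mu_q, var_q):
    """KL(p || q) for two normal distributions with diagonal covariance.

    Each argument is a list of per-dimension means or variances.
    """
    total = 0.0
    for mp, vp, mq, vq in zip(mu_p, var_p, mu_q, var_q):
        total += vp / vq + (mq - mp) ** 2 / vq - 1.0 + math.log(vq / vp)
    return 0.5 * total


def rank_by_kl(query, candidates):
    """Sort candidate names by KL(candidate || query), smallest first.

    This scoring rule is an assumption, not the method from the paper.
    """
    q_mu, q_var = query
    scored = [
        (kl_diag_gaussians(mu, var, q_mu, q_var), name)
        for name, (mu, var) in candidates.items()
    ]
    return [name for _, name in sorted(scored)]


# Hypothetical embeddings as (mean, variance) pairs in a 2-D common space.
text_animal = ([0.0, 0.0], [4.0, 4.0])    # broad: an ambiguous, general concept
image_dog = ([0.5, 0.0], [0.25, 0.25])    # narrow, lying inside the broad one
image_car = ([6.0, 6.0], [0.25, 0.25])    # narrow, far from the broad one

ranking = rank_by_kl(text_animal, {"image_dog": image_dog, "image_car": image_car})
print(ranking)
```

With these made-up values, the divergence from the narrow distribution to the broad one is much smaller than in the reverse direction. That asymmetry is what lets a distribution-based embedding encode "A is included in B", and the variance gives a handle on ambiguity that a single point lacks.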

© 2018 The Japanese Society for Artificial Intelligence