Proceedings of the Annual Conference of JSAI
Online ISSN : 2758-7347
36th (2022)
Session ID : 2M1-OS-19a-03

A Deep Generative Model for Extracting Shared and Private Latent Representations from Multimodal Data
*Kaito KUSUMOTO, Shingo MURATA
Abstract

Representation learning of multimodal data has the potential to reveal structure shared across modalities. The objective of this study is to develop a computational framework that learns to extract latent representations from multimodal data by using a deep generative model. A particular modality is considered to hold low-dimensional latent representations; however, these representations are not always fully shared with another modality. We therefore assume that each modality holds both shared and private latent representations. Under this assumption, we propose a deep generative model that learns to extract these different latent representations from both non-time-series and time-series data in an end-to-end manner. To evaluate this framework, we conducted a simulation experiment using an artificial multimodal dataset consisting of images and strokes containing shared and private information. Experimental results demonstrate that the proposed framework successfully learned to extract both the shared and private latent representations.
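
To illustrate the shared/private factorization described in the abstract, the following is a minimal sketch, not the authors' implementation: a VAE-style encoder for a single modality, written in PyTorch, that outputs separate Gaussian posteriors for the shared and the private latent variables. The module names, dimensions, and the choice of PyTorch are assumptions made for illustration only.

# Minimal sketch (illustrative, not the authors' model): one modality encoder
# producing separate shared and private Gaussian latents, PyTorch assumed.
import torch
import torch.nn as nn

class ModalityEncoder(nn.Module):
    """Encodes one modality into shared and private diagonal-Gaussian posteriors."""
    def __init__(self, input_dim, shared_dim, private_dim, hidden_dim=128):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(input_dim, hidden_dim), nn.ReLU())
        # Separate heads output (mean, log-variance) for each latent group
        self.shared_head = nn.Linear(hidden_dim, 2 * shared_dim)
        self.private_head = nn.Linear(hidden_dim, 2 * private_dim)

    def forward(self, x):
        h = self.backbone(x)
        s_mu, s_logvar = self.shared_head(h).chunk(2, dim=-1)
        p_mu, p_logvar = self.private_head(h).chunk(2, dim=-1)
        return (s_mu, s_logvar), (p_mu, p_logvar)

def reparameterize(mu, logvar):
    # Standard reparameterization trick for sampling from a diagonal Gaussian
    return mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)

In a full model along these lines, the shared posteriors of the two modalities would typically be encouraged to agree (for example, via a product of experts or a cross-modal alignment term), while each private latent is regularized toward its own prior; the abstract does not specify which mechanism the authors use, so this remains an assumption.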

© 2022 The Japanese Society for Artificial Intelligence