Host: The Japanese Society for Artificial Intelligence
Name: The 37th Annual Conference of the Japanese Society for Artificial Intelligence
Number: 37
Location: [in Japanese]
Date: June 06, 2023 - June 09, 2023
Multimodal variational autoencoders can acquire a latent representation that integrates information from all modalities by learning an inference model. However, when the shared representation must be obtained from a single arbitrary modality, the inputs of the other modalities are missing, which prevents proper inference of the representation. In this study, we recast the missing-modality problem as part of the amortization gap between amortized inference from an arbitrary modality and the multimodal ELBO, and propose a method that obtains an appropriate shared representation from a single-modality input by using iterative amortized inference. However, because the multimodal ELBO must be evaluated during iterative amortized inference, the missing modality inputs are still required. We therefore prepare an inference model that takes only the modality to be inferred from as input and train it by distillation, with iterative amortized inference as the teacher and the newly prepared inference model as the student. We verify that this yields an inference model that can acquire a shared representation from a single modality.
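The teacher-student idea in the abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the authors' implementation: a scalar latent `z`, two modalities generated by a linear Gaussian decoder with hypothetical weights `W`, a point-estimate surrogate of the multimodal ELBO, and plain gradient ascent standing in for the learned refinement step of iterative amortized inference. The teacher refines `z` using both modalities; the student is a one-parameter linear encoder that sees only the first modality and is fit to the teacher's outputs.

```python
import random

# Hypothetical decoder weights: x_m = W[m] * z + unit-variance Gaussian noise.
W = (1.0, 2.0)


def objective_grad(z, xs):
    """Gradient in z of log p(z) + sum_m log p(x_m | z).

    This is a point-estimate surrogate for the multimodal ELBO (assumption):
    standard normal prior, unit-variance Gaussian likelihoods.
    """
    return -z + sum(w * (x - w * z) for w, x in zip(W, xs))


def teacher_refine(z0, xs, steps=50, lr=0.1):
    """Teacher: iteratively refine z using ALL modalities.

    Plain gradient ascent is used here in place of a learned
    refinement network (a simplification for the sketch).
    """
    z = z0
    for _ in range(steps):
        z += lr * objective_grad(z, xs)
    return z


def fit_student(x1_values, teacher_targets):
    """Student: linear encoder z = slope * x1, fit by least squares
    to the teacher's refined estimates (the distillation step)."""
    num = sum(x * t for x, t in zip(x1_values, teacher_targets))
    den = sum(x * x for x in x1_values)
    return num / den


# Generate paired training data from the toy generative model.
rng = random.Random(0)
pairs = []
for _ in range(2000):
    z_true = rng.gauss(0.0, 1.0)
    pairs.append(tuple(w * z_true + rng.gauss(0.0, 1.0) for w in W))

# Teacher needs both modalities; student is trained on modality 1 only.
targets = [teacher_refine(0.0, xs) for xs in pairs]
slope = fit_student([xs[0] for xs in pairs], targets)

# At test time the student infers z from x1 alone.
z_from_x1 = slope * 1.5
```

In this toy model the distilled slope should land close to the exact unimodal posterior-mean coefficient `W[0] / (1 + W[0]**2)`, which is one way to see that a student trained on multimodal teacher targets can still recover a sensible single-modality inference rule. A neural encoder and a learned refinement model would replace the linear pieces in a realistic setting.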