Host: The Japanese Society for Artificial Intelligence
Name : The 32nd Annual Conference of the Japanese Society for Artificial Intelligence, 2018
Number : 32
Location : [in Japanese]
Date : June 05, 2018 - June 08, 2018
In recent multimodal learning, deep neural networks are increasingly used as discriminators. Training them generally requires a large amount of labeled data, but labeling multimodal inputs is costly for human annotators. Semi-supervised learning on multimodal data is therefore becoming important. Among such methods, semi-supervised multimodal learning with deep generative models has recently been proposed. In this study, we first compare these methods and show that SS-HMVAE, a method with latent variables corresponding to a joint representation, performs well, particularly when the different modalities have no deterministic relation. Next, to predict labels from unimodal data, we propose SS-HMVAE-kl, an extension of SS-HMVAE. We confirm that this method greatly improves performance when only a single modality is given as input, compared with conventional models.