Effectiveness of Shared Representations in Semi-Supervised Multimodal Deep Generative Models and Their Extension to Single-Modality Input

鈴木 雅大; 松尾 豊

doi:10.11517/pjsai.JSAI2018.0_4A102

Abstract

In recent multimodal learning, deep neural networks are increasingly used as discriminators. Training them generally requires a large amount of labeled data, but labeling multimodal inputs is costly in human effort. Semi-supervised learning on multimodal data is therefore becoming important, and semi-supervised multimodal learning methods based on deep generative models have recently been proposed. In this study, we first compare these methods and show that SS-HMVAE, a method with latent variables corresponding to a joint representation, performs well, particularly when the different modalities have no deterministic relation. Next, to predict labels from a single modality, we propose SS-HMVAE-kl, an extension of SS-HMVAE. We confirm that this method greatly improves performance on single-modality input compared with conventional models.
