IEEJ Transactions on Electronics, Information and Systems (電気学会論文誌C)
Online ISSN : 1348-8155
Print ISSN : 0385-4221
ISSN-L : 0385-4221
<Intelligence, Robotics>
Task Estimation Based on Latent Semantic Analysis Using Speech and Image Scenes
木村 優志, 澤田 心大, 入部 百合絵, 桂田 浩一, 新田 恒雄
Journal, free access

2012, Vol. 132, No. 9, pp. 1473-1480

Abstract

In this paper, we propose a task estimation method based on multiple subspaces extracted from multi-modal information: image objects in visual scenes and spoken words in dialog that appear in the same task. The subspaces are obtained by latent semantic analysis (LSA). In the proposed method, a task vector composed of spoken words and the frequencies of image-object appearances is extracted first, and then the similarities between the input task vector and the reference subspaces of the different tasks are compared. Experiments are conducted on the identification of game tasks. The results show that the proposed method with multi-modal information outperforms methods that use only a single modality, either image or spoken dialog. Moreover, the proposed method remained accurate even when less spoken dialog was available.
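The pipeline outlined in the abstract (per-task LSA subspaces, then a similarity comparison for an input task vector) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the feature names, the task names "chess" and "poker", the toy counts, the subspace rank k = 1, and the choice of similarity (the length of the input vector's projection onto a subspace, divided by the vector's own length). LSA is approximated by a truncated SVD computed with NumPy.

```python
import numpy as np

# Hypothetical feature order: two spoken words followed by two image objects.
FEATURES = ["word:move", "word:card", "object:board", "object:deck"]


def fit_subspace(counts, k):
    """Return an orthonormal basis (n_features x k) for one task.

    counts is an (n_features x n_samples) matrix whose columns are
    task vectors observed for that task. The basis is the top-k left
    singular vectors, i.e. a rank-k LSA subspace.
    """
    u, _, _ = np.linalg.svd(counts, full_matrices=False)
    return u[:, :k]


def subspace_similarity(vector, basis):
    """Length of the projection onto the subspace, relative to the vector."""
    norm = np.linalg.norm(vector)
    if norm == 0.0:
        return 0.0  # an empty task vector matches nothing
    return float(np.linalg.norm(basis.T @ vector) / norm)


def estimate_task(vector, subspaces):
    """Pick the task whose reference subspace is most similar to the vector."""
    return max(subspaces, key=lambda task: subspace_similarity(vector, subspaces[task]))


# Toy reference data (invented): rows follow FEATURES, columns are samples.
training = {
    "chess": np.array([[3, 2, 4], [0, 1, 0], [5, 4, 6], [0, 0, 1]], dtype=float),
    "poker": np.array([[0, 1, 0], [4, 3, 5], [1, 0, 0], [6, 5, 4]], dtype=float),
}
subspaces = {task: fit_subspace(counts, k=1) for task, counts in training.items()}

# An input task vector dominated by "move" and "board".
query = np.array([2.0, 0.0, 3.0, 0.0])
print(estimate_task(query, subspaces))
```

Because the spoken-word and image-object counts share one vector, an input with few spoken words can still be matched through its image-object entries, which is one plausible reading of why a multi-modal vector helps when dialog is sparse.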

© 2012 The Institute of Electrical Engineers of Japan (電気学会)