A Multimodal Target-Source Classifier Model for Object Fetching from Natural Language Instructions

Aly MAGASSOUBA; Komei SUGIURA; Hisashi KAWAI

doi:10.11517/pjsai.JSAI2019.0_2D3E403

33rd (2019)

Session ID: 2D3-E-4-03

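Conference information

Organizer: The Japanese Society for Artificial Intelligence

Conference name: The 33rd Annual Conference of the Japanese Society for Artificial Intelligence, 2019

Edition: 33

Venue: Toki Messe, Niigata City, Niigata Prefecture

Dates: 2019/06/04 - 2019/06/07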

*Aly MAGASSOUBA, Komei SUGIURA, Hisashi KAWAI

Keywords: Deep Learning in Robotics and Automation, Spoken Language Understanding, Domestic Robots

Conference proceedings and abstracts: free access

Abstract

In this paper, we address the task of fetching objects from ambiguous instructions. A typical fetching task consists of picking up a target object specified by an ambiguous instruction. We propose a multimodal target-source classifier model (MTCM) that grounds the instructions in the scene. More specifically, MTCM predicts the likelihood of a target object, as well as the source of this target, from linguistic and visual features. Our approach improves on the accuracy of the previous state-of-the-art method for target object prediction in fetching tasks.

* Corresponding author
