Proceedings of the Annual Conference of JSAI
Online ISSN : 2758-7347
33rd (2019)
Session ID : 2D3-E-4-03

A Multimodal Target-Source Classifier Model for Object Fetching from Natural Language Instructions
*Aly MAGASSOUBA, Komei SUGIURA, Hisashi KAWAI
CONFERENCE PROCEEDINGS FREE ACCESS

Abstract

In this paper, we address the task of fetching objects from ambiguous instructions. A typical fetching task consists of picking up a target object specified by an ambiguous instruction. We propose a multimodal target-source classifier model (MTCM) that grounds the instructions in the scene. More specifically, MTCM predicts the likelihood of a target object, as well as the source of that target, using linguistic and visual features. Our approach improves on the accuracy of the previous state-of-the-art method for target object prediction in the fetching task.

© 2019 The Japanese Society for Artificial Intelligence