Host: The Japanese Society for Artificial Intelligence
Name: The 35th Annual Conference of the Japanese Society for Artificial Intelligence
Number: 35
Location: [in Japanese]
Date: June 08, 2021 - June 11, 2021
Instruction following is the task of learning to map natural language instructions to sequences of actions in visual environments. Recently, an interactive instruction following task has been proposed to encourage research on following natural language instructions that require interactions with objects. We observe that an existing model for this task is not robust to variations in objects and instructions, which may cause serious problems in real-world applications. We hypothesize that this is due to the high sensitivity of neural feature extraction to small perturbations in vision and language. We propose a Neuro-Symbolic approach to mitigate this lack of robustness. Concretely, we introduce object detection and semantic parsing modules into this task, making reasoning over symbolic features feasible. Our experiments on the ALFRED dataset show that our approach significantly improves performance on subtasks that require object interactions.
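To make the neuro-symbolic idea concrete, the sketch below is a minimal illustration (not the authors' implementation): it assumes an object detector that emits symbolic object labels and a semantic parser that maps an instruction to a symbolic (action, target) pair, after which the interaction target is chosen by matching discrete symbols rather than comparing raw neural features. All class and function names here are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Detection:
    label: str    # symbolic object class from an object detector, e.g. "Mug"
    score: float  # detector confidence
    box: tuple    # (x1, y1, x2, y2) bounding box in the egocentric frame


@dataclass
class ParsedInstruction:
    action: str   # symbolic action predicate, e.g. "PickupObject"
    target: str   # symbolic object argument, e.g. "Mug"


def parse_instruction(instruction: str) -> ParsedInstruction:
    """Toy stand-in for a semantic parser: maps an instruction to (action, target).

    A real parser would be a learned model; this keyword lookup only
    illustrates that the output is a discrete symbolic structure.
    """
    action = "PickupObject" if "pick up" in instruction.lower() else "GotoLocation"
    # Hypothetical heuristic: take the last word as the object argument.
    target = instruction.strip(". ").split()[-1].capitalize()
    return ParsedInstruction(action=action, target=target)


def select_target(parsed: ParsedInstruction,
                  detections: List[Detection]) -> Optional[Detection]:
    """Symbolic matching: return the highest-scoring detection whose class label
    equals the parsed object argument. Because the comparison is over discrete
    symbols, small perturbations in pixels or wording do not change its outcome.
    """
    candidates = [d for d in detections if d.label == parsed.target]
    return max(candidates, key=lambda d: d.score) if candidates else None


if __name__ == "__main__":
    detections = [
        Detection("Mug", 0.92, (10, 20, 60, 80)),
        Detection("Apple", 0.88, (100, 40, 140, 90)),
    ]
    parsed = parse_instruction("Pick up the mug")
    print(parsed, select_target(parsed, detections))
```

The design point of the sketch is only that reasoning happens after both modalities have been discretized into symbols; how the detector and parser themselves are trained is outside its scope.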