Proceedings of the Annual Conference of JSAI
Online ISSN : 2758-7347
36th (2022)
Session ID : 2O4-GS-7-01

Instruction Comprehension Based on Funnel UNITER for Object Manipulation Tasks
*Yu YOSHIDA, Shintaro ISHIKAWA, Komei SUGIURA

Abstract

In this study, we develop a multimodal language comprehension model that allows domestic service robots to understand object-fetching instructions. The proposed model, Funnel UNITER, gradually reduces the dimensions of the query, key, and value in each transformer layer to reduce the computational cost of self-attention. We also built a new dataset, the ALFRED-fetch dataset, for the multimodal language understanding for fetching instruction (MLU-FI) task. Our model outperformed the baseline method in both classification accuracy and training time.
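
To illustrate why shrinking the query, key, and value can lower the cost of self-attention, the following is a rough sketch only, not the authors' implementation. It assumes that "dimensions" refers to the projection width of Q, K, and V, that each layer reads an input of fixed width, and that attention is single-head. The sequence length, model width, and per-layer schedule are hypothetical values chosen for illustration and are not taken from the paper.

```python
def attention_mult_adds(seq_len, d_in, d_qkv):
    """Approximate multiply-adds for one single-head self-attention layer.

    The output projection back to d_in is omitted to keep the estimate simple.
    """
    projections = 3 * seq_len * d_in * d_qkv   # linear maps producing Q, K, V
    scores = seq_len * seq_len * d_qkv         # Q @ K^T
    weighted_sum = seq_len * seq_len * d_qkv   # softmax(scores) @ V
    return projections + scores + weighted_sum


def total_cost(seq_len, d_model, qkv_dims):
    """Sum the per-layer cost; every layer is assumed to read a d_model-wide input."""
    return sum(attention_mult_adds(seq_len, d_model, d) for d in qkv_dims)


# Hypothetical settings (assumptions for illustration, not from the paper).
SEQ_LEN = 64
D_MODEL = 768
constant = [768, 768, 768, 768]   # Q/K/V width kept the same in every layer
funnel = [768, 384, 192, 96]      # Q/K/V width reduced layer by layer

baseline_cost = total_cost(SEQ_LEN, D_MODEL, constant)
funnel_cost = total_cost(SEQ_LEN, D_MODEL, funnel)
ratio = funnel_cost / baseline_cost
print(f"funnel / constant cost: {ratio:.3f}")
```

Under these assumptions, each layer's cost is proportional to its Q/K/V width, so the saving depends only on the chosen schedule. The actual reduction scheme and the measured savings are those reported in the paper.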

© 2022 The Japanese Society for Artificial Intelligence