Proceedings of the Annual Conference of JSAI
Online ISSN: 2758-7347
34th (2020)
Session ID: 1Q3-GS-11-05

Sentence Generation for Fetching Instruction based on Multimodal Attention Branch Network
*Tadashi OGURA, Aly MAGASSOUBA, Komei SUGIURA, Tsubasa HIRAKAWA, Takayoshi YAMASHITA, Hironobu FUJIYOSHI, Hisashi KAWAI

Abstract

Domestic service robots (DSRs) are a promising solution to the shortage of home care workers. Nonetheless, one of the main limitations of DSRs is their inability to interact naturally through language. Recently, data-driven approaches have been shown to be effective for tackling this limitation; however, they often require large-scale datasets, which are costly to build. Against this background, we aim to perform automatic sentence generation for fetching instructions, e.g., "Bring me a green tea bottle on the table." This is particularly challenging because appropriate expressions depend on the target object as well as its surroundings. In this paper, we propose a method that generates sentences from visual inputs. Unlike other approaches, the proposed method has multimodal attention branches that utilize subword-level attention and generate sentences based on subword embeddings. In the experiments, we compared the proposed method with a baseline method using four standard image captioning metrics. Experimental results show that the proposed method outperformed the baseline on these metrics.
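
To make the described idea concrete, the following is a minimal illustrative sketch, not the authors' implementation: a decoder with a visual attention branch that attends over image region features and generates subword tokens from subword embeddings. All module names, dimensions, and the additive-attention formulation are assumptions made for illustration only.

# Minimal sketch (assumptions throughout): attention branch over visual region
# features plus an LSTM decoder that predicts subword tokens.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionBranch(nn.Module):
    """Additive attention over region features, conditioned on the decoder state."""
    def __init__(self, feat_dim, hidden_dim, attn_dim):
        super().__init__()
        self.feat_proj = nn.Linear(feat_dim, attn_dim)
        self.state_proj = nn.Linear(hidden_dim, attn_dim)
        self.score = nn.Linear(attn_dim, 1)

    def forward(self, feats, state):
        # feats: (B, R, feat_dim) region features; state: (B, hidden_dim)
        scores = self.score(torch.tanh(self.feat_proj(feats) + self.state_proj(state).unsqueeze(1)))
        alpha = F.softmax(scores, dim=1)        # (B, R, 1) attention weights over regions
        context = (alpha * feats).sum(dim=1)    # (B, feat_dim) attended visual context
        return context, alpha

class SubwordCaptionDecoder(nn.Module):
    """LSTM decoder that predicts the next subword from subword embeddings + visual context."""
    def __init__(self, vocab_size, feat_dim=2048, embed_dim=256, hidden_dim=512, attn_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)   # subword (e.g., BPE) embeddings
        self.attn = AttentionBranch(feat_dim, hidden_dim, attn_dim)
        self.lstm = nn.LSTMCell(embed_dim + feat_dim, hidden_dim)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, feats, subword_ids):
        # feats: (B, R, feat_dim); subword_ids: (B, T) teacher-forced subword indices
        B, T = subword_ids.shape
        h = feats.new_zeros(B, self.lstm.hidden_size)
        c = feats.new_zeros(B, self.lstm.hidden_size)
        logits = []
        for t in range(T):
            context, _ = self.attn(feats, h)                      # attend over regions each step
            x = torch.cat([self.embed(subword_ids[:, t]), context], dim=1)
            h, c = self.lstm(x, (h, c))
            logits.append(self.out(h))
        return torch.stack(logits, dim=1)                         # (B, T, vocab_size)

if __name__ == "__main__":
    decoder = SubwordCaptionDecoder(vocab_size=8000)
    feats = torch.randn(2, 36, 2048)          # e.g., 36 region features per image (assumed)
    ids = torch.randint(0, 8000, (2, 12))     # dummy subword sequence
    print(decoder(feats, ids).shape)          # torch.Size([2, 12, 8000])

In such a setup, generated subword logits would typically be trained with cross-entropy against reference fetching instructions and evaluated with standard image captioning metrics, as described in the abstract.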

© 2020 The Japanese Society for Artificial Intelligence