Proceedings of the Annual Conference of JSAI
Online ISSN : 2758-7347
37th (2023)
Session ID : 4H2-OS-6a-05

Effectiveness of Joint Attention in Deep Learning for Generating Language Describing Actions
Fumimaro ODAKURA*, Kei WAKABAYASHI
CONFERENCE PROCEEDINGS FREE ACCESS

Abstract

Joint attention is said to play an important role in human language learning, and recent research has explored its use for language understanding in artificial intelligence. However, previous studies have shown its effectiveness only for mapping words to objects in static images; its use for mapping sentences to the actions of objects in image sequences (videos) has not been investigated. In this study, we design a task that takes as input an image sequence depicting agents moving on a 2-D board and generates natural language sentences describing the subject and its actions. We propose a deep learning method that uses the trainer's joint attention for this task. Experiments with synthetic joint attention show that accuracy improved significantly when joint attention was used during both training and testing, but did not improve when it was used only during training.

© 2023 The Japanese Society for Artificial Intelligence