Proceedings of the Annual Conference of JSAI
Online ISSN : 2758-7347
37th (2023)
Session ID : 1G4-OS-21a-01

Active Exploration Method for Simultaneous Learning of Maps and Multimodal Spatial Concepts and Utilization of the Foundation Model
*Tomochika ISHIKAWA, Akira TANIGUCHI, Yoshinobu HAGIWARA, Tadahiro TANIGUCHI
CONFERENCE PROCEEDINGS FREE ACCESS

Abstract

For a robot to perform tasks specified in human language, it needs a semantic map that associates locations with semantic information. Learning such a map often requires human intervention. In this study, we propose an active semantic mapping system in which a robot learns spatial concepts and generates a map simultaneously without human intervention, thereby reducing the burden on the user. Spatial concepts are learned through multimodal categorization based on unsupervised online learning. Image captions generated with CLIP, a foundation model, are used to ground language in the real world. To evaluate which spatial exploration strategy leads to efficient semantic mapping, we conducted experiments in a simulation environment, comparing methods that differ in how they determine the robot's destination. We also evaluated the usefulness of the learned results for human-language-related tasks in a real-world environment.
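
The comparison of destination-selection methods mentioned in the abstract can be pictured with a small Python sketch. This is only an illustration under stated assumptions, not the authors' method: the abstract does not say which selection criteria were compared, so the three strategies (random, nearest, most uncertain), the candidate format, and all function names below are hypothetical.

```python
import math
import random


def entropy(probs):
    """Shannon entropy (in nats) of a categorical distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def pick_random(candidates, rng):
    """Baseline: choose any candidate destination uniformly at random."""
    return rng.choice(candidates)


def pick_nearest(candidates, robot_xy):
    """Baseline: choose the candidate closest to the robot."""
    return min(candidates, key=lambda c: math.dist(c["xy"], robot_xy))


def pick_most_uncertain(candidates):
    """Hypothetical active strategy: go where the current belief over
    spatial-concept categories is most uncertain (highest entropy)."""
    return max(candidates, key=lambda c: entropy(c["belief"]))


# Hypothetical candidates: a position and a belief over three categories.
candidates = [
    {"name": "A", "xy": (1.0, 0.0), "belief": [0.90, 0.05, 0.05]},
    {"name": "B", "xy": (4.0, 3.0), "belief": [0.34, 0.33, 0.33]},
    {"name": "C", "xy": (2.0, 2.0), "belief": [0.60, 0.30, 0.10]},
]

robot_xy = (0.0, 0.0)
rng = random.Random(0)

print("random:        ", pick_random(candidates, rng)["name"])
print("nearest:       ", pick_nearest(candidates, robot_xy)["name"])
print("most uncertain:", pick_most_uncertain(candidates)["name"])
```

In this toy setup the nearest-first baseline heads for the place it already knows best, while the uncertainty-driven choice travels farther to the place it knows least. A trade-off of this kind is what a comparison of destination-selection methods would be expected to expose.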

© 2023 The Japanese Society for Artificial Intelligence