Active Exploration Method for Simultaneous Learning of Maps and Multimodal Spatial Concepts and Utilization of the Foundation Model

Tomochika Ishikawa; Akira Taniguchi; Yoshinobu Hagiwara; Tadahiro Taniguchi

doi:10.11517/pjsai.JSAI2023.0_1G4OS21a01

37th (2023)

Session ID : 1G4-OS-21a-01

Conference information

Host: The Japanese Society for Artificial Intelligence

Name : The 37th Annual Conference of the Japanese Society for Artificial Intelligence

Number : 37

Date : June 06, 2023 - June 09, 2023

Active Exploration Method for Simultaneous Learning of Maps and Multimodal Spatial Concepts and Utilization of the Foundation Model

*Tomochika ISHIKAWA, Akira TANIGUCHI, Yoshinobu HAGIWARA, Tadahiro TANIGUCHI

Keywords: Active inference, Semantic mapping, SLAM

CONFERENCE PROCEEDINGS FREE ACCESS

Abstract

For a robot to perform tasks specified in human language, it needs a semantic map, that is, a map annotated with semantic information about locations. Learning such a map often requires human intervention. In this study, we propose an active semantic mapping system that requires no human intervention, thereby reducing the burden on the user during semantic mapping. In the proposed method, the robot actively learns spatial concepts and generates a map at the same time. Spatial concepts are learned through multimodal categorization based on unsupervised online learning. Captions generated with CLIP, a foundation model used here for image captioning, connect real-world observations to language. To evaluate which spatial exploration strategy leads to efficient semantic mapping, we conducted experiments in a simulation environment against comparison methods that differ in how they determine the destination. We also evaluated the usefulness of the learned results for human language-related tasks in a real-world environment.
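
The abstract does not state how destinations are chosen. As a rough illustration of what an uncertainty-driven rule could look like, the sketch below picks the candidate position whose predicted place-category distribution has the highest entropy. The function names, the input format, the example numbers, and the entropy criterion itself are all assumptions made for illustration, not the authors' method.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_destination(candidates):
    """Return the candidate whose category distribution is most uncertain.

    `candidates` maps a hypothetical destination name to a probability
    distribution over place categories (an assumed input format).
    """
    return max(candidates, key=lambda name: entropy(candidates[name]))


# Hypothetical beliefs over three place categories at three candidate positions.
candidates = {
    "position_a": [0.90, 0.05, 0.05],  # already well understood
    "position_b": [0.40, 0.30, 0.30],  # most uncertain
    "position_c": [0.70, 0.20, 0.10],
}
best = select_destination(candidates)
```

With these made-up numbers the rule selects `position_b`, since its distribution is closest to uniform. A comparison method in the sense of the abstract could replace `select_destination` with, for example, a random or nearest-first choice.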
