Host : The Japanese Society for Artificial Intelligence
Name : The 38th Annual Conference of the Japanese Society for Artificial Intelligence
Number : 38
Location : [in Japanese]
Date : May 28, 2024 - May 31, 2024
In this study, we aim to develop a domestic service robot (DSR) that carries an everyday object to a piece of furniture in response to an open-vocabulary instruction, by retrieving images of the target object and the receptacle from images collected in the environment. We propose a multimodal model that retrieves target objects and receptacles separately with a single model, by means of a switching mechanism that uses large language models. In experiments on newly built datasets, our method outperformed the baseline methods on standard metrics. Furthermore, it achieved task success rates of more than 80% in physical experiments.
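The idea of a single retriever that serves both targets and receptacles can be illustrated with a minimal sketch. This is not the authors' model: the abstract does not describe the architecture, so everything below is an assumption made for illustration. In particular, the `parsed` dictionary stands in for whatever a large language model would extract from the instruction, `embed_text` is a hypothetical text encoder, and ranking by cosine similarity is a generic retrieval technique swapped in as a placeholder.

```python
import math
from typing import Callable, Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(
    parsed: Dict[str, str],
    mode: str,
    embed_text: Callable[[str], Sequence[float]],
    image_vecs: Dict[str, Sequence[float]],
) -> List[str]:
    """Rank image ids for one mode ("target" or "receptacle").

    The same scoring code is used for both modes; only the query phrase
    changes. `parsed` is assumed to come from an LLM that split the
    instruction into a target phrase and a receptacle phrase (hypothetical).
    """
    if mode not in ("target", "receptacle"):
        raise ValueError(f"unknown mode: {mode}")
    query = embed_text(parsed[mode])
    # Highest similarity first.
    return sorted(image_vecs, key=lambda k: cosine(query, image_vecs[k]), reverse=True)


# Toy data: hand-written 2-D embeddings, purely illustrative.
TOY_TEXT = {"mug": [1.0, 0.0], "shelf": [0.0, 1.0]}
TOY_IMAGES = {
    "img_mug": [0.9, 0.1],
    "img_shelf": [0.1, 0.9],
    "img_sofa": [0.5, 0.5],
}
# Stand-in for the LLM output for an instruction such as "put the mug on the shelf".
PARSED = {"target": "mug", "receptacle": "shelf"}

target_ranking = retrieve(PARSED, "target", TOY_TEXT.__getitem__, TOY_IMAGES)
receptacle_ranking = retrieve(PARSED, "receptacle", TOY_TEXT.__getitem__, TOY_IMAGES)
```

Calling `retrieve` twice with different `mode` values reuses one scoring path for both queries, which is the only property of the described approach that this sketch tries to convey.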