Learning to Rank Physical Objects: A Physical-World Search Engine Based on Learning to Rank

Kanta Kaneda; Motonari Kambara; Komei Sugiura

doi:10.11517/pjsai.JSAI2023.0_3G1OS24a01

37th (2023)

Session ID : 3G1-OS-24a-01

DOI https://doi.org/10.11517/pjsai.JSAI2023.0_3G1OS24a01

Conference information

Host: The Japanese Society for Artificial Intelligence

Name : The 37th Annual Conference of the Japanese Society for Artificial Intelligence

Number : 37

Date : June 06, 2023 - June 09, 2023

Finding Everyday Objects Using Physical-World Search Engines: A Learning-to-Rank Approach

*Kanta KANEDA, Motonari KAMBARA, Komei SUGIURA

Keywords: Learning to Rank, Multimodal Language Processing, Learning to Rank Physical Objects Task

CONFERENCE PROCEEDINGS FREE ACCESS

Abstract

In this study, we focus on the learning-to-rank physical objects task, in which target objects are retrieved given open-vocabulary user instructions in a human-in-the-loop setting. We propose MultiRankIt, which introduces two modules: the Crossmodal Noun Phrase Encoder, which models the relationship between referring expressions and the target bounding box, and the Crossmodal Region Feature Encoder, which models the relationship between the target object and its surrounding contextual environment. Our model outperforms the baseline method in terms of mean reciprocal rank and recall@K.
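
The two evaluation metrics named in the abstract can be illustrated with a minimal sketch. It uses the common textbook definitions, which are an assumption here because the abstract does not spell them out: the reciprocal rank of a query is 1 divided by the position of the first relevant object in the ranked list (0 if none appears), and recall@K is the fraction of relevant objects found in the top K. The function names, object identifiers, and toy queries are hypothetical and are not taken from the paper or its dataset.

```python
from typing import List, Sequence, Set, Tuple


def reciprocal_rank(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Return 1 / position of the first relevant item, or 0.0 if none is ranked."""
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Return the fraction of relevant items that appear in the top k.

    Assumes the ranked list contains no duplicate identifiers.
    """
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / len(relevant)


def mean(values: List[float]) -> float:
    """Arithmetic mean over per-query scores."""
    return sum(values) / len(values)


# Hypothetical toy data: (ranked candidate objects, set of target objects).
queries: List[Tuple[List[str], Set[str]]] = [
    (["mug", "cup", "bottle", "vase"], {"cup"}),
    (["towel", "sink", "soap"], {"towel", "soap"}),
    (["chair", "sofa", "table"], {"lamp"}),  # target never retrieved
]

mrr = mean([reciprocal_rank(r, t) for r, t in queries])
r_at_1 = mean([recall_at_k(r, t, 1) for r, t in queries])
r_at_2 = mean([recall_at_k(r, t, 2) for r, t in queries])

print(f"MRR={mrr:.3f}  recall@1={r_at_1:.3f}  recall@2={r_at_2:.3f}")
```

Both metrics are averaged over queries, so a query whose target never appears in the ranked list contributes 0 to each score. The exact averaging and tie-handling conventions used in the paper may differ from this sketch.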

Corresponding author
