Transactions of Society of Automotive Engineers of Japan
Online ISSN : 1883-0811
Print ISSN : 0287-8321
ISSN-L : 0287-8321
Improving the Efficiency of Traffic Scene Retrieval by Vision-Language Model
Masafumi Tsuyuki, Yoshitaka Atarashi, Trongmun Jiralerspong

2024 Volume 55 Issue 6 Pages 1139-1144

Abstract
Retrieving relevant traffic scene data from an existing database is essential in the development of advanced driver-assistance systems, but the task is time-consuming and computationally expensive. This study proposes a traffic scene retrieval system that uses a vision-language model and clustering techniques. The system accepts either an image or text as the search query. Evaluation results showed that the system retrieved complex scene data (e.g., traffic congestion) from a driving video database in under 3 seconds. Overall, the results indicate that the proposed system is feasible for practical applications.
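
The abstract does not say how the embeddings and the clusters are combined, so the following is only an illustrative sketch of one common pattern, cluster-pruned similarity search, and not the authors' implementation. It assumes each scene has already been mapped to an embedding vector by some vision-language model and that a clustering step has assigned each scene a label. The toy 3-D vectors, the labels, and the names `build_index`, `search`, and `n_probe` are all hypothetical. Because such models place images and text in a shared embedding space, the same search would apply whether the query vector came from an image or from text.

```python
import math


def normalize(v):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def build_index(embeddings, labels):
    """Group unit-length scene embeddings by cluster label and compute centroids."""
    members = {}
    for idx, (vec, label) in enumerate(zip(embeddings, labels)):
        members.setdefault(label, []).append((idx, normalize(vec)))
    centroids = {}
    for label, items in members.items():
        dim = len(items[0][1])
        mean = [sum(vec[d] for _, vec in items) / len(items) for d in range(dim)]
        centroids[label] = normalize(mean)
    return centroids, members


def search(query, index, top_k=2, n_probe=1):
    """Return indices of the top_k most similar scenes, scanning only n_probe clusters."""
    centroids, members = index
    q = normalize(query)
    # Rank clusters by centroid similarity and keep only the closest ones.
    probed = sorted(centroids, key=lambda c: dot(q, centroids[c]), reverse=True)[:n_probe]
    # Score only the scenes inside the probed clusters.
    candidates = [(dot(q, vec), idx) for c in probed for idx, vec in members[c]]
    candidates.sort(reverse=True)
    return [idx for _, idx in candidates[:top_k]]


# Toy stand-ins for vision-language embeddings (assumed, not real model output).
scene_embeddings = [
    [0.9, 0.1, 0.0],  # scene 0
    [0.8, 0.2, 0.1],  # scene 1
    [0.1, 0.9, 0.1],  # scene 2
    [0.0, 0.8, 0.3],  # scene 3
    [0.1, 0.1, 0.9],  # scene 4
]
cluster_labels = [0, 0, 1, 1, 2]  # assumed output of a prior clustering step

index = build_index(scene_embeddings, cluster_labels)
query_embedding = [0.9, 0.1, 0.02]  # stand-in for an embedded image or text query
results = search(query_embedding, index, top_k=2, n_probe=1)
print(results)
```

Scanning only the clusters nearest the query trades a small risk of missing a relevant scene for far fewer similarity computations; raising `n_probe` moves the search back toward an exhaustive scan.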
© 2024 Society of Automotive Engineers of Japan, Inc.