Host : The Japanese Society for Artificial Intelligence
Name : The 38th Annual Conference of the Japanese Society for Artificial Intelligence
Number : 38
Location : Hamamatsu, Japan
Date : May 28, 2024 - May 31, 2024
With the spread of driver assistance systems and autonomous driving technologies, their effectiveness in reducing traffic accidents has been widely discussed. However, to reduce accidents further, it is crucial to explain traffic accident risks and analyze their mechanisms. Research on explainable multimodal networks for driving scenes has explored methods that generate captions by taking recognizable objects into account with the help of metadata. Such methods typically focus on generating captions for dynamic objects, such as humans. To explain traffic accident risks in driving scenes, however, static risks caused by road signs and road structures should also be considered during caption generation. Existing large-scale multimodal networks have difficulty generating captions that address these road-environment risks. To tackle this challenge, we propose a caption generation method that leverages prompt engineering to cover both dynamic objects and static potential risks. Experiments with the generated captions confirmed that the proposed method can produce captions that consider both dynamic objects and static potential risks.
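As a rough illustration of the prompt-engineering idea (this is a minimal sketch under assumed inputs, not the authors' actual pipeline; the metadata fields, function names, and prompt wording are all hypothetical), the snippet below shows how per-scene metadata about detected dynamic objects and static road elements could be assembled into a single captioning prompt that explicitly requests both kinds of risk before being paired with the scene image for a vision-language model.

# Illustrative sketch only: builds a risk-aware captioning prompt from
# hypothetical scene metadata. Field names and prompt text are assumptions,
# not the method described in the paper.
from dataclasses import dataclass, field
from typing import List


@dataclass
class SceneMetadata:
    """Hypothetical per-frame metadata produced by upstream detectors."""
    dynamic_objects: List[str] = field(default_factory=list)   # e.g. pedestrians, cyclists
    static_elements: List[str] = field(default_factory=list)   # e.g. road signs, road structure


def build_risk_prompt(meta: SceneMetadata) -> str:
    """Compose a prompt that asks for both dynamic and static potential risks."""
    dynamic = ", ".join(meta.dynamic_objects) or "none detected"
    static = ", ".join(meta.static_elements) or "none detected"
    return (
        "You are describing traffic accident risks in a driving scene.\n"
        f"Detected dynamic objects: {dynamic}.\n"
        f"Detected static road elements: {static}.\n"
        "Generate one caption that explains (1) risks arising from the dynamic "
        "objects and (2) static potential risks arising from road signs and "
        "road structure."
    )


if __name__ == "__main__":
    meta = SceneMetadata(
        dynamic_objects=["pedestrian near crosswalk", "oncoming cyclist"],
        static_elements=["partially occluded stop sign", "narrow curved road"],
    )
    # The resulting text prompt would be supplied together with the scene image
    # to a multimodal (vision-language) captioning model.
    print(build_risk_prompt(meta))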