2024, Vol. 144, No. 10, pp. 985-996
In this paper, we propose a method for detecting driving scenes in which cognitive function can be evaluated. The method defines an assessable scene as one composed of three elements: road structure, appearing objects, and driving operations, and it detects scenes matching arbitrary combinations of these elements. Because the detection targets combine multiple information sources, and because useful feature vectors for such targets are difficult to specify in advance, multimodal deep learning is employed. Existing research often adopts an intermediate-fusion model structure, but such models have been reported to be difficult to tune and to fail to learn inter-modality relationships when the modalities differ in the amount of information they carry. We therefore propose a new model structure that incorporates an attention mechanism into a late-fusion model. This model evaluates each modality individually before producing the final detection result, yielding a structure that is highly interpretable with respect to how the detection result is obtained. In experiments, the proposed method was compared, in terms of detection accuracy, with the intermediate-fusion model structure used in existing research, and improvements in both recall and precision were confirmed.
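The late-fusion-with-attention idea described above can be sketched as follows. This is a minimal, illustrative sketch, not the authors' implementation: it assumes each modality branch (road structure, appearing objects, operations) has already produced a scalar detection logit and an unnormalized relevance score, and it fuses them with softmax attention weights. All function and variable names here are hypothetical.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array
    e = np.exp(x - np.max(x))
    return e / e.sum()

def late_fusion_with_attention(modality_logits, relevance_scores):
    """Fuse per-modality detection logits via attention weights.

    modality_logits: one detection score per modality branch
    relevance_scores: unnormalized attention scores, one per modality
    Returns (fused_score, weights); the weights can be inspected to see
    how much each modality contributed to the final detection result,
    which is the readability property the abstract emphasizes.
    """
    weights = softmax(np.asarray(relevance_scores, dtype=float))
    fused = float(np.dot(weights, np.asarray(modality_logits, dtype=float)))
    return fused, weights

# Hypothetical per-modality outputs for one candidate scene:
# road-structure, appearing-object, and operation branches.
logits = [2.0, 0.5, 1.0]
relevance = [1.2, 0.3, 0.8]
score, w = late_fusion_with_attention(logits, relevance)
```

Because the fused score is a convex combination of per-modality scores, each branch can still be evaluated on its own, while the attention weights expose how the final decision was formed.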