Journal of Robotics and Mechatronics
Online ISSN : 1883-8049
Print ISSN : 0915-3942
ISSN-L : 0915-3942
Special Issue on Advanced Robotic Technology and System for DX in Construction Industry
Automatic Findings Generation for Distress Images Using In-Context Few-Shot Learning of Visual Language Model Based on Image Similarity and Text Diversity
Yuto Watanabe, Naoki Ogawa, Keisuke Maeda, Takahiro Ogawa, Miki Haseyama
JOURNAL OPEN ACCESS

2024 Volume 36 Issue 2 Pages 353-364

Abstract

This study proposes an automatic findings generation method that performs in-context few-shot learning of a visual language model. Automatic generation of findings can reduce the burden of creating inspection records for infrastructure facilities. However, findings must include the opinions and judgments of engineers in addition to what is recognized from the image; therefore, directly generating findings remains challenging. Against this background, we introduce in-context few-shot learning that focuses on image similarity and text diversity in the visual language model, which enables text output with a highly accurate understanding of both vision and language. Based on this novel in-context few-shot learning strategy, the proposed method comprehensively considers the characteristics of the distress image and the diversity of findings, and can achieve high accuracy in generating findings. In the experiments, the proposed method outperformed comparative methods in generating findings for distress images captured during bridge inspections.
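The abstract does not give the exact selection procedure, so the following Python sketch is only a rough illustration of how in-context examples might be chosen by combining image similarity with text diversity. The function name select_examples, the similarity-minus-redundancy score, and the precomputed CLIP-style embeddings are all assumptions for illustration, not the authors' method.

```python
import numpy as np


def cosine(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))


def select_examples(query_img, img_embs, txt_embs, k=4, pool=20):
    """Pick k in-context examples: image-similar to the query, textually diverse.

    NOTE: hypothetical procedure, not the method described in the paper.
    """
    sims = np.array([cosine(query_img, e) for e in img_embs])
    # Stage 1: shortlist the `pool` candidates whose images best match the query.
    shortlist = list(np.argsort(sims)[::-1][:pool])
    # Stage 2: greedily add examples, penalizing redundancy among findings texts.
    selected = [shortlist[0]]  # start from the most image-similar candidate
    while len(selected) < k:
        best, best_score = None, -np.inf
        for i in shortlist:
            if i in selected:
                continue
            # Redundancy = highest text similarity to any already-selected finding.
            redundancy = max(cosine(txt_embs[i], txt_embs[j]) for j in selected)
            score = sims[i] - redundancy  # trade image similarity against overlap
            if score > best_score:
                best, best_score = i, score
        selected.append(best)
    return selected


# Toy usage with random stand-in embeddings; real embeddings would come from a
# vision-language encoder applied to distress images and their findings texts.
rng = np.random.default_rng(0)
img_embs = rng.normal(size=(50, 512))
txt_embs = rng.normal(size=(50, 512))
query = rng.normal(size=512)
print("in-context example indices:", select_examples(query, img_embs, txt_embs))
```

The selected image-findings pairs would then be placed in the visual language model's prompt ahead of the query distress image, so that the generated finding reflects both what similar images look like and how diversely engineers have described them.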


© 2024 Fuji Technology Press Ltd.

This article is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International license (https://creativecommons.org/licenses/by-nd/4.0/).
The journal is fully Open Access under Creative Commons licenses and all articles are free to access at JRM official website.
https://www.fujipress.jp/jrobomech/rb-about/