Journal of the Japan Society for Precision Engineering (精密工学会誌)
Online ISSN : 1882-675X
Print ISSN : 0912-0289
ISSN-L : 0912-0289
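Paper, Special Issue on Practical Applications of Image Technology
Training and Inference Methods for Vision-Language Models toward Improving General Visual Inspection Performance
上野 詩翔, 林 良和, 中塚 俊介, 加藤 邦人, 尾下 拓未, 相澤 宏旭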
Journal, free access

2025, Volume 91, Issue 12, pp. 1150-1155

Abstract

In this study, we improve general visual inspection performance by replacing the foundation Vision-Language Model (VLM), reconstructing the fine-tuning dataset, and proposing an example selection algorithm for In-Context Learning (ICL). The existing approach based on a VLM and ICL supplies non-defective or defective images together with an explanatory description as a prompt, so that unknown products can be inspected without any additional parameter updates. However, the foundation VLM used in that approach was chosen for its ICL capability, without regard to its local recognition capability. We therefore switch to a foundation VLM that emphasizes local recognition capability, and we reconstruct the fine-tuning dataset so that the model can output the coordinates of defects. At inference time, we select the ICL example with an algorithm based on Euclidean distance and present it together with a visual prompt. Experimental results show that our approach achieves an F1-score of 0.950 on MVTec AD in a one-shot setting.
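
The abstract states only that the ICL example is chosen by Euclidean distance; it does not specify the feature space or whether the nearest or the farthest candidate is preferred. The following is a minimal sketch under stated assumptions: each image is already represented by a fixed-length feature vector (from some image encoder, not named here), and the candidate nearest to the query is selected. The function name `select_icl_example` and the toy vectors are hypothetical and are not taken from the paper.

```python
import math
from typing import Sequence


def select_icl_example(
    query: Sequence[float],
    candidates: Sequence[Sequence[float]],
) -> int:
    """Return the index of the candidate nearest to the query.

    Assumption: "nearest in Euclidean distance" is the selection rule;
    the abstract does not confirm this detail.
    """
    if not candidates:
        raise ValueError("candidates must not be empty")
    # Euclidean distance between the query and every candidate feature.
    distances = [math.dist(query, c) for c in candidates]
    # Index of the smallest distance (first one wins on ties).
    return min(range(len(candidates)), key=distances.__getitem__)


# Toy 3-dimensional features; real features would come from an image encoder.
query_feature = [0.9, 0.1, 0.0]
candidate_features = [
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.5, 0.5, 0.5],
]
best = select_icl_example(query_feature, candidate_features)
print(best)
```

In a one-shot setting, the image at the returned index would then be placed in the prompt as the single ICL example, together with its description and visual prompt.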

© 2025 The Japan Society for Precision Engineering (公益社団法人 精密工学会)