Journal of the Japan Society for Precision Engineering (精密工学会誌)
Online ISSN : 1882-675X
Print ISSN : 0912-0289
ISSN-L : 0912-0289
Paper
Estimating Individual Preferences and Generating Judgment Rationales with a Large Vision-Language Model
吉田 温登, 滝 之弥, 加藤 邦人, 寺田 和憲
Journal, free access

2025, Volume 91, Issue 3, pp. 411-417

Abstract

This study uses a Large Vision-Language Model, which integrates images and text, to learn an individual's preferences and the rationales behind their judgments from a small dataset. Existing methods for estimating preferences and impressions represent them as numerical scores; however, such scores are difficult to interpret and offer little explainability. This research therefore aims to make preference estimation more explainable by generating the rationale behind each preference judgment. For each individual, we collected a dataset of preference judgments ("preferable" or "not preferable") and their rationales in specific domains, and used it to further train the Large Vision-Language Model. Experiments confirmed that, even with a dataset limited to only 200 images, it is feasible to estimate an individual's preferences and generate their rationales.
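As an illustration only, the kind of per-individual sample described in the abstract (an image, a binary preference label, and a free-text rationale) could be turned into an image-plus-text training record along the following lines. This is a minimal sketch under stated assumptions: the field names, the prompt wording, the file paths, the example rationales, and the "chair" domain are all hypothetical and are not taken from the paper, which does not specify its data format here.

```python
import json
from dataclasses import dataclass


@dataclass
class PreferenceSample:
    """One annotated image from a single person (hypothetical schema)."""
    image_path: str   # path to the image shown to the annotator
    preferable: bool  # True = "preferable", False = "not preferable"
    rationale: str    # the annotator's stated reason for the judgment


def to_training_record(sample: PreferenceSample, domain: str) -> dict:
    """Convert a sample into a prompt/response pair for further training.

    The prompt template is an assumption made for this sketch.
    """
    label = "preferable" if sample.preferable else "not preferable"
    return {
        "image": sample.image_path,
        "prompt": (
            f"Is this {domain} preferable or not preferable "
            "to this person? Give the rationale."
        ),
        "response": f"{label}. Rationale: {sample.rationale}",
    }


# Two made-up samples; the study itself reports using 200 images.
samples = [
    PreferenceSample("images/0001.jpg", True, "The rounded shape feels calm."),
    PreferenceSample("images/0002.jpg", False, "The colors are too harsh."),
]

records = [to_training_record(s, domain="chair") for s in samples]

# Serialize as JSON Lines, one record per line.
jsonl = "\n".join(json.dumps(r, ensure_ascii=False) for r in records)
print(jsonl)
```

Putting the label and the rationale in one response string means a single generated answer carries both the estimated preference and its explanation, which matches the goal stated in the abstract; the actual training setup may differ.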

© 2025 The Japan Society for Precision Engineering (公益社団法人 精密工学会)