Personalized Image Caption Generation Using Monte Carlo Tree Search

吉田 司; 新堀 和紀; 深山 篤

doi:10.11517/jsaislud.100.0_01

Abstract

This study aims to generate personalized image captions that reflect an individual's perspective and phrasing. Recent progress in large language models has yielded notable results across a wide range of language tasks. Generating text that reflects individuality remains difficult, however, because the language model must be adapted using only the limited data available from each individual. We propose combining a personal identification model, trained on a small amount of data, with Monte Carlo tree search (MCTS) to explore token generation sequences. We show that this method produces a broader range of sentences than standard beam search and effectively reproduces individuality.
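The search idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy vocabulary, the `persona_score` function (a hypothetical stand-in for the personal identification model), the UCB constant, and the random rollout policy are all assumptions made for illustration, and the sketch omits the language model and image conditioning that a real captioning system would need.

```python
import math
import random

# Toy vocabulary and length limit (assumptions for illustration only).
VOCAB = ["a", "dog", "cute", "runs", "<eos>"]
MAX_LEN = 4


def persona_score(tokens):
    """Hypothetical stand-in for a personal identification model.

    Returns a score in [0, 1]; this toy "person" favors the word "cute".
    """
    words = [t for t in tokens if t != "<eos>"]
    if not words:
        return 0.0
    return words.count("cute") / len(words)


class Node:
    def __init__(self, tokens, parent=None):
        self.tokens = tokens      # token sequence generated so far
        self.parent = parent
        self.children = {}        # next token -> child Node
        self.visits = 0
        self.value = 0.0          # sum of rewards backed up through this node

    def is_terminal(self):
        ended = bool(self.tokens) and self.tokens[-1] == "<eos>"
        return ended or len(self.tokens) >= MAX_LEN


def ucb(child, parent_visits, c=1.4):
    """Upper confidence bound used to pick a child during selection."""
    if child.visits == 0:
        return float("inf")
    mean = child.value / child.visits
    return mean + c * math.sqrt(math.log(parent_visits) / child.visits)


def rollout(tokens, rng):
    """Complete the sequence with random tokens, then score the result."""
    tokens = list(tokens)
    while len(tokens) < MAX_LEN and (not tokens or tokens[-1] != "<eos>"):
        tokens.append(rng.choice(VOCAB))
    return persona_score(tokens)


def mcts(iterations=500, seed=0):
    rng = random.Random(seed)
    root = Node([])
    for _ in range(iterations):
        node = root
        # Selection: descend while the node is fully expanded.
        while not node.is_terminal() and len(node.children) == len(VOCAB):
            parent = node
            node = max(parent.children.values(),
                       key=lambda ch: ucb(ch, parent.visits))
        # Expansion: add one untried next token.
        if not node.is_terminal():
            token = rng.choice([t for t in VOCAB if t not in node.children])
            child = Node(node.tokens + [token], parent=node)
            node.children[token] = child
            node = child
        # Simulation: random completion scored by the persona model.
        reward = rollout(node.tokens, rng)
        # Backpropagation: update statistics up to the root.
        while node is not None:
            node.visits += 1
            node.value += reward
            node = node.parent
    # Read out the most-visited path as the generated sequence.
    node = root
    while node.children:
        node = max(node.children.values(), key=lambda ch: ch.visits)
    return node.tokens


if __name__ == "__main__":
    print(mcts())
```

In a fuller system the random rollout would presumably be replaced by sampling from a captioning language model, and the reward by the output of the trained identification model. Unlike beam search, which keeps only the highest-probability prefixes, the tree search here can revisit lower-probability branches when their rollouts score well.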
