Host: The Japanese Society for Artificial Intelligence
Name : The 100th SIG-SLUD
Number : 100
Location : [in Japanese]
Date : February 29, 2024 - March 01, 2024
Pages 01-06
This study aims to generate personalized image captions that reflect an individual's perspective and phrasing. Recent progress in large language models has produced notable results across a variety of language tasks. For text generation that reflects individuality, however, adapting a language model with only limited data from each individual remains a challenge. This paper proposes combining a personal identification model, trained on a small amount of data, with Monte Carlo tree search to explore token generation sequences. We demonstrate that this method can produce a broader range of sentences than standard beam search and can effectively reproduce individuality.
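As a rough illustration of the idea described in the abstract, the following minimal sketch runs Monte Carlo tree search over token sequences and uses a scoring function as the reward. It is not the authors' implementation. Everything named here is a hypothetical stand-in: `VOCAB`, `PREFERRED`, and `MAX_LEN` are toy values, `persona_score` is a simple word-overlap proxy for the personal identification model, and rollouts sample tokens uniformly at random instead of from a captioning language model.

```python
import math
import random

# Toy vocabulary and "personal" word preferences (assumptions, not from the paper).
VOCAB = ["a", "dog", "puppy", "runs", "dashes", "happily", "<eos>"]
PREFERRED = {"puppy", "dashes", "happily"}
MAX_LEN = 4


def persona_score(tokens):
    """Toy stand-in for a personal identification model: share of preferred words."""
    words = [t for t in tokens if t != "<eos>"]
    if not words:
        return 0.0
    return sum(t in PREFERRED for t in words) / len(words)


def is_terminal(tokens):
    """A sequence ends at MAX_LEN tokens or once <eos> has been emitted."""
    return len(tokens) >= MAX_LEN or (bool(tokens) and tokens[-1] == "<eos>")


class Node:
    def __init__(self, tokens, parent=None):
        self.tokens = tokens
        self.parent = parent
        self.children = {}  # token -> Node
        self.visits = 0
        self.value_sum = 0.0

    def uct_child(self, c=1.4):
        """Pick the child with the best mean reward plus exploration bonus."""
        def uct(child):
            mean = child.value_sum / child.visits
            return mean + c * math.sqrt(math.log(self.visits) / child.visits)
        return max(self.children.values(), key=uct)


def mcts_generate(n_simulations=500, seed=0):
    rng = random.Random(seed)
    root = Node([])
    for _ in range(n_simulations):
        node = root
        # 1. Selection: descend through fully expanded nodes using UCT.
        while not is_terminal(node.tokens) and len(node.children) == len(VOCAB):
            node = node.uct_child()
        # 2. Expansion: add one untried next token.
        if not is_terminal(node.tokens):
            untried = [t for t in VOCAB if t not in node.children]
            token = rng.choice(untried)
            child = Node(node.tokens + [token], parent=node)
            node.children[token] = child
            node = child
        # 3. Rollout: finish the sequence with uniformly random tokens.
        tokens = list(node.tokens)
        while not is_terminal(tokens):
            tokens.append(rng.choice(VOCAB))
        reward = persona_score(tokens)
        # 4. Backpropagation: update statistics along the path to the root.
        while node is not None:
            node.visits += 1
            node.value_sum += reward
            node = node.parent
    # Read out the most visited path; it may be a prefix if the tree is shallow.
    node = root
    while node.children and not is_terminal(node.tokens):
        node = max(node.children.values(), key=lambda ch: ch.visits)
    return node.tokens


if __name__ == "__main__":
    print(mcts_generate())
```

In a realistic setting, the uniform rollout would presumably be replaced by sampling from the captioning model's next-token distribution, and `persona_score` by the trained identification model's output for the target individual. Those substitutions are assumptions about how the pieces fit together, not details given in the abstract.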