Proceedings of the Annual Conference of JSAI
Online ISSN : 2758-7347
37th (2023)
Session ID : 2E5-GS-6-01

Improving Caption Generation Performance using Diffusion Process
*Satoko HIRANO, Ichiro KOBAYASHI
CONFERENCE PROCEEDINGS FREE ACCESS

Abstract

In recent years, generative models based on diffusion processes have achieved state-of-the-art performance in continuous domains and have been actively studied for discrete data generation. In this study, we propose a caption generation method that uses a language model and a classifier based on a diffusion process. To improve caption generation performance, we compare the accuracy obtained with and without a pre-trained language model in the classifier, and investigate the conditions under which appropriate captions can be generated for each image. Although the accuracy of our diffusion-based method was not satisfactory, we confirmed that natural language generation can be controlled through the performance of the classifier in the sampling process.

© 2023 The Japanese Society for Artificial Intelligence