Host : The Japanese Society for Artificial Intelligence
Name : The 36th Annual Conference of the Japanese Society for Artificial Intelligence
Number : 36
Location : [in Japanese]
Date : June 14, 2022 - June 17, 2022
In recent years, research on image caption generation has moved beyond generating captions solely from information obtained by processing the image. Newer methods also generate captions that reflect the user's interest in the image by supplying additional information corresponding to the user's viewpoint, called control signals, alongside the image processing information. In this paper, we propose a new method for generating captions based on the user's interests. When people describe an image, they often trace the objects they want to describe with a finger. In this study, we treat such traces on the image as a control signal, and we propose an interactive image caption generation method that, by reflecting the meaning of the traces, produces captions more in line with the intent of the person describing the image.