Host: The Japanese Society for Artificial Intelligence
Name : 34th Annual Conference, 2020
Number : 34
Location : Online
Date : June 09, 2020 - June 12, 2020
In recent years, image captioning, in which a caption for an image is generated using deep learning networks, has been actively studied. Early studies used image features extracted from the image, but approaches using scene graphs have also been studied, aiming to generate captions that capture the attributes and relations of people and objects in an image. In this study, we conduct a user study to evaluate whether scene graphs truly help to enrich captioning. Through user evaluation questionnaires collected via crowdsourcing, we found that scene graph features do not clearly help to enrich captioning, but they are not harmful either.