Host: The Japanese Society for Artificial Intelligence
Name : The 36th Annual Conference of the Japanese Society for Artificial Intelligence
Number : 36
Location : [in Japanese]
Date : June 14, 2022 - June 17, 2022
For complicated VQA tasks that incorporates multiple objects, to train the VQA model using segmented objects data as inputs is proved to be effective for various downstream tasks. In this work we tried to train the VQA task model and object segmentation model in end-to-end fashion instead of training independently. We employed CLEVRER as a target VQA task. We first trained MONet, an object segmentation network, with the dataset, and trained Aloe, a VQA model, using the output of the trained MONet. Finally we connect MONet ans Aloe to finetune them in end-to-end setting and confirmed that the performance of VQA task has been improved.