Host : The Japanese Society for Artificial Intelligence
Name : The 37th Annual Conference of the Japanese Society for Artificial Intelligence
Number : 37
Location : [in Japanese]
Date : June 06, 2023 - June 09, 2023
We propose a method for scene graph generation that uses an optimal transport loss, a measure for comparing two probability distributions. In scene graph generation, training with cross-entropy loss leads to biased predictions because the distribution of predicate labels in the dataset is severely imbalanced. We apply training with the optimal transport loss, which can readily reflect the similarity between labels as a transportation cost, to predicate classification in scene graph generation. In the proposed method, the transportation cost is defined using word similarities obtained from a pre-trained model. Experimental evaluation shows that the method outperforms existing models in terms of mean Recall@50 and mean Recall@100. Furthermore, it can improve recall for predicate labels that are scarce in the dataset.
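To illustrate how label similarity can enter a loss as a transportation cost, the following is a minimal NumPy sketch, not the authors' implementation. Several details are assumptions, since the abstract does not specify them: the cost is taken as one minus the cosine similarity between label embeddings, the transport problem is solved with entropy-regularized Sinkhorn iterations, and the three predicate labels and their two-dimensional embeddings are hypothetical toy values standing in for vectors from a pre-trained model.

```python
import numpy as np

def cost_from_embeddings(emb):
    """Cost C[i, j] = 1 - cosine similarity of label embeddings (assumed form)."""
    unit = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    cost = 1.0 - unit @ unit.T
    np.fill_diagonal(cost, 0.0)          # moving mass to the same label is free
    return np.clip(cost, 0.0, None)      # guard against tiny negative values

def sinkhorn_ot_loss(pred, target, cost, eps=0.1, n_iters=200):
    """Entropy-regularized OT cost between two label distributions.

    pred must be strictly positive (e.g. a softmax output);
    target may be one-hot.
    """
    kernel = np.exp(-cost / eps)
    u = np.ones_like(pred)
    for _ in range(n_iters):
        v = target / (kernel.T @ u)      # match column marginals to target
        u = pred / (kernel @ v)          # match row marginals to pred
    plan = u[:, None] * kernel * v[None, :]
    return float(np.sum(plan * cost))

def cross_entropy(pred, target):
    """Standard cross-entropy, for comparison."""
    return float(-np.sum(target * np.log(pred)))

# Hypothetical predicate labels with toy embeddings (not from a real model).
labels = ["on", "sitting on", "holding"]
embeddings = np.array([[1.0, 0.0],
                       [0.9, 0.3],
                       [0.0, 1.0]])
cost = cost_from_embeddings(embeddings)

target = np.array([0.0, 1.0, 0.0])           # ground truth: "sitting on"
pred_similar = np.array([0.8, 0.1, 0.1])     # most mass on "on"
pred_dissimilar = np.array([0.1, 0.1, 0.8])  # most mass on "holding"

loss_similar = sinkhorn_ot_loss(pred_similar, target, cost)
loss_dissimilar = sinkhorn_ot_loss(pred_dissimilar, target, cost)
ce_similar = cross_entropy(pred_similar, target)
ce_dissimilar = cross_entropy(pred_dissimilar, target)
```

In this toy setup both predictions assign the same probability to the correct label, so their cross-entropy is identical, while the optimal transport loss is lower for the prediction whose mass sits on the semantically closer label. With a one-hot target the transport plan is forced to move all predicted mass to the true label, so the loss reduces to the expected cost under the prediction.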