Journal of Advanced Computational Intelligence and Intelligent Informatics
Online ISSN : 1883-8014
Print ISSN : 1343-0130
ISSN-L : 1883-8014
Regular Papers
Multimodal Analogy-Based Image Retrieval by Improving Semantic Embeddings
Kosuke OtaKeiichiro ShiraiHidetoshi MiyaoMinoru Maruyama
Author information
JOURNAL OPEN ACCESS

2022 Volume 26 Issue 6 Pages 995-1003

Details
Abstract

In this work, we study the application of multimodal analogical reasoning to image retrieval. Multimodal analogy questions are given in a form of tuples of words and images, e.g., “cat”:“dog”::[an image of a cat sitting on a bench]:?, to search for an image of a dog sitting on a bench. Retrieving desired images given these tuples can be seen as a task of finding images whose relation between the query image is close to that of query words. One way to achieve the task is building a common vector space that exhibits analogical regularities. To learn such an embedding, we propose a quadruple neural network called multimodal siamese network. The network consists of recurrent neural networks and convolutional neural networks based on the siamese architecture. We also introduce an effective procedure to generate analogy examples from an image-caption dataset for training of our network. In our experiments, we test our model on analogy-based image retrieval tasks. The results show that our method outperforms the previous work in qualitative evaluation.

Content from these authors

This article cannot obtain the latest cited-by information.

© 2022 Fuji Technology Press Ltd.

This article is licensed under a Creative Commons [Attribution-NoDerivatives 4.0 International] license (https://creativecommons.org/licenses/by-nd/4.0/).
The journal is fully Open Access under Creative Commons licenses and all articles are free to access at JACIII official website.
https://www.fujipress.jp/jaciii/jc-about/#https://creativecommons.org/licenses/by-nd
Previous article Next article
feedback
Top