2020 Volume E103.D Issue 10 Pages 2154-2161
Image annotation is becoming increasingly important for efficient image retrieval from the web and other large databases. However, the large amount of semantic information and the complex dependencies among the labels of an image make the task challenging. Determining the semantic similarity between the multiple labels of an image is therefore useful for handling incomplete label assignments in image retrieval. This work proposes a novel method for multi-label image annotation that unifies two different types of Laplacian regularization terms in a deep convolutional neural network (CNN) for robust annotation performance. The unified Laplacian regularization model addresses missing labels efficiently by generating the contextual similarity between labels both internally and externally through their semantic similarities, which is the main contribution of this study. Specifically, we generate similarity matrices between labels internally by using Hayashi's quantification method type III and externally by using the word2vec method. The similarity matrices generated by the two methods are then combined into a Laplacian regularization term, which is used in the new objective function of the deep CNN. The regularization term implemented in this study addresses the multi-label annotation problem and enables the neural network to be trained more effectively. Experimental results on public benchmark datasets show that the proposed unified regularization model with a deep CNN produces significantly better results than the baseline CNN without regularization and than other state-of-the-art methods for predicting missing labels.
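
As a rough illustration of how two label-similarity matrices might be merged into a single Laplacian penalty, the following NumPy sketch may help. It is not the authors' implementation: the abstract does not say how the two matrices are combined, so the convex blend with weight `alpha`, the function names, and the toy matrices are all assumptions made for this example. In practice the internal matrix would come from Hayashi's quantification method type III and the external one from word2vec similarities.

```python
import numpy as np


def unified_laplacian(s_internal, s_external, alpha=0.5):
    """Blend two label-similarity matrices and return the graph Laplacian L = D - S.

    The convex combination weighted by `alpha` is an assumption of this sketch,
    not a detail taken from the paper.
    """
    s = alpha * np.asarray(s_internal, dtype=float) \
        + (1.0 - alpha) * np.asarray(s_external, dtype=float)
    s = 0.5 * (s + s.T)          # enforce symmetry
    np.fill_diagonal(s, 0.0)     # self-similarity does not affect the penalty
    degree = np.diag(s.sum(axis=1))
    return degree - s


def laplacian_penalty(predictions, laplacian):
    """Return tr(P L P^T) for predictions P of shape (n_samples, n_labels).

    For symmetric S this equals the sum over label pairs i < j of
    S_ij * ||P[:, i] - P[:, j]||^2, so labels that are similar are pushed
    toward similar scores.
    """
    predictions = np.asarray(predictions, dtype=float)
    return float(np.trace(predictions @ laplacian @ predictions.T))


# Toy similarity matrices for three hypothetical labels.
S_INTERNAL = np.array([[1.0, 0.8, 0.1],
                       [0.8, 1.0, 0.2],
                       [0.1, 0.2, 1.0]])
S_EXTERNAL = np.array([[1.0, 0.6, 0.3],
                       [0.6, 1.0, 0.0],
                       [0.3, 0.0, 1.0]])

L_UNIFIED = unified_laplacian(S_INTERNAL, S_EXTERNAL, alpha=0.5)
```

In training, such a penalty would typically be scaled by a regularization weight and added to the classification loss of the CNN; how the weight is chosen is outside the scope of this sketch.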