2017, Volume 83, Issue 12, pp. 1117-1124
This paper proposes a novel approach to video-based person re-identification that uses deep convolutional neural networks to learn the similarity of persons observed by video cameras. A convolutional neural network (CNN) maps each video sequence of a person to a Euclidean space in which distances between feature embeddings correspond directly to measures of person similarity. With an improved parameter learning method called Entire Triplet Loss, all possible triplets in the mini-batch are taken into account to update the network parameters at once. This simple change to the parameter update significantly improves network training and makes the embeddings more discriminative. Experimental results show that the proposed model achieves new state-of-the-art rank-1 identification rates of 78.3% on the iLIDS-VID dataset and 83.9% on the PRID-2011 dataset.
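The idea of scoring every triplet in a mini-batch at once can be illustrated with a minimal NumPy sketch. The abstract does not give the exact formulation, so the details here are assumptions for illustration only: the function name `entire_triplet_loss`, the squared Euclidean distance, the hinge with a `margin` of 0.2, and averaging over all valid triplets. It is not the authors' implementation.

```python
import numpy as np


def entire_triplet_loss(embeddings, labels, margin=0.2):
    """Mean hinge loss over every valid (anchor, positive, negative) triplet.

    Hypothetical helper: margin, squared distance, and mean reduction
    are assumptions, not details taken from the paper.
    Returns (loss, number_of_valid_triplets).
    """
    embeddings = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels)

    # Pairwise squared Euclidean distances, shape (n, n).
    diff = embeddings[:, None, :] - embeddings[None, :, :]
    dist = np.sum(diff ** 2, axis=-1)

    same = labels[:, None] == labels[None, :]
    not_self = ~np.eye(len(labels), dtype=bool)
    pos_mask = same & not_self   # (anchor, positive): same identity, different sample
    neg_mask = ~same             # (anchor, negative): different identity

    # valid[a, p, n] is True when (a, p, n) forms a proper triplet.
    valid = pos_mask[:, :, None] & neg_mask[:, None, :]

    # losses[a, p, n] = max(0, d(a, p) - d(a, n) + margin)
    losses = np.maximum(dist[:, :, None] - dist[:, None, :] + margin, 0.0)

    n_valid = int(valid.sum())
    if n_valid == 0:
        return 0.0, 0
    return float(losses[valid].sum() / n_valid), n_valid


# Toy mini-batch: two identities, two embeddings each.
batch = [[0.0, 0.0], [0.0, 1.0], [3.0, 0.0], [3.0, 1.0]]
ids = [0, 0, 1, 1]
loss, n_triplets = entire_triplet_loss(batch, ids)
```

In this toy batch each of the four anchors has one positive and two negatives, so eight triplets contribute to a single update. A sampling-based scheme would instead use only a subset of them per step.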