2018 Volume 84 Issue 12 Pages 983-990
In this paper, we report our efforts on the TRECVID ad-hoc video search (AVS) task and the challenges we faced. The goal of the AVS task is to build a zero-shot video retrieval system that handles complex query phrases. Our system has two characteristics. First, we prepared a large number of pre-trained concept classifiers that can detect various kinds of objects, persons, scenes, and actions. This strategy helps improve the coverage of keywords in query phrases. Second, we selected additional concept classifiers using natural language processing techniques, such as word similarities and synonyms. We submitted systems with these two characteristics to the TRECVID AVS task in 2016 and 2017, and one of our systems ranked highest among all submitted systems for the second consecutive year.
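The concept selection idea can be illustrated with a minimal sketch. It is not the submitted system: the concept bank, the synonym table, the function names, and the score-averaging step are all hypothetical and chosen only for illustration. The sketch assumes each video shot already has a score per concept from the pre-trained classifiers.

```python
from typing import Dict, List, Set

# Hypothetical bank of concepts for which pre-trained classifiers exist.
CONCEPT_BANK: Set[str] = {"dog", "beach", "person", "running", "car"}

# Hypothetical synonym table standing in for a lexical resource
# or a word-similarity model.
SYNONYMS: Dict[str, List[str]] = {
    "puppy": ["dog"],
    "seashore": ["beach"],
    "man": ["person"],
    "jogging": ["running"],
}


def select_concepts(keywords: List[str]) -> List[str]:
    """Map query keywords to concepts in the bank, falling back to synonyms."""
    selected: List[str] = []
    for keyword in keywords:
        if keyword in CONCEPT_BANK:
            candidates = [keyword]
        else:
            # Keyword has no classifier of its own: try its synonyms.
            candidates = [c for c in SYNONYMS.get(keyword, []) if c in CONCEPT_BANK]
        for concept in candidates:
            if concept not in selected:
                selected.append(concept)
    return selected


def score_shot(concept_scores: Dict[str, float], concepts: List[str]) -> float:
    """Average the classifier scores of the selected concepts (an assumed fusion rule)."""
    if not concepts:
        return 0.0
    return sum(concept_scores.get(c, 0.0) for c in concepts) / len(concepts)


def rank_shots(shots: Dict[str, Dict[str, float]], keywords: List[str]) -> List[str]:
    """Return shot IDs sorted from most to least relevant to the query keywords."""
    concepts = select_concepts(keywords)
    return sorted(shots, key=lambda sid: score_shot(shots[sid], concepts), reverse=True)
```

In this sketch, a keyword with no matching classifier and no usable synonym is simply dropped, which is why a larger concept bank raises keyword coverage.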