IEICE Transactions on Information and Systems
Online ISSN : 1745-1361
Print ISSN : 0916-8532
Regular Section
VisualTextualRank: An Extension of VisualRank to Large-Scale Video Shot Extraction Exploiting Tag Co-occurrence
Nga H. DOKeiji YANAI
Author information

2015 Volume E98.D Issue 1 Pages 166-172


In this paper, we propose a novel ranking method called VisualTextualRank which ranks media data according to the relevance between the data and specified keywords. We apply our method to the system of video shot ranking which aims to automatically obtain video shots corresponding to given action keywords from Web videos. The keywords can be any type of action such as “surfing wave” (sport action) or “brushing teeth” (daily activity). Top ranked video shots are expected to be relevant to the keywords. While our baseline exploits only visual features of the data, the proposed method employs both textual information (tags) and visual features. Our method is based on random walks over a bipartite graph to integrate visual information of video shots and tag information of Web videos effectively. Note that instead of treating the textual information as an additional feature for shot ranking, we explore the mutual reinforcement between shots and textual information of their corresponding videos to improve shot ranking. We validated our framework on a database which was used by the baseline. Experiments showed that our proposed ranking method, VisualTextualRank, improved significantly the performance of the system of video shot extraction over the baseline.

Information related to the author
© 2015 The Institute of Electronics, Information and Communication Engineers
Previous article Next article