IEICE Transactions on Information and Systems
Online ISSN : 1745-1361
Print ISSN : 0916-8532
Regular Section
Finding Important People in a Video Using Deep Neural Networks with Conditional Random Fields
Mayu OTANI, Atsushi NISHIDA, Yuta NAKASHIMA, Tomokazu SATO, Naokazu YOKOYA

2018 Volume E101.D Issue 10 Pages 2509-2517

Abstract

Finding important regions is essential for applications such as content-aware video compression and video retargeting, which automatically crops a region of a video for display on small screens. Since people are one of the main subjects when taking a video, some methods for finding important regions use a visual attention model based on face/pedestrian detection to incorporate the knowledge that people are important. However, such methods usually do not distinguish important people from passers-by and bystanders, which results in false positives. In this paper, we propose a deep neural network (DNN)-based method that classifies each person as important or unimportant, given a video that contains multiple people in a single frame and is captured with a hand-held camera. Intuitively, the important/unimportant labels of two people are highly correlated when their spatial motions are similar. Based on this assumption, we propose to boost the performance of our important/unimportant classification by using conditional random fields (CRFs) built upon the DNN, which can be trained in an end-to-end manner. Our experimental results show that our method successfully classifies important people and that the use of a DNN with CRFs improves the accuracy.
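
The idea of coupling per-person scores through motion similarity can be illustrated with a small sketch. The abstract does not give the actual network, potentials, or inference procedure, so everything below is an assumption made for illustration: the unary scores stand in for DNN outputs, the Gaussian motion affinity and its bandwidth are hypothetical choices, and mean-field inference over a Potts-style pairwise term is one common way to approximate CRF marginals, not necessarily the one used in the paper.

```python
import numpy as np


def softmax(x):
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = x - x.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def motion_affinity(motions, bandwidth=1.0):
    """Gaussian affinity between per-person motion vectors.

    ASSUMPTION: the kernel form and bandwidth are illustrative only.
    motions: (N, D) array, one motion descriptor per person.
    """
    diff = motions[:, None, :] - motions[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    w = np.exp(-sq_dist / (2.0 * bandwidth ** 2))
    np.fill_diagonal(w, 0.0)  # a person does not send a message to itself
    return w


def mean_field(unary_logits, affinity, n_iters=10):
    """Approximate marginals of a binary pairwise CRF by mean-field updates.

    The pairwise term rewards two people for sharing a label, weighted by
    their affinity (Potts-style). ASSUMPTION: hypothetical model, see lead-in.
    unary_logits: (N, 2) scores for [unimportant, important].
    affinity: (N, N) symmetric, non-negative weights.
    """
    q = softmax(unary_logits)
    for _ in range(n_iters):
        # Each person receives label support from people who move similarly.
        message = affinity @ q
        q = softmax(unary_logits + message)
    return q


# Hypothetical scores standing in for DNN outputs, for three detected people.
unary = np.array([
    [0.0, 3.0],   # person 0: confidently important
    [0.2, 0.0],   # person 1: weakly unimportant
    [2.0, 0.0],   # person 2: confidently unimportant
])
# Persons 0 and 1 move almost identically; person 2 moves differently.
motions = np.array([
    [1.0, 0.0],
    [1.1, 0.0],
    [-3.0, 2.0],
])

unary_labels = unary.argmax(axis=1).tolist()
q = mean_field(unary, motion_affinity(motions))
labels = q.argmax(axis=1).tolist()
print("unary only:", unary_labels)
print("with CRF:  ", labels)
```

In this toy setting, person 1 is weakly scored on its own, but its motion closely matches that of person 0, so the pairwise term pulls its label toward that of person 0, while person 2 is left essentially unaffected. Because every step is differentiable, updates of this kind can in principle be unrolled on top of a network and trained jointly with it, which is the general sense of end-to-end training referred to in the abstract.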

© 2018 The Institute of Electronics, Information and Communication Engineers