IEICE Transactions on Information and Systems
Online ISSN : 1745-1361
Print ISSN : 0916-8532
Regular Section
Contextual Max Pooling for Human Action Recognition
Zhong ZHANGShuang LIUXing MEI
Author information
JOURNAL FREE ACCESS

2015 Volume E98.D Issue 4 Pages 989-993

Details
Abstract

The bag-of-words model (BOW) has been extensively adopted by recent human action recognition methods. The pooling operation, which aggregates local descriptor encodings into a single representation, is a key determiner of the performance of the BOW-based methods. However, the spatio-temporal relationship among interest points has rarely been considered in the pooling step, which results in the imprecise representation of human actions. In this paper, we propose a novel pooling strategy named contextual max pooling (CMP) to overcome this limitation. We add a constraint term into the objective function under the framework of max pooling, which forces the weights of interest points to be consistent with their probabilities. In this way, CMP explicitly considers the spatio-temporal contextual relationships among interest points and inherits the positive properties of max pooling. Our method is verified on three challenging datasets (KTH, UCF Sports and UCF Films datasets), and the results demonstrate that our method achieves better results than the state-of-the-art methods in human action recognition.

Content from these authors
© 2015 The Institute of Electronics, Information and Communication Engineers
Previous article Next article
feedback
Top