2014 Volume 134 Issue 5 Pages 634-642
Remote surveillance of large-scale facilities such as power plants and building complexes is important for preventing serious attacks and incidents. Automatic human action recognition can reduce the burden of surveillance. Multi-view video is useful for human action recognition because it provides robustness to changes in a person's appearance caused by orientation and occlusion. One problem with conventional multi-view action recognition is that it requires both detection and tracking before action recognition. Human pose and motion vary with the person's action, and such variation may cause detection and tracking errors. To solve this problem, previous work has proposed simultaneous action recognition and location estimation for single-view video using Hough voting. In this paper, we extend the Hough voting approach to simultaneous multi-view action recognition and location estimation. In each view, our proposed method independently casts votes for the action labels and the spatio-temporal reference positions of actions, and then integrates the votes across views using homographic transformations. We evaluated our method and confirmed that it achieved high accuracy in both action recognition and location estimation. The contribution of this paper is that it enables multi-view action recognition without prior human detection and tracking.
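The vote integration step described in the abstract might be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: it assumes each camera has a known 3x3 homography to a shared ground plane, that votes are already available as image positions with a frame index, an action label, and a weight, and that the accumulator is a plain grid. The function names `to_ground`, `accumulate_votes`, and `best_hypothesis` are hypothetical, as is the choice of a ground-plane grid as the common vote space.

```python
import numpy as np

def to_ground(H, xy):
    """Map image points (N, 2) into the shared plane with a 3x3 homography H."""
    pts = np.hstack([xy, np.ones((len(xy), 1))]) @ H.T
    return pts[:, :2] / pts[:, 2:3]          # divide by the homogeneous coordinate

def accumulate_votes(views, n_labels, shape):
    """Sum per-view votes in one accumulator indexed by [label, t, y, x].

    views: list of (H, xy, t, labels, weights), one tuple per camera (assumed layout).
    shape: (T, Y, X) size of the shared spatio-temporal grid.
    """
    T, Y, X = shape
    acc = np.zeros((n_labels, T, Y, X))
    for H, xy, t, labels, w in views:
        g = np.rint(to_ground(H, np.asarray(xy, dtype=float))).astype(int)
        t, labels, w = np.asarray(t), np.asarray(labels), np.asarray(w, dtype=float)
        # Keep only votes that land inside the grid.
        ok = ((g[:, 0] >= 0) & (g[:, 0] < X) & (g[:, 1] >= 0) & (g[:, 1] < Y)
              & (t >= 0) & (t < T))
        # np.add.at accumulates correctly even when several votes hit one cell.
        np.add.at(acc, (labels[ok], t[ok], g[ok, 1], g[ok, 0]), w[ok])
    return acc

def best_hypothesis(acc):
    """Return (label, t, y, x) of the strongest cell in the accumulator."""
    label, t, y, x = np.unravel_index(np.argmax(acc), acc.shape)
    return int(label), int(t), int(y), int(x)
```

Because the action label and the location are read from the same maximum, recognition and localization happen together, without a separate detection or tracking stage. A practical system would likely smooth the accumulator before taking the maximum; that step is omitted here.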
The transactions of the Institute of Electrical Engineers of Japan.C
The Journal of the Institute of Electrical Engineers of Japan