ITE Transactions on Media Technology and Applications
Online ISSN : 2186-7364
ISSN-L : 2186-7364
Special Section on Sports Information Processing Technology and Its Application
[Papers] Multimodal Important Scene Detection in Far-view Soccer Videos Based on Single Deep Neural Architecture
Tomoki Haruyama, Sho Takahashi, Takahiro Ogawa, Miki Haseyama

2020 Volume 8 Issue 2 Pages 89-99

Abstract

The details of a soccer match can be estimated from its visual and audio sequences, and these details correspond to the occurrence of important scenes. These sequences are therefore well suited to important scene detection. This paper presents a new multimodal method that detects important scenes from the visual and audio sequences of far-view soccer videos using a single deep neural architecture. The distinguishing feature of the method is that multiple classifiers are realized by a single deep neural architecture comprising a Convolutional Neural Network-based feature extractor and a Support Vector Machine-based classifier. This design addresses the difficulty of simultaneously optimizing multiple different deep neural architectures from a small amount of training data. The confidence measures that this architecture outputs for the multimodal data are then monitored and integrated to obtain the final classification result.
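
The final integration step can be illustrated with a minimal sketch. The abstract does not state how the confidence measures are integrated, so the weighted average below is an assumption chosen only for illustration, not the authors' rule. The function name `fuse_confidences`, the `visual_weight` parameter, and the zero threshold are likewise hypothetical; the inputs stand for signed per-segment confidence scores, such as signed distances from an SVM decision boundary.

```python
from typing import List, Sequence, Tuple


def fuse_confidences(
    visual: Sequence[float],
    audio: Sequence[float],
    visual_weight: float = 0.5,
    threshold: float = 0.0,
) -> Tuple[List[float], List[bool]]:
    """Fuse per-segment confidence scores from two modalities.

    ASSUMPTION: a weighted average is used here for illustration only;
    the abstract does not specify the integration rule.
    """
    if len(visual) != len(audio):
        raise ValueError("visual and audio must cover the same segments")
    if not 0.0 <= visual_weight <= 1.0:
        raise ValueError("visual_weight must lie in [0, 1]")

    audio_weight = 1.0 - visual_weight
    # One fused score per video segment.
    fused = [visual_weight * v + audio_weight * a for v, a in zip(visual, audio)]
    # A segment is flagged as important when its fused score exceeds the threshold.
    labels = [score > threshold for score in fused]
    return fused, labels


if __name__ == "__main__":
    # Hypothetical scores for three segments; not data from the paper.
    scores, important = fuse_confidences([1.0, -1.0, 0.5], [0.0, -0.5, -1.5])
    print(scores, important)
```

With equal weights, a segment is flagged only when the two modalities agree on average, so a strong negative audio score can override a weak positive visual score.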

© 2020 The Institute of Image Information and Television Engineers