2021, Volume 9, Issue 1, pp. 42-53
This paper presents a new method that generates user-selectable event summaries from unedited raw soccer videos. Unedited raw soccer videos outnumber broadcast or distributed soccer videos and attract viewers with varied interests, so these videos must be analyzed to meet the demands of those viewers. The proposed method introduces a multimodal CNN-BiLSTM architecture for this analysis. The architecture extracts candidate scenes for event summarization from unedited soccer videos and classifies these scenes into typical events. Finally, the method generates user-selectable event summaries by jointly considering the importance of the candidate scenes and the event classification results. Experimental results on real unedited raw soccer videos show the effectiveness of the method.
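The final step, combining scene importance with event classification results to build a summary the user can steer, can be illustrated with a minimal sketch. This is not the paper's algorithm: the abstract does not specify how the two signals are combined, so the greedy selection rule, the `Scene` fields, the event labels, and the duration budget below are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Scene:
    """A candidate scene (hypothetical structure, not taken from the paper)."""
    start: float       # start time in seconds
    end: float         # end time in seconds
    importance: float  # assumed importance score, higher is more important
    event: str         # assumed predicted event label, e.g. "goal"


def select_summary(scenes, wanted_events, max_duration):
    """Greedily pick the most important scenes of the requested event types.

    Scenes whose predicted event is not in `wanted_events` are dropped.
    The rest are visited in descending importance and kept while the total
    length stays within `max_duration` seconds. The result is returned in
    chronological order.
    """
    candidates = [s for s in scenes if s.event in wanted_events]
    chosen = []
    used = 0.0
    for scene in sorted(candidates, key=lambda s: s.importance, reverse=True):
        length = scene.end - scene.start
        if used + length <= max_duration:
            chosen.append(scene)
            used += length
    return sorted(chosen, key=lambda s: s.start)
```

Under these assumptions, a viewer interested only in goals and shots would call `select_summary(scenes, {"goal", "shot"}, max_duration=60.0)`, while another viewer could pass a different set of event labels to obtain a different summary from the same candidate scenes.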