Host: The Japanese Society for Artificial Intelligence
Name: 34th Annual Conference, 2020
Number: 34
Location: Online
Date: June 9, 2020 - June 12, 2020
With the spread of digital terminals such as smartphones and tablets, the number of videos available for users to watch has become enormous. In this context, applications such as classification, retrieval, and distribution of personalized video content that meet consumer needs remain challenging. In general, humans tend to choose movies and music based on their emotional characteristics, so analyzing evoked emotion may provide a guideline for these tasks. Emotions evoked by video are related to both the audio and visual modalities. In this study, we propose a deep learning model that estimates movie-evoked emotion by integrating multimodal information. Experiments on a movie database verify how estimation performance changes as multimodal information is integrated, and show that accuracy improves over the conventional method. In addition, we analyze Autonomous Sensory Meridian Response (ASMR) videos, which have recently attracted attention, and examine the relationship between evoked emotion and viewer behavior such as the number of views and the like/dislike ratio.
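The abstract does not specify how the audio and visual streams are combined; as a rough illustration only, the sketch below assumes a simple late-fusion design in which pooled audio and visual feature vectors are encoded separately, concatenated, and regressed to valence/arousal scores. The feature dimensions, the two-dimensional output, and the PyTorch framing are all hypothetical and not taken from the paper.

```python
import torch
import torch.nn as nn

class LateFusionEmotionModel(nn.Module):
    """Illustrative late-fusion model (assumed, not the authors' architecture):
    separate encoders for audio and visual features, whose outputs are
    concatenated and mapped to emotion scores such as valence and arousal."""

    def __init__(self, audio_dim=128, visual_dim=512, hidden_dim=256, n_outputs=2):
        super().__init__()
        self.audio_encoder = nn.Sequential(
            nn.Linear(audio_dim, hidden_dim), nn.ReLU())
        self.visual_encoder = nn.Sequential(
            nn.Linear(visual_dim, hidden_dim), nn.ReLU())
        self.regressor = nn.Sequential(
            nn.Linear(2 * hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, n_outputs))  # e.g. (valence, arousal)

    def forward(self, audio_feat, visual_feat):
        a = self.audio_encoder(audio_feat)
        v = self.visual_encoder(visual_feat)
        # Late fusion: concatenate modality embeddings before regression.
        return self.regressor(torch.cat([a, v], dim=-1))

# Example with random per-clip features (dimensions are placeholders).
model = LateFusionEmotionModel()
audio = torch.randn(4, 128)    # e.g. pooled audio embeddings per clip
visual = torch.randn(4, 512)   # e.g. pooled frame embeddings per clip
scores = model(audio, visual)  # shape (4, 2)
```

Late fusion is only one possible integration strategy; early fusion of frame-level features or attention-based fusion would fit the same problem statement.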