Host: The Japanese Society for Artificial Intelligence
Name : The 98th SIG-SLUD
Number : 98
Location : [in Japanese]
Date : September 03, 2023 - September 04, 2023
Pages 59-65
In recent years, remote meetings have become widely used, and solutions for analyzing spoken dialogue are becoming more widespread. For example, in business negotiations, a key focus is developing solutions that assess conversational skills by extracting and summarizing linguistic features. Paralinguistic information is also valuable for analyzing spoken dialogue, because it offers insight into a speaker's impressions and emotions, which manifest in conversational cues such as laughter, filler words, and gasps. In this work, we propose a laughter detection system based on a pre-trained speech recognition model and a semi-supervised learning method that uses weakly labeled data, in which the boundaries of the laughter intervals are unclear. We conduct experiments on a corpus of Japanese conversation and show that the proposed method outperforms conventional methods. We also discuss the diversity of speakers, cultures, and recording environments as factors affecting laughter detection performance.