Detection of Discourse Segments Using Multimodal Features (マルチモーダル特徴量を用いた談話セグメントの検出)

冨山 健; 二瓶 芙巳雄; 高瀬 裕; 中野 有紀子

doi:10.11517/pjsai.JSAI2019.0_4F3OS11b02

33rd (2019)

Session ID : 4F3-OS-11b-02

DOI https://doi.org/10.11517/pjsai.JSAI2019.0_4F3OS11b02

Conference information

Host: The Japanese Society for Artificial Intelligence

Name : The 33rd Annual Conference of the Japanese Society for Artificial Intelligence, 2019

Number : 33

Location : [in Japanese]

Date : June 04, 2019 - June 07, 2019

Identifying Discourse Boundaries in Group Discussions using Multimodal Features

Ken TOMIYAMA, *Fumio NIHEI, Yutaka TAKASE, Yukiko NAKANO


Keywords: conversation segmentation, multimodal, group discussion

CONFERENCE PROCEEDINGS FREE ACCESS


Abstract

This study proposes models for detecting conversation boundaries in group discussions. In the first method, we created a multimodal embedding space using an autoencoder and applied a similarity-based approach to detect discussion boundaries. In the second method, we annotated conversation boundaries and created unimodal CNN models for language, audio, and head motion information. We then created multimodal models by concatenating the outputs of the unimodal models. In the evaluation experiment, we found that language was the most useful single modality, but that combining it with the audio and head motion modalities allowed the CNN-based models to predict conversation boundaries more accurately.
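The similarity-based approach mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the similarity measure, the windowing, or the decision rule, so everything below is an assumption made for illustration: the function names (`cosine`, `mean_vector`, `detect_boundaries`), the `window` and `threshold` parameters, and the use of cosine similarity between the mean embeddings of adjacent windows. The input vectors stand in for per-utterance embeddings that, in the paper, would come from the autoencoder.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def detect_boundaries(embeddings, window=2, threshold=0.5):
    """Return each index i where a boundary is hypothesised just before embeddings[i].

    Assumed rule (not taken from the paper): at every gap, compare the mean of
    the `window` embeddings before it with the mean of the `window` embeddings
    after it. A gap is reported when that similarity is below `threshold` and
    is not larger than the similarity at either neighbouring gap.
    """
    gaps = range(window, len(embeddings) - window + 1)
    sims = {
        i: cosine(
            mean_vector(embeddings[i - window:i]),
            mean_vector(embeddings[i:i + window]),
        )
        for i in gaps
    }
    boundaries = []
    for i in gaps:
        left = sims.get(i - 1, float("inf"))   # no neighbour at the edges
        right = sims.get(i + 1, float("inf"))
        if sims[i] < threshold and sims[i] <= left and sims[i] <= right:
            boundaries.append(i)
    return boundaries


# Toy input: four utterances on one "topic" followed by four on another.
toy = [[1.0, 0.0]] * 4 + [[0.0, 1.0]] * 4
print(detect_boundaries(toy))
```

On the toy input the only reported boundary is at index 4, where the two windows are orthogonal. Real embeddings would need tuned values for `window` and `threshold`.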

Corresponding author

