A Method for Estimating the Emotions of Individual Speakers from Multi-Person Audio Data

岡崎 貫治; 綿貫 啓一

doi:10.1299/jsmemecj.2024.S121-02

Abstract

Many technologies have been proposed for identifying speakers and transcribing speech in multi-person conversation data, such as meetings, and several companies offer commercial services. These services continue to improve, and so does the accuracy of speaker identification and transcription. Some services also attempt to estimate the emotions of individual speakers. However, such emotion estimation is limited to one-on-one audio data, such as call-center calls, and does not extend to multi-person conversation data. For multi-person conversations, judging whether a discussion was heated still relies on textual information alone, which makes it difficult to infer the emotional context accurately. This study therefore attempts to identify speakers in multi-person conversation data, separate or capture the corresponding audio, and estimate the emotions of the identified speakers. As a result, it was possible to identify sections in which a given speaker spoke alone and to estimate the emotions in those sections.
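
The abstract does not describe how single-speaker sections are located, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes a speaker-identification step has already produced segments of the form (speaker, start, end) in seconds, and it uses a simple sweep over segment boundaries to find the intervals in which exactly one speaker is active. The function name `solo_sections` and the segment format are hypothetical, and the emotion-estimation step is not shown.

```python
from collections import defaultdict


def solo_sections(segments):
    """Return {speaker: [(start, end), ...]} for intervals where only that speaker talks.

    `segments` is an assumed input format: an iterable of (speaker, start, end)
    tuples, e.g. the output of some speaker-identification step.
    """
    events = []
    for speaker, start, end in segments:
        if end > start:  # ignore empty or inverted segments
            events.append((start, 1, speaker))
            events.append((end, -1, speaker))
    # Sort by time; at equal times, segment ends (-1) come before starts (+1).
    events.sort(key=lambda e: (e[0], e[1]))

    active = defaultdict(int)   # speaker -> number of currently open segments
    solo = defaultdict(list)    # speaker -> list of solo intervals
    prev_time = None
    for time, delta, speaker in events:
        if prev_time is not None and time > prev_time:
            current = [s for s, n in active.items() if n > 0]
            if len(current) == 1:
                s = current[0]
                if solo[s] and solo[s][-1][1] == prev_time:
                    # Extend the previous interval when it is contiguous.
                    solo[s][-1] = (solo[s][-1][0], time)
                else:
                    solo[s].append((prev_time, time))
        active[speaker] += delta
        prev_time = time
    return dict(solo)


# Hypothetical example: A and B overlap between 3 s and 5 s.
example = [("A", 0.0, 5.0), ("B", 3.0, 8.0), ("A", 9.0, 10.0)]
result = solo_sections(example)
```

Under these assumptions, each returned interval could then be cut from the recording and passed to whichever emotion estimator is in use; overlapping speech (3 s to 5 s in the example) is simply excluded.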
