Effect of Sound Source Location on Accuracy in Diarization in Multi-Person Dialogue Environment

Uemura Kaito; Horio Keiichi

doi:10.14864/fss.40.0_215

40th Fuzzy System Symposium

Session ID : 1G2-3

Conference information

Host: Japan Society for Fuzzy Theory and Intelligent Informatics (SOFT)

Name : 40th Fuzzy System Symposium

Number : 40

Date : September 02, 2024 - September 04, 2024

Effect of Sound Source Location on Accuracy in Diarization in Multi-Person Dialogue Environment

*UEMURA KAITO, HORIO KEIICHI

Abstract

In recent years, a speech segment detection technique called speaker diarization has become increasingly important, mainly for conferences, news, and telephone calls. However, conventional neural-network-based methods for detecting speaker segments require a huge amount of training data. In this study, training data were created by recording the speech of two speakers of the same gender, then splitting and combining the recordings into synthetic utterances. Tests on non-synthesized speech were then used to examine how the distance and angle of the sound source relative to the microphone affect accuracy.
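
The split-and-combine step described in the abstract can be pictured with a minimal sketch. The abstract does not give segment lengths, ordering rules, or whether overlapping speech is simulated, so everything below is an assumption made for illustration: fixed-length segments, random ordering, and plain concatenation without overlap. The function names `split_segments` and `make_synthetic_dialogue` are hypothetical and are not taken from the paper.

```python
import random


def split_segments(samples, segment_len):
    """Cut a recording into consecutive fixed-length segments (any short tail is dropped)."""
    return [samples[i:i + segment_len]
            for i in range(0, len(samples) - segment_len + 1, segment_len)]


def make_synthetic_dialogue(speaker_a, speaker_b, segment_len, seed=0):
    """Combine segments from two single-speaker recordings into one stream.

    Assumption: segments are shuffled and concatenated without overlap.
    Returns (mixture, labels), where labels[i] names the speaker of mixture[i].
    """
    rng = random.Random(seed)
    # Tag each segment with its speaker so the labels survive shuffling.
    tagged = [("A", seg) for seg in split_segments(speaker_a, segment_len)]
    tagged += [("B", seg) for seg in split_segments(speaker_b, segment_len)]
    rng.shuffle(tagged)

    mixture, labels = [], []
    for speaker, seg in tagged:
        mixture.extend(seg)
        labels.extend([speaker] * len(seg))
    return mixture, labels


# Toy stand-ins for real audio: constant-valued "recordings" of two speakers.
rec_a = [0.1] * 10
rec_b = [-0.1] * 8
mixture, labels = make_synthetic_dialogue(rec_a, rec_b, segment_len=4)
```

The per-sample labels play the role of the reference annotation a diarization model would be trained against. Real recordings would replace the toy lists, and a real pipeline would likely work on frames rather than individual samples.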
