Proceedings of the Annual Conference of JSAI
Online ISSN : 2758-7347
39th (2025)
Session ID : 3G5-GS-6-02

A Speech Dialogue System Utilizing Voice Activity Prediction and Objective Evaluation of Naturalness
*Eisaku HIGUCHI, Tomoyuki YAMAMOTO, Shigeto YOSHIDA
CONFERENCE PROCEEDINGS FREE ACCESS

Abstract

With advances in natural language processing, dialogue systems that handle continuous speech are becoming increasingly common. In systems that produce backchannel responses, however, response delays and interruptions of the user's speech can disrupt the natural flow of conversation. Evaluating such systems is also difficult, because backchannels are hard to separate from the main dialogue. In this study, we focus on turn-taking to achieve natural interaction that includes backchanneling, and we develop a dialogue system based on Voice Activity Projection (VAP). The system predicts the start and end times of utterances, which allows it to distinguish backchannels from interruptive speech. Experiments confirmed an improvement in naturalness, suggesting that the approach is effective for future dialogue system development.

© 2025 The Japanese Society for Artificial Intelligence