Host: The Japanese Society for Artificial Intelligence
Name : The 32nd Annual Conference of the Japanese Society for Artificial Intelligence, 2018
Number : 32
Location : [in Japanese]
Date : June 05, 2018 - June 08, 2018
This article discusses turn-taking state estimation for determining the utterance timing of a spoken dialog system. We propose a recurrent neural network-based method that estimates the user's turn-taking state incrementally. The proposed method uses acoustic features extracted with a spectrogram autoencoder together with linguistic features extracted from partial speech recognition results by a neural network-based language model. The article presents an example estimation result and discusses the performance of the proposed method.
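
The incremental estimation described above can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not specify the network type, the layer sizes, or the set of turn-taking states, so the simple (Elman) recurrence, the two state labels, all dimensions, and the names `IncrementalTurnTakingEstimator` and `run_demo` are assumptions made only for illustration. The weights are random rather than trained, and random vectors stand in for the autoencoder and language model features.

```python
import math
import random

# Hypothetical label set; the abstract does not list the turn-taking states.
STATES = ["user_holds_turn", "user_yields_turn"]


def make_matrix(rows, cols, rng):
    """Random weight matrix standing in for trained parameters."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class IncrementalTurnTakingEstimator:
    """Simple (Elman) RNN that emits a state distribution at every frame."""

    def __init__(self, acoustic_dim, linguistic_dim, hidden_dim, seed=0):
        rng = random.Random(seed)
        input_dim = acoustic_dim + linguistic_dim
        self.w_in = make_matrix(hidden_dim, input_dim, rng)
        self.w_rec = make_matrix(hidden_dim, hidden_dim, rng)
        self.w_out = make_matrix(len(STATES), hidden_dim, rng)
        self.hidden = [0.0] * hidden_dim

    def step(self, acoustic, linguistic):
        """Consume one frame of features and return state probabilities."""
        # Concatenate the two feature streams into one input vector.
        x = list(acoustic) + list(linguistic)
        from_input = matvec(self.w_in, x)
        from_history = matvec(self.w_rec, self.hidden)
        # The hidden state carries information from all earlier frames.
        self.hidden = [math.tanh(a + b) for a, b in zip(from_input, from_history)]
        return softmax(matvec(self.w_out, self.hidden))


def run_demo(num_frames=5):
    """Feed placeholder features frame by frame and collect the estimates."""
    rng = random.Random(1)
    model = IncrementalTurnTakingEstimator(acoustic_dim=4, linguistic_dim=3, hidden_dim=8)
    outputs = []
    for _ in range(num_frames):
        # Stand-in for the spectrogram autoencoder's encoded features.
        acoustic = [rng.random() for _ in range(4)]
        # Stand-in for language model features of the partial recognition result.
        linguistic = [rng.random() for _ in range(3)]
        outputs.append(model.step(acoustic, linguistic))
    return outputs


if __name__ == "__main__":
    for frame, probs in enumerate(run_demo()):
        print(frame, dict(zip(STATES, (round(p, 3) for p in probs))))
```

Because `step` is called once per frame and keeps its hidden state between calls, an estimate is available before the user's utterance ends, which is what allows a dialog system to decide its utterance timing while speech recognition is still in progress.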