Host: The Japanese Society for Artificial Intelligence
Name: The 38th Annual Conference of the Japanese Society for Artificial Intelligence
Number: 38
Location: [in Japanese]
Date: May 28, 2024 - May 31, 2024
Large language models (LLMs) are flexible and can handle a wide range of natural language processing tasks. Many spoken dialogue systems are therefore built by linking a dialogue model based on an LLM with other modules, such as speech recognition and speech synthesis systems. However, such a cascaded model composed of multiple modules is complicated, and errors made by one module tend to propagate to the next. Moreover, because the modules are connected through discrete representations such as text, the model cannot take into account subtle non-verbal cues in the dialogue. This research aims to solve these problems by converting the input speech into continuous vector representations and connecting them directly to the dialogue model. The experimental results show that the generated sentences do not yet fully take the dialogue context into account, leaving room for improvement, but natural sentence generation was learned, suggesting that a dialogue model based on continuous representations is feasible.
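The abstract does not specify the architecture, but the core idea of feeding continuous speech representations to a dialogue model instead of text can be sketched as a learned linear adapter that projects speech-encoder frames into the LM's embedding space. All names and dimensions below are illustrative assumptions, not the authors' actual implementation; in practice the projection weights would be trained jointly with or alongside the dialogue model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: 80-dim speech-encoder frames, 256-dim LM embeddings.
D_SPEECH, D_MODEL = 80, 256

# Linear adapter projecting continuous speech features into the dialogue
# model's embedding space (random stand-ins for learned weights).
W = rng.normal(scale=0.02, size=(D_SPEECH, D_MODEL))
b = np.zeros(D_MODEL)

def project_speech(frames: np.ndarray) -> np.ndarray:
    """Map (T, D_SPEECH) speech-encoder frames to (T, D_MODEL) LM inputs."""
    return frames @ W + b

# 50 frames of continuous speech features (e.g. from a pretrained encoder).
speech_frames = rng.normal(size=(50, D_SPEECH))
speech_embeds = project_speech(speech_frames)

# Prepend the projected speech to the text-token embeddings of the dialogue
# prompt, so the LM conditions on speech without a discrete (text) bottleneck.
prompt_embeds = rng.normal(size=(10, D_MODEL))  # stand-in token embeddings
lm_input = np.concatenate([speech_embeds, prompt_embeds], axis=0)
print(lm_input.shape)  # (60, 256)
```

Because the speech signal never passes through a discrete transcript, recognition errors cannot propagate as hard decisions, and non-verbal information carried by the continuous features remains available to the dialogue model.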