JSAI Technical Report, SIG-SLUD
Online ISSN : 2436-4576
Print ISSN : 0918-5682
99th Meeting (Dec. 2023)

Hagi bot: A Multimodal Dialogue System for Smooth Discussion with Human-like Behavior and Dialogue State Tracking Using LLM
Yuto NAKANO, Shinnosuke NOZUE, Kazuma KOKUTA, Tomoki ARIYAMA, Kai SATO, Shusaku SONE, Ryohei KAMEI, Suchun XIE, Fuka NARITA, Shoji MORIYA, Reina AKAMA, Yuichiroh MATSUBAYASHI, Keisuke SAKAGUCHI

Pages 102-107

Abstract

This paper describes "hagi bot," a system submitted to the Sixth Dialogue System Live Competition. It is a task-oriented multimodal dialogue system that integrates a response generation module with an avatar control module. In the response generation module, GPT-4 generates responses together with emotion and action labels, conditioned on the dialogue history and the topics to be discussed. By monitoring the dialogue state through slot filling and continuously updating the prompts according to the situation, the module achieves natural dialogue progression. In the avatar control module, the voice (pitch, volume, and speaking speed), facial expressions, and gestures are controlled by predefined rules, designed with reference to models such as Russell's circumplex model of affect, according to the content of each utterance and its emotion and action labels. This enables human-like, natural behavior. Combining the two modules yields responses appropriate to the situation and natural behaviors grounded in the content and emotion of the utterances. The system won first place in the preliminary round.
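The sketch below illustrates, under stated assumptions, the two mechanisms the abstract mentions: dialogue state tracking via slot filling that switches prompts as the discussion progresses, and rule-based avatar control that maps a generated emotion label to voice, expression, and gesture parameters in the spirit of Russell's circumplex model. The slot names, phase logic, emotion labels, prompt text, and parameter values are all illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (not the authors' code): hypothetical slots, labels, and values.
from dataclasses import dataclass, field

# --- Dialogue state tracking via slot filling (hypothetical slots) ---------
@dataclass
class DialogueState:
    slots: dict = field(default_factory=lambda: {
        "topic_introduced": False,   # has the discussion topic been presented?
        "user_opinion": None,        # user's stated position, if any
        "agreement_reached": False,  # has the pair converged on a conclusion?
    })

    def phase(self) -> str:
        """Map the filled slots to a coarse dialogue phase used to pick a prompt."""
        if not self.slots["topic_introduced"]:
            return "introduce_topic"
        if self.slots["user_opinion"] is None:
            return "elicit_opinion"
        if not self.slots["agreement_reached"]:
            return "discuss"
        return "wrap_up"

# Prompts are switched as the state changes (placeholder text, not the real prompts).
PROMPTS = {
    "introduce_topic": "Introduce today's discussion topic and invite the user to speak.",
    "elicit_opinion":  "Ask the user for their opinion on the topic.",
    "discuss":         "Respond to the user's opinion and move the discussion forward.",
    "wrap_up":         "Summarize the discussion and close politely.",
}

# --- Rule-based avatar control from emotion labels -------------------------
# Each emotion label is associated with voice and motion settings, laid out
# roughly along valence/arousal axes as in Russell's circumplex model.
# The labels and numbers here are assumptions for illustration only.
AVATAR_RULES = {
    "joy":      dict(pitch=+2, volume=+1, rate=1.1, expression="smile",     gesture="nod"),
    "surprise": dict(pitch=+3, volume=+2, rate=1.2, expression="wide_eyes", gesture="lean_forward"),
    "sadness":  dict(pitch=-2, volume=-1, rate=0.9, expression="frown",     gesture="look_down"),
    "neutral":  dict(pitch=0,  volume=0,  rate=1.0, expression="neutral",   gesture="idle"),
}

def control_avatar(emotion_label: str) -> dict:
    """Return the voice/expression/gesture settings for a generated emotion label."""
    return AVATAR_RULES.get(emotion_label, AVATAR_RULES["neutral"])

# Example: at the start of a session the "introduce_topic" prompt is selected,
# and a response labeled "joy" is spoken slightly faster and higher-pitched,
# with a smiling expression and a nodding gesture.
state = DialogueState()
print(PROMPTS[state.phase()])
print(control_avatar("joy"))
```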

© 2023 The Japanese Society for Artificial Intelligence