-
Rei KAWAKATSU, Toru SUGIMOTO
Article type: SIG paper
Pages
01-06
Published: October 27, 2025
Released on J-STAGE: October 27, 2025
CONFERENCE PROCEEDINGS
RESTRICTED ACCESS
This study aims to construct a listening dialogue system that gives users higher satisfaction by focusing on active listening skills and the context of the conversation. Our proposed system generates five types of responses (repeated responses, interjections, summarizing responses, follow-up questions, and empathetic responses) based on active listening skills. Furthermore, by utilizing dialogue history for follow-up questions and empathetic responses, it generates contextually appropriate responses. In an evaluation experiment in which subjects used the proposed system, we confirmed improvements in the system's listening quality (attentiveness and comprehension of the conversation), in users' satisfaction with the dialogue, and in their willingness to continue using it.
-
Yueliang LIU, Zhiyang QI, Michimasa INABA
Article type: SIG paper
Pages
07-11
Published: October 27, 2025
Released on J-STAGE: October 27, 2025
CONFERENCE PROCEEDINGS
RESTRICTED ACCESS
While Large Language Models (LLMs) show significant potential in psychological counseling, rigorous, multi-dimensional evaluation of dialogue quality is paramount to ensure service reliability and professional accountability. This critical assessment is necessary to identify best practices, continuously improve LLM performance, and build user trust in automated generative mental health support. Addressing the challenge of conducting this complex evaluation effectively, we introduce a novel Explanation-Guided Score Prediction Framework leveraging KokoroChat, a large-scale Japanese counseling dialogue dataset. The proposed framework fundamentally enhances the evaluation of LLM-based counseling systems by integrating quantitative score prediction with interpretable, structured explanations. These LLM-generated rationales (comprising a "reason" and a "reflection") serve as auxiliary supervision signals during the training process, effectively aligning the model's predictions with the logic of human evaluative reasoning. This approach encourages the model to learn semantically rich representations of counseling dialogues.
-
Issei OKUDA, Michimasa INABA
Article type: SIG paper
Pages
12-15
Published: October 27, 2025
Released on J-STAGE: October 27, 2025
CONFERENCE PROCEEDINGS
RESTRICTED ACCESS
Dialogue State Tracking (DST) plays a crucial role in task-oriented dialogue systems, but annotation errors within the training datasets can degrade model performance. As manual error correction is costly, this research proposes two frameworks for the automatic correction of DST datasets. The first approach uses an annotation error detection model to identify specific data points for targeted correction, while the second leverages a combination of an LLM and a DST model to perform corrections. An evaluation of the detection model, constructed by fine-tuning a pre-trained language model on the MultiWOZ dataset, revealed that its performance was insufficient. However, we discovered a tendency for the model to misclassify data with a large number of filled slots as errors.
-
Ryo FUKUDA, Takatomo KANO, Naohiro TAWARA, Marc DELCROIX, Atsunori OGA ...
Article type: SIG paper
Pages
16-21
Published: October 27, 2025
Released on J-STAGE: October 27, 2025
CONFERENCE PROCEEDINGS
RESTRICTED ACCESS
-
Alexi AYRTON, Tamon MIKAWA, Kengo OHTA, Ryota NISHIMURA, Norihide KITA ...
Article type: SIG paper
Pages
22-26
Published: October 27, 2025
Released on J-STAGE: October 27, 2025
CONFERENCE PROCEEDINGS
RESTRICTED ACCESS
-
Yusei SENGOKU, Takao OBI, Kotaro FUNAKOSHI
Article type: SIG paper
Pages
27-30
Published: October 27, 2025
Released on J-STAGE: October 27, 2025
CONFERENCE PROCEEDINGS
RESTRICTED ACCESS
-
Hyuga NAKAGURO, Koichiro YOSHINO
Article type: SIG paper
Pages
31-36
Published: October 27, 2025
Released on J-STAGE: October 27, 2025
CONFERENCE PROCEEDINGS
RESTRICTED ACCESS
When the same linguistic content carries different acoustic nuances, particularly in terms of expressed emotions, the corresponding dialogue system response must align with the given nuance. However, existing SLMs such as Qwen2-Audio are not necessarily robust against such differences. In this work, we define a task that detects the consistency or inconsistency between the emotional label of an utterance and the system's response, and build a model to perform this prediction. We hypothesize that emotion labels are a control signal that modulates text interpretation, and we construct a prediction model based on Feature-wise Linear Modulation (FiLM).
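As a rough illustration of the FiLM idea mentioned in this abstract, the sketch below scales and shifts each text feature with parameters chosen by an emotion label. The lookup table `EMOTION_FILM`, the feature values, and the function names are hypothetical; in a trained model the scale and shift would be produced by a learned network, and nothing here is taken from the authors' implementation.

```python
def film(features, gamma, beta):
    """Feature-wise linear modulation: gamma_i * x_i + beta_i for each feature."""
    return [g * x + b for x, g, b in zip(features, gamma, beta)]


# Hypothetical per-emotion (gamma, beta) pairs; a real model would learn these.
EMOTION_FILM = {
    "happy": ([1.0, 0.5, 2.0], [0.0, 0.1, -0.2]),
    "sad": ([0.5, 1.5, 1.0], [0.2, 0.0, 0.0]),
}


def modulate(text_features, emotion):
    """Condition the same text features on an emotion label."""
    gamma, beta = EMOTION_FILM[emotion]
    return film(text_features, gamma, beta)


# The same (toy) text representation is interpreted differently per emotion.
text_features = [1.0, 2.0, 3.0]
happy_out = modulate(text_features, "happy")
sad_out = modulate(text_features, "sad")
```

The point of the design is that the emotion label never replaces the text features; it only rescales and shifts them, so identical wording can yield different downstream predictions.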
-
Keisuke KAMEYAMA, Kazunori KOMATANI, Mikio NAKANO
Article type: SIG paper
Pages
37-42
Published: October 27, 2025
Released on J-STAGE: October 27, 2025
CONFERENCE PROCEEDINGS
RESTRICTED ACCESS
-
Shuya UCHIYAMA, Michimasa INABA
Article type: SIG paper
Pages
43-46
Published: October 27, 2025
Released on J-STAGE: October 27, 2025
CONFERENCE PROCEEDINGS
RESTRICTED ACCESS
In response to the rising demand for mental-health care and the concomitant shortage of counselors, research into AI counseling using Large Language Models (LLMs) has been progressing. However, AI risks giving inappropriate responses, and when a client confides a serious problem, it is necessary to replace the AI with a human counselor. In such cases, it is extremely important to summarize the dialogue history so that the succeeding counselor can quickly and accurately understand the context of the dialogue. Nevertheless, in counseling handover situations, the optimal summary format for facilitating a smooth AI-to-human transition and eliciting prompt, high-quality responses has not been sufficiently investigated. In this study, we propose a "dialogue format summary with final turn" as a summary format that naturally includes the final utterances in the dialogue history and the surrounding context. To evaluate the effectiveness of the proposed summary format, we conducted a response-generation experiment in which both LLMs and human counselors produced responses under multiple summarization formats. By measuring the quality of responses and the handoff time, we identify the summarization format that best facilitates a smooth handoff.
-
Ryoma SUZUKI, Zhiyang QI, Michimasa INABA
Article type: SIG paper
Pages
47-51
Published: October 27, 2025
Released on J-STAGE: October 27, 2025
CONFERENCE PROCEEDINGS
RESTRICTED ACCESS
High-quality, manually created counseling dialogue datasets for public research remain extremely limited. To address this resource gap, this study expands the available multilingual resources by translating KokoroChat, a large-scale Japanese counseling corpus, into both English and Chinese. However, translations generated by a single Large Language Model (LLM) often suffer from instability due to model-specific biases. To address this issue, we propose a new refinement method that integrates outputs from multiple LLMs. Specifically, our approach first generates translation hypotheses using three distinct LLMs for each target language. A single LLM then produces one high-quality translation by integrating the strengths and compensating for the shortcomings of the three hypotheses. Experimental results confirm that the translations produced by our proposed method are of higher quality than those from any single LLM. The new multilingual dataset constructed using this method, "Multilingual KokoroChat", will be made publicly available to support further research.
-
Mana FUKAMI, Michimasa INABA
Article type: SIG paper
Pages
52-54
Published: October 27, 2025
Released on J-STAGE: October 27, 2025
CONFERENCE PROCEEDINGS
RESTRICTED ACCESS
The mainstream approach for Large Language Models (LLMs) in role-playing specific individuals or characters is to generate responses based on predefined personas. However, human personality continuously evolves throughout life. This study investigates whether LLMs can generate and transform their own personality by being endowed with life experiences. As a method to structure these life experiences, we adopt the "Life Graph," which plots episodes and their corresponding happiness levels at various ages and connects them with a curve. Our analysis uses the Big Five personality traits and a values scale that measures whether people place more importance on moral values or personal benefits in everyday situations. Furthermore, we examine the effectiveness of various methods for presenting these life experiences and the Life Graph to the LLM.
-
Sousuke YONEYAMA, Hiroki MORI
Article type: SIG paper
Pages
55-57
Published: October 27, 2025
Released on J-STAGE: October 27, 2025
CONFERENCE PROCEEDINGS
RESTRICTED ACCESS
-
Takuma MIWA, Yusuke ODA, Hien OHNAKA, Seiya KAWANO, Koichiro YOSHINO
Article type: SIG paper
Pages
58-63
Published: October 27, 2025
Released on J-STAGE: October 27, 2025
CONFERENCE PROCEEDINGS
RESTRICTED ACCESS
Cascade implementations of multiple machine learning models allow individual modules to be trained independently. However, they face the challenge that some information other than the final hypothesis output by the preceding module becomes partially lost at that stage. To address this issue, N-best training methods are generally employed, but they suffer from increased training and inference costs depending on N. This research proposes a framework for speech dialogue state tracking that addresses this issue. By inputting a vector composed of the probability values for each hypothesis output by the ASR model into a quantum machine learning model and simultaneously processing multiple hypotheses, it suppresses the increase in training and inference costs associated with conventional N-best training methods. We applied the proposed method to the DSTC2 dataset for speech dialogue tracking tasks and confirmed that it enables a significant reduction in the number of parameters while maintaining accuracy.
-
Takao OBI, Sadahiro YOSHIKAWA, Mao SAEKI, Masaki EGUCHI, Yoichi MATSUY ...
Article type: SIG paper
Pages
64-69
Published: October 27, 2025
Released on J-STAGE: October 27, 2025
CONFERENCE PROCEEDINGS
RESTRICTED ACCESS
The development of large-scale spoken dialogue systems faces increasing manual costs for quality assurance and evaluation, making user emulators a promising approach. However, emulating interactional phenomena such as overlapping speech has been difficult with emulators using conventional spoken dialogue systems. Full-duplex spoken dialogue models, which enable simultaneous bidirectional dialogue, are promising foundations for user emulation. In this work, we develop the L2 Learner Emulator (L2LE) by fine-tuning a full-duplex spoken dialogue model on a large corpus of second-language (L2) learner interview dialogues, enabling proficiency-aware utterance generation. We further conduct interview dialogues between L2LE and InteLLA, a spoken dialogue system designed to elicit spontaneous learner speech, and analyze the resulting dialogues to assess the extent to which L2LE reproduces the dialogue features of real L2 learners.
-
Moe NAGAO, Koichiro TERAO, Saki SAWAI, Shingo AOYAMA, Naoto IWAHASHI
Article type: SIG paper
Pages
70-74
Published: October 27, 2025
Released on J-STAGE: October 27, 2025
CONFERENCE PROCEEDINGS
RESTRICTED ACCESS
-
Koichiro TERAO, Rikunari SAGARA, Naoto IWAHASHI
Article type: SIG paper
Pages
75-79
Published: October 27, 2025
Released on J-STAGE: October 27, 2025
CONFERENCE PROCEEDINGS
RESTRICTED ACCESS
-
Tatsuya YAMAGUCHI, Kazuhiro SHIDARA, Takahiro YOSHIOKA, Masaki ISHIHAR ...
Article type: SIG paper
Pages
80-85
Published: October 27, 2025
Released on J-STAGE: October 27, 2025
CONFERENCE PROCEEDINGS
RESTRICTED ACCESS
-
Mikio NAKANO, Kazunori KOMATANI
Article type: SIG paper
Pages
86-87
Published: October 27, 2025
Released on J-STAGE: October 27, 2025
CONFERENCE PROCEEDINGS
RESTRICTED ACCESS
-
Yuta UEHARA, Takayoshi TSUJITA
Article type: SIG paper
Pages
88
Published: October 27, 2025
Released on J-STAGE: October 27, 2025
CONFERENCE PROCEEDINGS
RESTRICTED ACCESS
-
Yuka KOBAYASHI, Yuya SHIRAKI, Yingying LAO, Tsuyoshi KUSHIMA, Shunsuke ...
Article type: SIG paper
Pages
89-90
Published: October 27, 2025
Released on J-STAGE: October 27, 2025
CONFERENCE PROCEEDINGS
RESTRICTED ACCESS
-
Shinya ISHIKAWA
Article type: SIG paper
Pages
91
Published: October 27, 2025
Released on J-STAGE: October 27, 2025
CONFERENCE PROCEEDINGS
RESTRICTED ACCESS
UGO Co., Ltd. develops the robot "ugo," which supports not only security and inspection tasks but also customer service and guidance activities. To date, we have equipped the robot with a multilingual dialogue system powered by large language models (LLMs), combining speech recognition, response generation, retrieval-augmented question answering, and conversation summarization to enhance service quality. Building upon this, in the current year we introduced a new function that integrates with digital signage to display supplementary information and product images. By presenting related content on signage according to user queries, the system aims to improve customer experience. Since April 2025, this signage-integrated system has been deployed in commercial facilities, public institutions, and international events such as Expo 2025 Osaka, conducting over 5000 dialogue sessions. In this presentation, we will discuss the challenges identified through real-world operation and explore future possibilities for application.
-
Jun SAKANO, Kango YOSHIOKA, Keishiro KATAOKA, Takahumi YAMAZOE, Yoko T ...
Article type: SIG paper
Pages
92-93
Published: October 27, 2025
Released on J-STAGE: October 27, 2025
CONFERENCE PROCEEDINGS
RESTRICTED ACCESS
-
Takamasa ARAKI
Article type: SIG paper
Pages
94
Published: October 27, 2025
Released on J-STAGE: October 27, 2025
CONFERENCE PROCEEDINGS
RESTRICTED ACCESS
-
Kensho OGURI, Ken SUGIMORI, Takashi MIKAMI
Article type: SIG paper
Pages
95-96
Published: October 27, 2025
Released on J-STAGE: October 27, 2025
CONFERENCE PROCEEDINGS
RESTRICTED ACCESS
-
Kouki MIYAZAWA, Yoshinao SATO
Article type: SIG paper
Pages
97-98
Published: October 27, 2025
Released on J-STAGE: October 27, 2025
CONFERENCE PROCEEDINGS
RESTRICTED ACCESS
We introduce a machine learning model that predicts when a spoken dialogue system is allowed to take a turn, i.e., transition relevance points (TRPs), to enable machines to smoothly communicate with humans. Most conventional spoken dialogue systems decide when to start speaking based on the elapsed time after the user's speech. However, such a system often interrupts user speech at inappropriate times or starts speaking too late. Our model predicts TRP scores using acoustic features of user speech at regular time intervals. The effectiveness of using the proposed model in determining the timing of system speech is demonstrated.
-
Seijiro MATSUBARA, Hiyori TAKEBE
Article type: SIG paper
Pages
99-102
Published: October 27, 2025
Released on J-STAGE: October 27, 2025
CONFERENCE PROCEEDINGS
RESTRICTED ACCESS
-
Ryosuke ITO, Tetsuya TAKIGUCHI, Mitsuhiro HIRATA, Yumiko MORI, Satoko ...
Article type: SIG paper
Pages
103-108
Published: October 27, 2025
Released on J-STAGE: October 27, 2025
CONFERENCE PROCEEDINGS
RESTRICTED ACCESS
-
Kenta YAMAMOTO, Yuki HORIGUCHI, Kazunori KOMATANI
Article type: SIG paper
Pages
109-114
Published: October 27, 2025
Released on J-STAGE: October 27, 2025
CONFERENCE PROCEEDINGS
RESTRICTED ACCESS
-
Ekai HASHIMOTO, Kohei NAGIRA, Takeshi MIZUMOTO, Shun SHIRAMATSU
Article type: SIG paper
Pages
115-119
Published: October 27, 2025
Released on J-STAGE: October 27, 2025
CONFERENCE PROCEEDINGS
RESTRICTED ACCESS
-
Masaki YAMADA, Hiroki MORI
Article type: SIG paper
Pages
120-124
Published: October 27, 2025
Released on J-STAGE: October 27, 2025
CONFERENCE PROCEEDINGS
RESTRICTED ACCESS
-
Fuminori NAGASAWA, Ekai HASHIMOTO, Shun SHIRAMATSU
Article type: SIG paper
Pages
125-129
Published: October 27, 2025
Released on J-STAGE: October 27, 2025
CONFERENCE PROCEEDINGS
RESTRICTED ACCESS
To elicit user requirements and issues through dialogue and provide appropriate functions and support, it is crucial not only to encourage self-disclosure to uncover users' true intentions but also to accurately gather the information necessary for the system to deliver its services. To realize such dialogue, we propose an interview dialogue system that manages topics using a question tree. In the question tree, questions are arranged on a tree-structured graph based on topic parent-child relationships, and the system manages the deepening/changing of question topics through graph traversal. This study investigates a method for automatically generating such a question tree to reach necessary questions several steps ahead in topic development, serving as a technique that balances self-disclosure and information gathering. We evaluated the effectiveness of the proposed method by prototyping a chatbot incorporating an LLM-based question tree generation mechanism and a question generation mechanism.
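The topic management described in this abstract can be pictured with a small sketch: questions sit on a tree, "deepening" moves to a child question, and "changing" moves to the next sibling or back up toward the root. The class names, the example questions, and the exact traversal policy below are assumptions for illustration only; the abstract does not specify them, and the LLM-based tree generation is not modeled here.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # compare nodes by identity, not by content
class QuestionNode:
    question: str
    children: list = field(default_factory=list)


class TopicManager:
    """Tracks the path from the root question to the current question."""

    def __init__(self, root):
        self.stack = [root]

    @property
    def current(self):
        return self.stack[-1]

    def deepen(self):
        """Move to the first child; fall back to changing topic at a leaf."""
        if self.current.children:
            self.stack.append(self.current.children[0])
            return self.current
        return self.change_topic()

    def change_topic(self):
        """Move to the next sibling, climbing toward the root if none is left."""
        while len(self.stack) > 1:
            node = self.stack.pop()
            siblings = self.stack[-1].children
            idx = siblings.index(node)
            if idx + 1 < len(siblings):
                self.stack.append(siblings[idx + 1])
                return self.current
        return None  # every topic has been visited or skipped


# Hypothetical question tree for an interview about daily life.
root = QuestionNode("What would you like to talk about?", [
    QuestionNode("Tell me about work.", [
        QuestionNode("How is your workload?"),
        QuestionNode("How do you get along with colleagues?"),
    ]),
    QuestionNode("How is your health?"),
])

manager = TopicManager(root)
asked = [
    manager.deepen().question,        # root -> work
    manager.deepen().question,        # work -> workload
    manager.change_topic().question,  # workload -> colleagues
    manager.change_topic().question,  # no sibling left, climb up -> health
]
```

Keeping the path as a stack makes both moves cheap: deepening pushes a child, and changing topic pops until a sibling is available.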
-
Yusuke IKEMI, Muhammadyeza BAIHAQI, Canasai KRUENGKRAI, Yutaka NAKAMUR ...
Article type: SIG paper
Pages
130-131
Published: October 27, 2025
Released on J-STAGE: October 27, 2025
CONFERENCE PROCEEDINGS
RESTRICTED ACCESS
In communication, gestures play an important role in aiding the understanding of speech and conveying social information. In particular, human gestures are consistent with individual personality traits, such as extraversion and introversion, and gestures that consider these personality traits are necessary to send appropriate social signals. Therefore, this research proposes a method for generating co-speech gestures that reflect a robot's personality traits (e.g., extraversion/introversion) by utilizing a large language model (LLM).
-
Hinako KIZAWA, Yoshiko ARIMOTO, Kazuo OKANOYA
Article type: SIG paper
Pages
132-136
Published: October 27, 2025
Released on J-STAGE: October 27, 2025
CONFERENCE PROCEEDINGS
RESTRICTED ACCESS
-
Shoki SUZUKI, Hiroki MORI
Article type: SIG paper
Pages
137-140
Published: October 27, 2025
Released on J-STAGE: October 27, 2025
CONFERENCE PROCEEDINGS
RESTRICTED ACCESS
-
Issei SUZUKI, Michimasa INABA
Article type: SIG paper
Pages
141-143
Published: October 27, 2025
Released on J-STAGE: October 27, 2025
CONFERENCE PROCEEDINGS
RESTRICTED ACCESS
Recent advances in Large Language Models (LLMs) have made it possible to generate natural and diverse commentary in board games such as chess and shogi. However, existing commentary systems for these games often produce mechanically phrased explanations, lacking emotional richness and the sense of companionship that arises when playing with a friend. In this study, we focus on shogi and aim to develop a shogi dialogue system that enables users to engage in more natural, human-like interactions while playing. The proposed system is designed to provide a graphical interface, extract multifaceted features from game states (such as SFEN representation, legal moves, engine evaluations with depth-dependent variations, reading lines, and piece influence), and combine them with a commentary dataset constructed from game records for generating commentary responses. By fine-tuning LLMs and designing prompts that incorporate uncertainty, surprise, and emotional expressions, the system seeks to generate responses that are not only analytical but also emotionally engaging. We evaluate whether such responses enhance entertainment value and user engagement compared to conventional commentary systems.
-
Tetsuro TAKAHASHI, Hirofumi KIKUCHI, Jie YANG, Hiroyuki NISHIKAWA, Mas ...
Article type: SIG paper
Pages
144-149
Published: October 27, 2025
Released on J-STAGE: October 27, 2025
CONFERENCE PROCEEDINGS
RESTRICTED ACCESS
In human-human conversation, interpersonal consideration for the interlocutor is essential, and similar expectations are increasingly placed on dialogue systems. This study examines the behavior of dialogue systems in a specific interpersonal scenario where a user vents frustrations and seeks emotional support from a long-time friend represented by a dialogue system. We conducted a human evaluation and qualitative analysis of 15 dialogue systems under this setting. These systems implemented diverse strategies, such as structuring dialogue into distinct phases, modeling interpersonal relationships, and incorporating cognitive behavioral therapy techniques. Our analysis reveals that these approaches contributed to improved perceived empathy, coherence, and appropriateness, highlighting the importance of design choices in socially sensitive dialogue.
-
Chenyu HU, Takuto ASAKURA, Koichiro YOSHINO
Article type: SIG paper
Pages
150-151
Published: October 27, 2025
Released on J-STAGE: October 27, 2025
CONFERENCE PROCEEDINGS
RESTRICTED ACCESS
In collaborative problem-solving, particularly in technical domains like mathematics, discussions often combine spoken dialogue with a shared visual space, such as a whiteboard. A critical challenge for comprehending these interactions is resolving the reference between ambiguous expressions in dialogue (e.g., pronouns) and the specific symbols or equations written on the board. To address this, and drawing inspiration from research in Visually-Grounded Dialogue, we propose a new annotation schema for capturing the discourse structure of these multimodal discussions by explicitly linking dialogue utterances to their corresponding elements on the whiteboard.
-
Kai YOSHIDA, Koichiro YOSHINO
Article type: SIG paper
Pages
152-157
Published: October 27, 2025
Released on J-STAGE: October 27, 2025
CONFERENCE PROCEEDINGS
RESTRICTED ACCESS
In target-guided conversation, it is crucial to improve the user experience by leading the conversation toward the system's own goal without making the user feel guided and without making them aware of the system's objective. In this study, we propose SBIS-TGC (Surprisal-Based Induction Score for Target-Guided Conversation), an automatic evaluation metric designed to assess the degree of induction in system utterances, with the objective of achieving conversation goals without the user noticing either the system's goal or its guiding behavior. SBIS-TGC quantifies the sense of induction by calculating surprisal between utterances using an external language model. Through dialogue experiments employing a system that selects utterances based on SBIS-TGC, we demonstrate that the proposed method can reduce the perceived induction in target-guided dialogue and enable conversations where users remain unaware of the system's intended target.
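To make the notion of surprisal between utterances concrete, the sketch below scores candidate system utterances by their mean per-token surprisal, -log2 P(token | previous token), given the last word of the preceding utterance. It is not the SBIS-TGC definition: the add-one smoothed bigram model stands in for the external language model, the toy corpus is invented, and choosing the lowest-surprisal candidate is an assumption about how such a score might be used.

```python
import math
from collections import Counter


def train_bigram(corpus):
    """Count unigrams and bigrams over whitespace-tokenized sentences."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in corpus:
        tokens = ["<s>"] + sentence.split()
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def surprisal(utterance, context, unigrams, bigrams):
    """Mean per-token surprisal (bits) of `utterance` following `context`."""
    vocab_size = len(unigrams)
    context_tokens = context.split()
    prev = context_tokens[-1] if context_tokens else "<s>"
    tokens = utterance.split()
    total = 0.0
    for tok in tokens:
        # Add-one smoothed bigram estimate (a rough stand-in for a real LM).
        p = (bigrams[(prev, tok)] + 1) / (unigrams[prev] + vocab_size)
        total += -math.log2(p)
        prev = tok
    return total / len(tokens)


# Invented toy corpus standing in for the external language model's knowledge.
corpus = ["i like ramen", "ramen is tasty", "i like hiking", "hiking is fun"]
unigrams, bigrams = train_bigram(corpus)

context = "i like ramen"
candidates = ["is tasty", "fun like"]
scores = {c: surprisal(c, context, unigrams, bigrams) for c in candidates}
best = min(scores, key=scores.get)  # assumed policy: least surprising candidate
```

In this toy setting the continuation that the corpus supports receives the lower score, which matches the intuition that an abrupt topic jump is more noticeable to the user.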
-
Yota KIMURA, Shun SHIRAMATSU, Fuminori NAGASAWA
Article type: SIG paper
Pages
158-162
Published: October 27, 2025
Released on J-STAGE: October 27, 2025
CONFERENCE PROCEEDINGS
RESTRICTED ACCESS
When humans and robots build relationships through dialogue, the way user privacy is managed becomes an important issue. In particular, whether a robot can appropriately decide what user information to share with others is closely related to rapport formation. To examine this, we designed conversational robots with different degrees of disclosure of others' private information and investigated their impact on user experience. In this study, we developed two types of robots: an "intrusive type" that pays no attention to privacy and readily shares information, and a "considerate type" that withholds deeply personal topics. Using these two robots, we compare their effects on users' trust in the robot and willingness to self-disclose, in order to clarify the relationship between rapport formation and the handling of privacy.
-
Shiki SATO, Shinji IWATA, Asahi HENTONA, Yuta SASAKI, Takato YAMAZAKI, ...
Article type: SIG paper
Pages
163-168
Published: October 27, 2025
Released on J-STAGE: October 27, 2025
CONFERENCE PROCEEDINGS
RESTRICTED ACCESS
-
Keiko OCHI, Divesh LALA, Koji INOUE, Tatsuya KAWAHARA, Hirokazu KUMAZA ...
Article type: SIG paper
Pages
169-173
Published: October 27, 2025
Released on J-STAGE: October 27, 2025
CONFERENCE PROCEEDINGS
RESTRICTED ACCESS
-
Koji INOUE, Mikey ELMERS, Yahui FU, Zihaur PANG, Taiga MORI, Divesh LA ...
Article type: SIG paper
Pages
174-179
Published: October 27, 2025
Released on J-STAGE: October 27, 2025
CONFERENCE PROCEEDINGS
RESTRICTED ACCESS