Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
General Paper (Peer-Reviewed)
Dialogue Evaluation Is Affected by How Raters Perceive Stylistic Similarity
Ikumi Numaya, Shoji Moriya, Shiki Sato, Reina Akama, Jun Suzuki
JOURNAL FREE ACCESS

2025 Volume 32 Issue 4 Pages 1241-1271

Abstract

Personalization has garnered attention in dialogue response generation, where the aim is to tailor responses to individual user preferences by leveraging background information and dialogue history. Previous studies suggest that, beyond adapting the content of a response, a system that speaks in a style similar to the user's can increase user affinity. Because personalization adapts to subjective preferences, discussions of stylistic similarity should be grounded in evaluations by the users themselves. However, many evaluations of stylistic similarity rely on objective assessments by third parties who did not participate in the dialogue, and the difference between these objective assessments and subjective evaluations based on user perception has not been sufficiently examined. In this study, we focus on non-task-oriented dialogue and construct a new dataset in English and Japanese, annotated with human judgments of subjective and objective stylistic similarity along with users' dialogue preferences. Our analysis shows that stylistic similarity as perceived by the user correlates strongly and positively with dialogue preference, whereas objective stylistic similarity shows no clear correlation with it. This study provides empirical evidence that style assessments for personalization must distinguish who performs the evaluation: the user or a third party.

© 2025 The Association for Natural Language Processing