2025 Volume 32 Issue 4 Pages 1241-1271
Personalization has garnered attention in the field of dialogue response generation, with the aim of creating responses tailored to individual user preferences by leveraging background information and dialogue history. Previous studies suggest that, in addition to adapting the content of a response, a system that adopts a speaking style similar to the user's can increase user affinity. Because personalization adapts to subjective preferences, discussions of stylistic similarity should be based on evaluations by the users themselves. However, many evaluations of stylistic similarity rely on objective assessments by third parties who are not participants in the dialogue, and the difference between such objective evaluations and subjective evaluations based on user perception has not been sufficiently examined. In this study, we focused on non-task-oriented dialogue settings and constructed a new dataset in both English and Japanese, annotated with manual evaluations of subjective and objective stylistic similarity, along with user dialogue preferences. Our analysis revealed that stylistic similarity as perceived by the user exhibited a high positive correlation with dialogue preference, whereas objective stylistic similarity showed no clear correlation with dialogue preference. This study provides empirical evidence that style assessments for personalization need to distinguish who performs the evaluation: the user or a third party.