Acquiring Sentence Vectors for Evaluating Speech Style Similarity through Contrastive Learning

銭本 友樹; 古俣 槙山; 宇津呂 武仁

doi:10.11517/pjsai.JSAI2023.0_4A2GS605

Abstract

Because dialogue systems are required to keep their speech style consistent, evaluating the similarity of speech styles is an important task. However, Japanese has a wide variety of speech styles, and the vocabulary and word usage characteristics of each style are vast, which makes speech style difficult to evaluate. We therefore propose a speech style embedding model that generates style-sensitive vectors. The model is constructed by fine-tuning a pre-trained BERT model with contrastive learning. The sentence pairs with similar and different speech styles that contrastive learning requires are collected automatically and at large scale from sequences of sentences in web novels. We also analyze how speech styles group together, and the characteristic vocabulary and word usage of each style, using Ward's hierarchical clustering method. Finally, we focus on how the same person's speech style varies with the situation, and analyze the variation in the style-sensitive vectors of the same character in a novel.
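
To illustrate the kind of objective contrastive learning optimizes, the following is a minimal sketch of a pairwise margin-based contrastive loss on toy two-dimensional vectors. The abstract does not state which loss the authors use, so this particular formulation, the function names `euclidean` and `contrastive_loss`, the `margin` value, and the toy vectors are all assumptions made for illustration. In the actual model the vectors would come from the fine-tuned BERT encoder.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def contrastive_loss(u, v, same_style, margin=1.0):
    """Pairwise margin-based contrastive loss (an assumed formulation).

    Pairs with the same speech style are penalized by their squared distance.
    Pairs with different styles are penalized only when they are closer
    than `margin`.
    """
    d = euclidean(u, v)
    if same_style:
        return d ** 2
    return max(0.0, margin - d) ** 2


# Hypothetical style vectors, not taken from the paper.
polite_a = [0.9, 0.1]
polite_b = [0.8, 0.2]
casual = [0.1, 0.9]

loss_similar = contrastive_loss(polite_a, polite_b, same_style=True)
loss_different = contrastive_loss(polite_a, casual, same_style=False)
loss_too_close = contrastive_loss([0.0, 0.0], [0.6, 0.0], same_style=False)
```

Under this formulation, the similar pair contributes a small loss, the well-separated dissimilar pair contributes none, and a dissimilar pair that falls inside the margin is pushed apart.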
