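Organizer: The Japanese Society for Artificial Intelligence
Conference: The 72nd Meeting of the Special Interest Group on Spoken Language Understanding and Dialogue Processing
Meeting number: 72
Venue: Tokyo Institute of Technology, Suzukakedai Campus, Medium Conference Room and Building G3 1F Entrance
Dates: 2014/12/15 - 2014/12/16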
p. 06-
Evaluation measures for chat-oriented dialogue systems are needed in order to improve such systems effectively. Some studies have evaluated systems with several arbitrarily defined measures; however, whether those measures are appropriate has not been examined. We analyze evaluation measures for chat-oriented dialogue systems using the semantic differential method. Our analysis shows that, for each evaluator, the evaluation measures cluster into four factors: two factors common to all evaluators, one factor that is similar across evaluators, and one evaluator-specific factor. We also develop an automatic evaluation system that estimates each evaluation measure defined in the semantic differential. Our experiment shows that the developed system estimates most of the scores with correlation coefficients similar to those observed between human evaluators.
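The comparison in the last sentence can be illustrated with a minimal sketch. It assumes the Pearson correlation coefficient, which the abstract does not name, and uses made-up ratings on a 7-point scale; the function name `pearson` and all of the scores are hypothetical and are not taken from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical scores for one evaluation measure over six dialogues.
# These numbers are illustrative only, not data from the paper.
human_a = [5, 3, 6, 2, 4, 7]
human_b = [4, 3, 6, 1, 5, 6]
system = [5, 2, 5, 2, 4, 6]

r_human = pearson(human_a, human_b)   # agreement between two human evaluators
r_system = pearson(system, human_a)   # agreement between system and a human

print(f"human vs. human:  r = {r_human:.3f}")
print(f"system vs. human: r = {r_system:.3f}")
```

Under this reading, the system is doing about as well as a second human evaluator on a given measure when `r_system` is close to `r_human`.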