In this paper, we developed a reliable set of utterance intention tags and constructed an interview dialogue corpus annotated with these tags. The tagging scheme consists of two hierarchical levels. The first level includes eight broad categories, such as “question,” “answer,” “backchannel,” and “impression.” The second level provides a more fine-grained classification with fourteen tags, including “question-preparation,” “question-opinion,” and “answer-self-disclosure (opinion).” This hierarchical structure was designed to support analysis of the interviewer’s technique and to allow the findings to be applied to an interview dialogue system. Each dialogue in the corpus was annotated by three to five annotators using the designed tags. Inter-annotator agreement was found to be high, which indicates that utterances can be tagged reliably with this scheme. We also investigated cases of annotator disagreement, examining the most common cases in which annotators assigned different tags to interviewer and guest utterances. Our analysis revealed that certain interviewer utterances were deliberately ambiguous, which made it difficult for annotators to assign definitive tags.