2019 Volume 31 Issue 5 Pages 797-807
Identifying the antecedent in an anaphoric relationship is considered a fundamental technique for achieving high accuracy in natural language processing applications such as robot dialogue and question answering. We developed AnasysD, a more accurate anaphora resolution system that resolves demonstrative pronouns on the basis of word-meaning similarity computed with the semantic analysis system SAGE. To quantify antecedent likelihood, we defined 12 kinds of features, including the co-occurrence similarity between the antecedent clause and the receiver clause of the anaphor, as well as likelihoods over two-dimensional features such as the upper-concept class of the anaphor paired with the deep case of the antecedent. Using the NAIST Text Corpus, we learn the probability distribution of the correct-antecedent rate with the naive Bayes method and select the clause with the highest likelihood as the antecedent. In an evaluation by 5-fold cross-validation, the system achieved an accuracy of 63.42%.
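The selection step described above can be illustrated with a minimal sketch. It is not the AnasysD implementation: the class name, the feature names (`cooc_sim`, `concept_x_case`), the discretized feature values, the Laplace smoothing, and the ranking by log-odds are all assumptions made for illustration. A two-dimensional feature is represented here as a tuple of the anaphor's upper-concept class and the candidate's deep case.

```python
import math
from collections import defaultdict


class AntecedentNaiveBayes:
    """Toy naive Bayes ranker over discrete features (hypothetical sketch)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing constant (assumed, not from the paper)
        self.class_counts = {True: 0, False: 0}
        # counts[label][feature_name][value] -> number of training examples
        self.counts = {
            True: defaultdict(lambda: defaultdict(int)),
            False: defaultdict(lambda: defaultdict(int)),
        }
        self.values = defaultdict(set)  # values observed for each feature

    def fit(self, examples):
        """examples: iterable of (feature_dict, is_correct_antecedent)."""
        for features, label in examples:
            self.class_counts[label] += 1
            for name, value in features.items():
                self.counts[label][name][value] += 1
                self.values[name].add(value)

    def log_score(self, features, label):
        """Smoothed log P(label) + sum of log P(feature value | label)."""
        total = sum(self.class_counts.values())
        logp = math.log(
            (self.class_counts[label] + self.alpha) / (total + 2 * self.alpha)
        )
        for name, value in features.items():
            n_values = len(self.values[name] | {value})
            numerator = self.counts[label][name][value] + self.alpha
            denominator = self.class_counts[label] + self.alpha * n_values
            logp += math.log(numerator / denominator)
        return logp

    def best_antecedent(self, candidates):
        """candidates: list of (clause_id, feature_dict); returns the top-ranked pair."""
        return max(
            candidates,
            key=lambda c: self.log_score(c[1], True) - self.log_score(c[1], False),
        )


# Hypothetical training data: one feature dict per candidate clause.
training = [
    ({"cooc_sim": "high", "concept_x_case": ("thing", "object")}, True),
    ({"cooc_sim": "low", "concept_x_case": ("thing", "agent")}, False),
    ({"cooc_sim": "high", "concept_x_case": ("thing", "object")}, True),
    ({"cooc_sim": "low", "concept_x_case": ("thing", "time")}, False),
]

model = AntecedentNaiveBayes()
model.fit(training)

candidates = [
    ("clause_a", {"cooc_sim": "low", "concept_x_case": ("thing", "agent")}),
    ("clause_b", {"cooc_sim": "high", "concept_x_case": ("thing", "object")}),
]
best = model.best_antecedent(candidates)
print(best[0])
```

In this sketch each candidate clause is scored independently and the highest-scoring one is returned, mirroring the "highest likelihood" decision in the abstract. A real system would derive the feature values from a corpus and a semantic analyzer rather than from hand-written dictionaries.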