Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
General Paper
A New Method to Find Zero Pronouns Referring to Entry Words and Estimate their Surface Cases in the Context of a World History Glossary and its Descriptions
Kosuke Oya, Kotaro Sakamoto, Hideyuki Shibuki, Tatsunori Mori
JOURNAL FREE ACCESS

2020 Volume 27 Issue 1 Pages 31-63

Abstract

In this paper, we focus on using a world history glossary as one of the knowledge sources for automatically generating answers to essay-type questions. The questions were derived from the University of Tokyo’s entrance examinations on world history, and the answers are generated with a multi-document summarization methodology. In the automated answer generation, the glossary’s descriptions are used as part of the answers. However, entry words are often omitted from the descriptions. To make complete sentences from entry words and their descriptions, we propose a method that finds zero pronouns referring to entry words inside the descriptions and estimates their surface cases. This task differs from conventional zero anaphora resolution in two ways. First, the entry word is the only candidate for the antecedent, so there is no need to select one antecedent among several candidates. Second, context information about the antecedent, which may be a useful clue in anaphora resolution, does not exist for entry words, because an entry word appears alone and is not embedded in a sentence. Evaluation results based on a world history glossary showed that the proposed method was more effective than an existing method that uses zero anaphora analysis with the Kurohashi-Nagao Parser. Furthermore, because we observed low accuracy when the entry word filled a low-frequency surface case, we attempted to generate pseudo training data from ordinary sentences in a textbook. The results demonstrated that introducing these data improves the estimation of the “o”-case and “ni”-case in terms of F-measure.
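The abstract does not describe how the pseudo training data are constructed. As a rough illustration only, one plausible construction is to take an ordinary sentence, remove a noun together with its case particle, and keep the removed particle as the surface-case label, so that the noun plays the role of an omitted entry word. The sketch below assumes pre-tokenized input and a known set of nouns; the names `PseudoExample`, `make_pseudo_examples`, and `CASE_PARTICLES` are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Set

# Assumption: only three surface cases are considered in this sketch
# (romanized case name -> Japanese case particle).
CASE_PARTICLES = {"ga": "が", "o": "を", "ni": "に"}


@dataclass
class PseudoExample:
    entry_word: str    # noun treated as if it were a glossary entry word
    description: str   # sentence with the noun and its particle removed
    surface_case: str  # label: the case the removed noun carried


def make_pseudo_examples(tokens: List[str], nouns: Set[str]) -> List[PseudoExample]:
    """Build one pseudo example per (noun, case particle) pair in a tokenized sentence."""
    particle_to_case = {particle: case for case, particle in CASE_PARTICLES.items()}
    examples = []
    for i in range(len(tokens) - 1):
        if tokens[i] in nouns and tokens[i + 1] in particle_to_case:
            # Drop the noun and its particle to imitate a description
            # in which the entry word has been omitted.
            remaining = tokens[:i] + tokens[i + 2:]
            examples.append(
                PseudoExample(
                    entry_word=tokens[i],
                    description="".join(remaining),
                    surface_case=particle_to_case[tokens[i + 1]],
                )
            )
    return examples


# Hypothetical tokenized textbook sentence: "Napoleon campaigned against Russia."
sentence = ["ナポレオン", "が", "ロシア", "に", "遠征", "し", "た"]
for example in make_pseudo_examples(sentence, {"ナポレオン", "ロシア"}):
    print(example)
```

In practice the tokens and noun tags would come from a morphological analyzer, and a sampling step could favor rarer cases such as “o” and “ni”; both choices are left out of this sketch.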

© 2020 The Association for Natural Language Processing