Abstract
Most conventional methods for extracting IS-A relationships between nouns from text rely on specific expression patterns such as ``A such as B''. However, such expressions cover only a small subset of the nouns contained in a corpus. Based on this observation, this paper investigates a method for identifying IS-A relationships that does not depend on particular expression patterns. We first define an ``IS-A relationship'' as a subset relation between the instance sets denoted by two distinct nouns. This definition rests on the assumption that a hyponym appearing in a text can most likely be replaced by its hypernym, whereas the reverse substitution does not generally hold. On this basis, we propose a new method that detects possible hypernym-hyponym pairs by examining the substitutability of the two nouns, using their co-occurring verbs and their dependency/case structures. We then evaluate the effectiveness of the proposed method experimentally. In the experiments, 47 target words were first selected from the Word List by Semantic Principles. Next, using 11 years of newspaper articles as input text, a list of candidate hyponyms was generated for each selected word with the proposed method. A conventional pattern-based method was also applied to the same newspaper articles to obtain candidate hypernym-hyponym pairs. The candidate lists were then examined by a human reviewer. The experiments show that 94.3% of the pairs obtained by our method were not covered by the conventional method. In addition, the average accuracy over the top 200 ranked candidates of the proposed method was about 36%, comparable to the 31% achieved by the existing method.
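
The substitutability idea can be illustrated with a minimal sketch. It is not the scoring used in this paper: the abstract does not specify a formula, so the score below (the fraction of a candidate hyponym's contexts in which the target noun is also attested) is an assumption made purely for illustration. The function names, the (noun, case, verb) triple format, the case labels, and the toy data are likewise hypothetical.

```python
from collections import defaultdict


def build_contexts(triples):
    """Map each noun to the set of (case, verb) contexts it occurs in.

    `triples` is an iterable of (noun, case, verb) tuples, assumed to come
    from a dependency/case analysis of the corpus (hypothetical format).
    """
    contexts = defaultdict(set)
    for noun, case, verb in triples:
        contexts[noun].add((case, verb))
    return contexts


def substitutability(hypo, hyper, contexts):
    """Fraction of hypo's contexts in which hyper is also attested.

    Assumed illustrative score: it is asymmetric, so a high value for
    (hypo, hyper) together with a low value for (hyper, hypo) suggests
    that hyper can replace hypo but not the other way around.
    """
    hypo_ctx = contexts.get(hypo, set())
    if not hypo_ctx:
        return 0.0
    shared = hypo_ctx & contexts.get(hyper, set())
    return len(shared) / len(hypo_ctx)


def rank_hyponym_candidates(target, contexts, top_k=200):
    """Rank all other nouns as candidate hyponyms of `target`."""
    scored = [
        (noun, substitutability(noun, target, contexts))
        for noun in contexts
        if noun != target
    ]
    # Highest score first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_k]


# Toy data, invented for illustration only.
TRIPLES = [
    ("dog", "subj", "bark"),
    ("dog", "obj", "feed"),
    ("animal", "subj", "bark"),
    ("animal", "obj", "feed"),
    ("animal", "subj", "migrate"),
    ("animal", "obj", "protect"),
]
CONTEXTS = build_contexts(TRIPLES)
```

On this toy data, every context of ``dog'' is also a context of ``animal'', while only half of the contexts of ``animal'' are shared with ``dog'', so the score is asymmetric in the direction the substitutability assumption predicts. The default `top_k=200` simply mirrors the top-200 cutoff used in the evaluation.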