Automatic Acquisition of Hyponymy Relations between Words from HTML Documents

新里 圭司; 鳥澤 健太郎

doi:10.5715/jnlp.12.125

Abstract

This paper describes a method for automatically acquiring hyponymy relations. Hyponymy relations play a crucial role in various natural language processing systems, and there have been many attempts to acquire them automatically from large-scale corpora. Most existing acquisition methods rely on particular linguistic patterns, such as juxtapositions, that signal hyponymy relations. Our method, however, does not use such linguistic patterns. Instead, we acquire hyponymy relations from four different types of clues. The first is repetitions of HTML tags found in ordinary HTML documents on the WWW. The second is statistical measures such as df and idf, which are widely used in the information retrieval (IR) literature. The third is verb-noun co-occurrences found in ordinary corpora. The fourth is a set of heuristic rules obtained through our experiments on a development set.
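The abstract names the first two clues without giving details, so the following Python sketch only illustrates the general ingredients and is not the authors' implementation. It assumes that "repetitions of HTML tags" can be approximated by the sibling `<li>` items of a list, and that df and idf take their textbook definitions over a collection of tokenized documents. The names `ListItemCollector`, `extract_item_groups`, `document_frequency`, and `inverse_document_frequency` are hypothetical.

```python
import math
from html.parser import HTMLParser


class ListItemCollector(HTMLParser):
    """Collect <li> texts, grouped by their enclosing <ul>/<ol> (assumed clue)."""

    def __init__(self):
        super().__init__()
        self.groups = []       # finished groups with at least two items
        self._stack = []       # one open group per currently open list
        self._in_item = False  # True while reading the text of an <li>
        self._buf = []         # text fragments of the current item

    def handle_starttag(self, tag, attrs):
        if tag in ("ul", "ol"):
            self._flush()      # close any item text before a nested list
            self._stack.append([])
        elif tag == "li" and self._stack:
            self._flush()      # handles <li> items without closing tags
            self._in_item = True

    def handle_endtag(self, tag):
        if tag == "li":
            self._flush()
        elif tag in ("ul", "ol") and self._stack:
            self._flush()
            group = self._stack.pop()
            if len(group) >= 2:  # a single item is not a repetition
                self.groups.append(group)

    def handle_data(self, data):
        if self._in_item:
            self._buf.append(data)

    def _flush(self):
        text = "".join(self._buf).strip()
        if self._in_item and text and self._stack:
            self._stack[-1].append(text)
        self._buf = []
        self._in_item = False


def extract_item_groups(html):
    """Return lists of expressions that share the same repeated tag structure."""
    parser = ListItemCollector()
    parser.feed(html)
    parser.close()
    return parser.groups


def document_frequency(term, documents):
    """df: the number of documents (sets of tokens) that contain the term."""
    return sum(1 for doc in documents if term in doc)


def inverse_document_frequency(term, documents):
    """idf = log(N / df); returning 0.0 for unseen terms is an assumption."""
    df = document_frequency(term, documents)
    return math.log(len(documents) / df) if df else 0.0
```

Under these assumptions, each group returned by `extract_item_groups` is a set of expressions that may share a common hypernym, and df and idf could then be used to score candidate hypernyms for that group. How the four clues are actually combined is not stated in the abstract.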
