Hyponymy Relation Acquisition from Hierarchical Layouts in Wikipedia
(Japanese title: Wikipedia の記事構造からの上位下位関係抽出)

Asuka Sumida, Naoki Yoshinaga, Kentaro Torisawa

doi:10.5715/jnlp.16.3_3


Keywords: hyponymy (is-a) relation, Wikipedia, SVM, semi-structured information, hierarchical layouts


2009 Volume 16 Issue 3 Pages 3_3-3_24



Abstract

This paper describes a method for extracting a large set of hyponymy relations with high precision from hierarchical layouts in Wikipedia articles. Hyponymy relations have been studied as a principal knowledge source for information retrieval and web directories, which help users access the growing web, and various methods have been proposed to acquire them automatically. In this paper, we first extract hyponymy relation candidates from the sections and itemizations in the hierarchical layouts of Wikipedia articles, and then filter out irrelevant candidates using a machine learning technique. In experiments, we extracted more than 1.35 million relations from the hierarchical layouts in the Japanese version of Wikipedia, with a precision of 90%.
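The candidate extraction step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes MediaWiki-style markup (`= Heading =` lines and `*` bullets), and it assumes that every ancestor/descendant pair in the layout counts as a candidate. The names `parse_layout` and `candidate_pairs` are hypothetical. The machine learning filter (an SVM, according to the keywords) is not shown.

```python
import re

# Assumed markup: "= Title =" headings and "*" bullets, as in MediaWiki.
HEADING = re.compile(r"^(=+)\s*(.+?)\s*=+\s*$")
BULLET = re.compile(r"^(\*+)\s*(.+?)\s*$")


def parse_layout(lines):
    """Yield (depth, text) for every heading or bullet line."""
    heading_depth = 0
    for line in lines:
        line = line.strip()
        m = HEADING.match(line)
        if m:
            heading_depth = len(m.group(1))
            yield heading_depth, m.group(2)
            continue
        m = BULLET.match(line)
        if m:
            # Bullets nest below the most recent heading.
            yield heading_depth + len(m.group(1)), m.group(2)


def candidate_pairs(lines):
    """Return (hypernym candidate, hyponym candidate) pairs.

    Assumption: each item is paired with all of its ancestors in the layout.
    """
    stack = []  # (depth, text) of the current chain of ancestors
    pairs = []
    for depth, text in parse_layout(lines):
        # Drop items that are not ancestors of the current item.
        while stack and stack[-1][0] >= depth:
            stack.pop()
        pairs.extend((ancestor, text) for _, ancestor in stack)
        stack.append((depth, text))
    return pairs


# Made-up example article, for illustration only.
article = [
    "= Tea =",
    "== Black tea ==",
    "* Darjeeling",
    "* Assam",
    "== History ==",
    "* Origins",
]
pairs = candidate_pairs(article)
for hypernym, hyponym in pairs:
    print(hypernym, "->", hyponym)
```

On this toy input the candidates include plausible pairs such as (Black tea, Darjeeling) alongside noise such as (Tea, History), which is the kind of irrelevant candidate the filtering step is meant to remove.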

