Transactions of the Japanese Society for Artificial Intelligence
Online ISSN : 1346-8030
Print ISSN : 1346-0714
ISSN-L : 1346-0714
Original Paper
Learning Query Patterns for Collecting Summary Documents Using Wikipedia as Training Data
田中 翔平, 岡崎 直観, 石塚 満
Journal: Free access

2011, Volume 26, Issue 2, pp. 366-375

Abstract

This paper presents a novel method for acquiring a set of query patterns that can retrieve documents containing important information about an entity. Given an existing Wikipedia category that should contain the entity, we first extract all entities that are the subjects of the articles in the category. From these articles, we extract triplets of the form (subject-entity, query pattern, concept) that are expected to appear in the search results for the query patterns. We then select a small set of query patterns so that, when search queries are formulated with these patterns, the overall precision and coverage of the information returned from the Web are optimized. We model this optimization problem as a Weighted Maximum Satisfiability (Weighted Max-SAT) problem. Experimental results demonstrate that the proposed method outperforms methods based on statistical measures, such as frequency and pointwise mutual information (PMI), that are widely used in relation extraction.
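
To make the selection step more concrete, the following is a minimal sketch of choosing query patterns by solving a small Weighted Max-SAT instance. The abstract does not give the actual encoding, so everything here is an assumption for illustration: the toy patterns, the concept weights, the per-pattern costs, and the clause design (one soft clause per concept rewarding coverage, one soft clause per pattern penalizing its selection as a rough stand-in for lost precision). The brute-force solver is only practical for a handful of patterns and is not the solver used in the paper.

```python
from itertools import product


def solve_weighted_max_sat(variables, clauses):
    """Exhaustively search for the assignment maximizing satisfied clause weight.

    clauses: list of (weight, literals); literals is a list of (variable, polarity)
    pairs, and a clause is satisfied if any literal matches the assignment.
    """
    best_score, best_assignment = float("-inf"), None
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        score = sum(
            weight
            for weight, literals in clauses
            if any(assignment[var] == polarity for var, polarity in literals)
        )
        if score > best_score:
            best_score, best_assignment = score, assignment
    return best_score, best_assignment


def build_clauses(retrieves, concept_weight, pattern_cost):
    """Hypothetical encoding: coverage clauses plus selection-penalty clauses."""
    clauses = []
    # Coverage: a concept's weight is earned if any selected pattern retrieves it.
    for concept, weight in concept_weight.items():
        literals = [(p, True) for p, found in retrieves.items() if concept in found]
        if literals:
            clauses.append((weight, literals))
    # Penalty: leaving a pattern unselected earns its cost, so noisy patterns
    # are only chosen when their coverage gain outweighs that cost.
    for pattern, cost in pattern_cost.items():
        clauses.append((cost, [(pattern, False)]))
    return clauses


def select_patterns(retrieves, concept_weight, pattern_cost):
    clauses = build_clauses(retrieves, concept_weight, pattern_cost)
    score, assignment = solve_weighted_max_sat(list(retrieves), clauses)
    return score, sorted(p for p, chosen in assignment.items() if chosen)


# Toy data (invented for this sketch): which concepts each pattern retrieves.
RETRIEVES = {
    "X was born in": {"birthplace", "birth date"},
    "X graduated from": {"education"},
    "X is": {"birthplace", "education", "occupation"},
}
CONCEPT_WEIGHT = {"birthplace": 3.0, "birth date": 2.0, "education": 2.0, "occupation": 1.0}
PATTERN_COST = {"X was born in": 1.0, "X graduated from": 1.5, "X is": 4.0}

if __name__ == "__main__":
    print(select_patterns(RETRIEVES, CONCEPT_WEIGHT, PATTERN_COST))
```

On this toy data the sketch selects "X graduated from" and "X was born in" with a total satisfied weight of 11.0: the broad pattern "X is" covers more concepts, but its assumed cost of 4.0 exceeds the weight of the single concept only it retrieves.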

© 2011 JSAI (The Japanese Society for Artificial Intelligence)