Journal of the Japanese Society for Artificial Intelligence (人工知能)
Online ISSN : 2435-8614
Print ISSN : 2188-2266
人工知能学会誌 (1986-2013, Print ISSN: 0912-8085)
Keyword Extraction by Optimization of Keyword Fitness
有田 正剛, 西村 健士, 島津 秀雄
Review journal / general-interest magazine, free access

1995, Vol. 10, No. 4, pp. 551-556

Abstract

Document databases are now abundant, yet users often find it difficult to retrieve target documents from them. One reason is that the keywords assigned to documents are inadequate. This paper presents a novel method for automatic keyword extraction from Japanese documents in a database. Conventionally, keywords have been extracted using various heuristics that measure the importance of individual words. This paper proposes two objective criteria for extracting keywords from a large pool of candidate words: an efficiency criterion and a recall criterion. The efficiency criterion concerns how efficiently a word can be used to retrieve a document from a database. The recall criterion concerns the likelihood that a word will be used as a keyword in database retrieval. Both criteria are quantified statistically from the distribution pattern of documents in the database, and the product of the quantified criteria gives a keyword-fitness measure for a word. Keyword extraction is implemented as an optimization of the keyword fitness by a genetic algorithm. An experimental result shows the validity of the keyword-fitness measure and suggests that it is complementary to the conventionally used heuristics.
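
The abstract does not give the formulas, so the following is only a minimal sketch of the general idea (a per-word fitness defined as the product of an efficiency term and a recall term, then a genetic algorithm that searches for a good keyword set), not the authors' actual method. Every name and formula in it is an assumption made for illustration: `word_fitness`, `set_fitness`, and `extract_keywords` are hypothetical, efficiency is approximated from how few documents contain the word, recall is approximated by the word's document frequency, and a keyword set is scored by summing the per-word fitness under a size cap.

```python
import random


def word_fitness(word, docs):
    """Hypothetical keyword fitness: efficiency * recall (stand-in formulas)."""
    n = len(docs)
    df = sum(1 for d in docs if word in d)  # document frequency of the word
    if df == 0:
        return 0.0
    efficiency = 1.0 - (df - 1) / n  # assumption: fewer matching documents is more efficient
    recall = df / n                  # assumption: frequent words are more likely to be queried
    return efficiency * recall


def set_fitness(mask, candidates, docs, max_keywords):
    """Score a keyword set (bit mask over candidates); empty or oversized sets score 0."""
    chosen = [w for w, bit in zip(candidates, mask) if bit]
    if not chosen or len(chosen) > max_keywords:
        return 0.0
    return sum(word_fitness(w, docs) for w in chosen)


def extract_keywords(target, docs, max_keywords=2, pop_size=20,
                     generations=40, mutation_rate=0.1, seed=0):
    """Toy genetic algorithm over subsets of the target document's words."""
    rng = random.Random(seed)
    candidates = sorted(target)
    n = len(candidates)

    def score(mask):
        return set_fitness(mask, candidates, docs, max_keywords)

    # Seed with every single-word set so at least one valid individual exists.
    population = [tuple(i == j for j in range(n)) for i in range(n)]
    while len(population) < pop_size:
        population.append(tuple(rng.random() < 0.5 for _ in range(n)))

    for _ in range(generations):
        ranked = sorted(population, key=score, reverse=True)
        parents = ranked[: max(2, pop_size // 2)]  # truncation selection
        children = [ranked[0]]                     # elitism: keep the best individual
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n) if n > 1 else 0
            child = a[:cut] + b[cut:]              # one-point crossover
            child = tuple((not bit) if rng.random() < mutation_rate else bit
                          for bit in child)        # bit-flip mutation
            children.append(child)
        population = children

    best = max(population, key=score)
    return [w for w, bit in zip(candidates, best) if bit]


if __name__ == "__main__":
    # Tiny made-up database: each document is a set of candidate words.
    demo_docs = [
        {"genetic", "algorithm", "keyword"},
        {"keyword", "database"},
        {"database", "retrieval"},
        {"keyword", "retrieval"},
    ]
    print(extract_keywords(demo_docs[0], demo_docs, max_keywords=2))
```

Because the stand-in set score is additive, a simple top-k ranking would solve this toy problem directly; the genetic algorithm is kept only to mirror the optimization framing in the abstract, where the real objective may not decompose so neatly.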

© 1995 The Japanese Society for Artificial Intelligence