情報知識学会研究報告会講演論文集
Online ISSN : 2432-9908
ISSN-L : 2432-9908
Proceedings of the 8th Annual Conference (fiscal year 2000) of 情報知識学会

Performance Comparison of Statistical Retrieval Methods for Japanese Text: An Empirical Study Using a Test Collection
*岸田 和明

p. 61-64

Abstract
This paper reports findings from an empirical comparison of the retrieval performance of statistical methods, namely vector space and probabilistic models. The study used a large Japanese text test collection provided by NACSIS, consisting of about 330,000 records of scientific proceedings. Each statistical method was tested with three indexing techniques for Japanese text: (1) longest matching against entries in a dictionary, (2) tokenizing at changes in character type, and (3) a simple bi-gram method. Almost no statistically significant differences among the methods were observed, although the probabilistic method based on a logistic regression model appears to perform relatively better than the others.
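The three indexing techniques named in the abstract can be illustrated with a minimal Python sketch. This is not the paper's implementation: the function names, the coarse character classes, the toy dictionary, and the sample string are all assumptions chosen for illustration, and the abstract does not say how unmatched characters or punctuation were handled.

```python
import unicodedata
from itertools import groupby


def char_class(ch):
    """Coarse script class of a character, derived from its Unicode name (assumed classes)."""
    name = unicodedata.name(ch, "")
    if name.startswith("CJK UNIFIED IDEOGRAPH"):
        return "kanji"
    if name.startswith("HIRAGANA"):
        return "hiragana"
    if name.startswith("KATAKANA"):
        return "katakana"
    if ch.isalnum():
        return "alnum"
    return "other"


def longest_match(text, dictionary):
    """(1) Greedy longest matching against dictionary entries.

    Characters where no entry starts are skipped (an assumption of this sketch).
    """
    tokens, i = [], 0
    max_len = max(map(len, dictionary), default=1)
    while i < len(text):
        for n in range(min(max_len, len(text) - i), 0, -1):
            if text[i:i + n] in dictionary:
                tokens.append(text[i:i + n])
                i += n
                break
        else:
            i += 1  # no dictionary entry starts at this position
    return tokens


def split_by_char_class(text):
    """(2) Cut the text wherever the character class changes; drop 'other' runs."""
    return ["".join(run) for cls, run in groupby(text, key=char_class) if cls != "other"]


def char_bigrams(text):
    """(3) Overlapping character bi-grams."""
    return [text[i:i + 2] for i in range(len(text) - 1)]


# Hypothetical sample input and toy dictionary, for illustration only.
sample = "情報検索の性能比較"
toy_dictionary = {"情報", "情報検索", "検索", "性能", "比較"}

by_dictionary = longest_match(sample, toy_dictionary)
by_char_class = split_by_char_class(sample)
by_bigram = char_bigrams("情報検索")
```

On this sample, longest matching prefers the entry `情報検索` over the shorter `情報`, the character-class method yields one token per kanji or hiragana run, and the bi-gram method needs no dictionary at all, which is the usual trade-off between the three approaches.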
© 2000 情報知識学会