IEEJ Transactions on Electronics, Information and Systems
Online ISSN : 1348-8155
Print ISSN : 0385-4221
ISSN-L : 0385-4221
<Software and Information Processing>
An Experimental Study of Index Length in N-gram Full-Text Search Systems
山本 裕, 辻 洋
Journal, free access

2006, Vol. 126, No. 9, pp. 1173-1180

Abstract

N-gram indexing, in which each index entry is a sequence of N consecutive characters, is the most popular method for Japanese full-text search systems. Full-text search for Japanese text usually uses a 2-gram character index as its base in order to limit the size of the index file. Although adding higher-gram indices is expected to improve search performance, there has been no experimental evaluation of such additional indices. This paper evaluates how far additional higher-gram indices improve text search performance under the Search Term Intensive Approach, which selects the terms that receive higher-gram indices according to how often they appear as search terms in application programs. In the concrete evaluation, the number of paper articles searched is one hundred thousand or two hundred thousand, and, in addition to the evaluation of additional 3-gram and 4-gram indices, a simulation is applied to additional indices of 5-gram and higher.
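
The idea of a 2-gram base index supplemented by additional entries for frequently searched terms can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the in-memory dictionaries, and the sample documents are all assumptions made for illustration, and the list of frequent terms is supplied by hand, not derived from real query logs.

```python
from collections import defaultdict


def build_ngram_index(docs, n=2):
    """Map each n-character substring to the (doc_id, position) pairs where it starts."""
    index = defaultdict(list)
    for doc_id, text in enumerate(docs):
        for pos in range(len(text) - n + 1):
            index[text[pos:pos + n]].append((doc_id, pos))
    return index


def search_with_base_index(index, term, n=2):
    """Find documents containing `term` by chaining overlapping n-grams of the base index."""
    if len(term) < n:
        raise ValueError("term must be at least n characters long")
    # Candidate start positions come from the first n-gram of the term.
    candidates = set(index.get(term[:n], []))
    # Each later n-gram must occur at the matching offset from the start position.
    for offset in range(1, len(term) - n + 1):
        postings = set(index.get(term[offset:offset + n], []))
        candidates = {(d, p) for (d, p) in candidates if (d, p + offset) in postings}
    return sorted({d for d, _ in candidates})


def build_additional_index(docs, frequent_terms):
    """Precompute postings for selected frequent search terms (hypothetical higher-gram entries)."""
    return {
        term: sorted(d for d, text in enumerate(docs) if term in text)
        for term in frequent_terms
    }


def search(term, base_index, additional_index, n=2):
    """Answer from the additional index when possible, else fall back to the base index."""
    if term in additional_index:
        return additional_index[term]
    return search_with_base_index(base_index, term, n)


# Hypothetical sample data, not taken from the paper.
docs = ["全文検索システム", "検索エンジンの全文索引", "情報システム"]
base = build_ngram_index(docs, n=2)
extra = build_additional_index(docs, ["システム"])
```

With the base index alone, a 4-character term such as "システム" requires looking up three 2-gram posting lists and checking that their positions line up. With the additional entry, the same query is a single lookup, at the cost of a larger index. That trade-off between index size and search performance is what the abstract sets out to evaluate.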

© 電気学会 2006