2006 年 126 巻 9 号 p. 1173-1180
N-gram indexing method is the most popular algorithm for the Japanese full text search system where each index consists of serial N characters. Especially the full text search for Japanese text usually has the 2-gram characters index as base in order to save the volumes of the index file. Although the additional higher-gram index is expected to improve the performance for searching indices, we have no experimental evaluation with additional higher-gram indices. This paper presents the evaluation about improving the text search performance with additional higher-gram indices by Search Term Intensive Approach which decides the term for higher-gram indices depend upon the appearance ratio in application programs as the searching term. On the concrete evaluation, the number of paper articles for searching is one or two hundred thousands, and the simulation for 5 or more gram additional indices can be applied add to evaluation for 3,4-gram additional indices.
J-STAGEがリニューアルされました! https://www.jstage.jst.go.jp/browse/-char/ja/