IEICE Transactions on Information and Systems
Online ISSN : 1745-1361
Print ISSN : 0916-8532
Special Section on Data Engineering and Information Management
Efficient Top-k Document Retrieval for Long Queries Using Term-Document Binary Matrix — Pursuit of Enhanced Informational Search on the Web —
Etsuro FUJITA, Keizo OYAMA
JOURNAL FREE ACCESS

2013 Volume E96.D Issue 5 Pages 1016-1028

Abstract

With the successful adoption of link analysis techniques such as PageRank and of web spam filtering, current web search engines support “navigational search” well. However, because they rely on a simple conjunctive Boolean filter and because user queries are often inadequate, such engines do not necessarily support “informational search” well. Informational search would be better handled by a web search engine that uses an information retrieval model combined with enhancement techniques such as query expansion and relevance feedback. Moreover, realizing such an engine requires a method for processing the model efficiently. In this paper, we propose a novel extension of an existing top-k query processing technique to improve search efficiency. We add to it a technique that uses a simple data structure called a “term-document binary matrix,” which makes the evaluation of top-k queries more efficient even when the queries have been expanded. An experimental evaluation using the TREC GOV2 data set and expanded versions of its accompanying evaluation queries shows that the proposed method speeds up evaluation considerably compared with existing techniques, especially as the number of query terms grows.
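The abstract does not describe the layout of the term-document binary matrix or the top-k technique it extends, so the following is only an illustrative sketch of the general idea under stated assumptions: one bit per (term, document) pair, set when the document contains the term, so that term membership can be tested without walking a posting list. The function names, the per-term weights, and the exhaustive scoring loop are all hypothetical and are not taken from the paper.

```python
import heapq


def build_binary_matrix(postings, num_docs):
    """Return {term: int}, where bit d of the int is set iff document d contains the term.

    Assumption: document IDs are integers in [0, num_docs).
    """
    matrix = {}
    for term, doc_ids in postings.items():
        bits = 0
        for d in doc_ids:
            if not 0 <= d < num_docs:
                raise ValueError(f"document id {d} out of range")
            bits |= 1 << d
        matrix[term] = bits
    return matrix


def contains(matrix, term, doc_id):
    """Membership test via a shift and a mask on the term's row."""
    return (matrix.get(term, 0) >> doc_id) & 1 == 1


def top_k(query_terms, weights, matrix, num_docs, k):
    """Hypothetical scoring: sum the weights of the query terms each document contains.

    This scans every document, which a real top-k processor would avoid;
    it only shows how the matrix answers "does document d contain term t?".
    Ties are broken in favour of the smaller document ID.
    """
    scored = []
    for d in range(num_docs):
        score = sum(weights[t] for t in query_terms if contains(matrix, t, d))
        if score > 0:
            scored.append((score, d))
    return heapq.nlargest(k, scored, key=lambda p: (p[0], -p[1]))
```

With long (expanded) queries, a per-term bit lookup of this kind keeps the cost of checking each additional term for a candidate document small, which is presumably where a structure like this becomes attractive; the paper itself should be consulted for the actual method and its measured gains.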

© 2013 The Institute of Electronics, Information and Communication Engineers