IEICE Transactions on Information and Systems
Online ISSN : 1745-1361
Print ISSN : 0916-8532
Regular Section
Well-Balanced Successive Simple-9 for Inverted Lists Compression
Kun JIANGYuexiang YANGQinghua ZHENG
Author information
JOURNAL FREE ACCESS

2017 Volume E100.D Issue 7 Pages 1416-1424

Details
Abstract

The growth in the amount of information available on the Internet and thousands of user queries per second brings huge challenges to the index update and query processing of search engines. Index compression is partially responsible for the current performance achievements of existing search engines. The selection of the index compression algorithms must weigh three factors, i.e., compression ratio, compression speed and decompression speed. In this paper, we study the well-known Simple-9 compression, in which exist many branch operations, table lookup and data transfer operations when processing each 32-bit machine word. To enhance the compression and decompression performance of Simple-9 algorithm, we propose a successive storage structure and processing metric to compress two successive Simple-9 encoded sequence of integers in a single data processing procedure, thus the name Successive Simple-9 (SSimple-9). In essence, the algorithm shortens the process of branch operations, table lookup and data transfer operations when compressing the integer sequence. More precisely, we initially present the data storage format and mask table of SSimple-9 algorithm. Then, for each mode in the mask table, we design and hard-code the main steps of the compression and decompression processes. Finally, analysis and comparison on the experimental results of the simulation and TREC datasets show the compression and decompression efficiency speedup of the proposed SSimple-9 algorithm.

Content from these authors
© 2017 The Institute of Electronics, Information and Communication Engineers
Previous article Next article
feedback
Top