IEEJ Transactions on Electronics, Information and Systems
Online ISSN : 1348-8155
Print ISSN : 0385-4221
ISSN-L : 0385-4221
A Non-Modal Type of Shift-JIS Text Compression by Using A Dictionary Array
ITOH (Member), Taiji SATOH

2000 Volume 120 Issue 1 Pages 14-19

Abstract

This paper proposes a new data compression method for Japanese text files written in Shift-JIS (JIS X 0208) codes. In the first pass, a dictionary array is built from the most frequent single-byte and double-byte characters. In the second pass, every registered character is replaced with its dictionary item, and the code 0xFF is written to the compressed file in front of each non-registered ASCII character to distinguish non-registered characters from registered ones. Hashing makes it possible to check in O(1) time whether an input character belongs to the dictionary and to convert its code to a dictionary item. Furthermore, run-length encoding is applied to sequences of consecutive identical characters to achieve a much higher compression ratio; the code 0xFE is the indicator that starts this encoding. A feature of the method is that it is a non-modal type of compression.
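
The two passes described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: only the roles of the 0xFF and 0xFE codes come from the abstract. Everything else is an assumption made for illustration, including the dictionary size of 254 entries, the one-byte dictionary index, the record layout (0xFE, count, character), the minimum run length of 4, the use of 0xFF in front of every non-registered character (not only ASCII ones), and the hypothetical function names. A Python dict stands in for the hash-based lookup, and the input is assumed to be well-formed Shift-JIS.

```python
from collections import Counter

ESCAPE = 0xFF       # precedes a non-registered character (role taken from the abstract)
RUN = 0xFE          # starts a run-length record (role taken from the abstract)
MAX_ENTRIES = 0xFE  # assumed: indices 0x00..0xFD, so 0xFE and 0xFF stay free
MIN_RUN = 4         # assumed: shorter runs are cheaper to emit as plain tokens


def is_lead_byte(b):
    """True if b is the first byte of a double-byte Shift-JIS character."""
    return 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xFC


def split_chars(data):
    """Split Shift-JIS bytes into single-byte and double-byte characters."""
    chars, i = [], 0
    while i < len(data):
        n = 2 if is_lead_byte(data[i]) else 1
        chars.append(data[i:i + n])
        i += n
    return chars


def compress(data):
    """Return (dictionary array, compressed payload) for Shift-JIS bytes."""
    chars = split_chars(data)
    # Pass 1: register the most frequent characters in the dictionary array.
    dictionary = [c for c, _ in Counter(chars).most_common(MAX_ENTRIES)]
    index = {c: i for i, c in enumerate(dictionary)}  # average O(1) lookup

    def token(c):
        # Registered: one index byte. Non-registered: ESCAPE + raw bytes.
        return bytes([index[c]]) if c in index else bytes([ESCAPE]) + c

    # Pass 2: replace characters with tokens, collapsing long runs.
    out, i = bytearray(), 0
    while i < len(chars):
        run = 1
        while i + run < len(chars) and chars[i + run] == chars[i] and run < 255:
            run += 1
        if run >= MIN_RUN:
            out += bytes([RUN, run]) + token(chars[i])
        else:
            out += token(chars[i]) * run
        i += run
    return dictionary, bytes(out)


def decompress(dictionary, payload):
    """Invert compress(); every token is self-describing, so no mode is kept."""
    def read_char(pos):
        # Return (character bytes, position of the next token).
        if payload[pos] == ESCAPE:
            n = 2 if is_lead_byte(payload[pos + 1]) else 1
            return payload[pos + 1:pos + 1 + n], pos + 1 + n
        return dictionary[payload[pos]], pos + 1

    out, i = bytearray(), 0
    while i < len(payload):
        if payload[i] == RUN:
            count = payload[i + 1]
            c, i = read_char(i + 2)
            out += c * count
        else:
            c, i = read_char(i)
            out += c
    return bytes(out)
```

In this sketch the dictionary array is returned separately from the payload; a real file format would also have to store it, and the abstract does not say how. Because each token begins with an index byte, 0xFF, or 0xFE, the decoder never needs to track a shift state, which is one way to read the "non-modal" property.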

© The Institute of Electrical Engineers of Japan