Journal of Information Processing and Management
Online ISSN : 1347-1597
Print ISSN : 0021-7298
ISSN-L : 0021-7298
Article
Labor saving for reprinting Japanese rare classical books : The development of the new method for OCR technology including kana and kanji characters in cursive style
Sumiko YAMAMOTO, Tomejiro OSAWA

2016 Volume 58 Issue 11 Pages 819-827

Abstract
Most modern Japanese readers cannot read rare Japanese classical books written in cursive-style kana and kanji, and the large number of such surviving books makes their contents even harder to access. To reduce the heavy workload of reprinting these books, we developed a new OCR method and showed, in proof-of-principle tests on books containing cursive-style kana and kanji, that it can automatically generate text data with more than 80% precision under fixed conditions. In the new OCR method, character images are extracted together with their position information and a database of ideographic variants is constructed; the character codes of the rare classical books to be reprinted are then identified from this database by a similar-kanji retrieval method. In addition, rather than pursuing full automation, we aim to reduce the overall reprinting workload through a workflow design that combines automatic processing with manual work. We report the structure of the new OCR method and the current state of reprinting with it.
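The retrieval step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the image features, the similarity measure, or the database layout, so everything here is an assumption made for illustration. The names `Glyph`, `VARIANT_DB`, `similarity`, `identify`, and `transcribe` are hypothetical, the 3x3 bitmaps are toy data, pixel agreement stands in for whatever similar-kanji retrieval the method actually uses, and the `threshold` value is arbitrary. The sketch also assumes vertical text read in columns from right to left.

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    """A character image cut from a page, with its position (hypothetical layout)."""
    bitmap: tuple  # rows of '0'/'1' strings; all bitmaps share one size
    x: int         # horizontal position of the glyph on the page
    y: int         # vertical position of the glyph on the page


# Toy stand-in for a variant database: (character, reference bitmap) pairs.
VARIANT_DB = [
    ("一", ("000", "111", "000")),
    ("二", ("111", "000", "111")),
    ("十", ("010", "111", "010")),
]


def similarity(a, b):
    """Fraction of pixels that agree between two same-sized bitmaps."""
    total = sum(len(row) for row in a)
    same = sum(pa == pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return same / total


def identify(glyph, variant_db):
    """Return (character, score) for the most similar database entry."""
    return max(
        ((char, similarity(glyph.bitmap, ref)) for char, ref in variant_db),
        key=lambda pair: pair[1],
    )


def transcribe(glyphs, variant_db, threshold=0.7, unknown="〓"):
    """Order glyphs by position and map each to a character.

    Assumes vertical writing: columns run right to left (larger x first),
    and characters run top to bottom within a column (smaller y first).
    Matches scoring below `threshold` are replaced by `unknown` so that
    a human can review them, mirroring a semi-automatic workflow.
    """
    ordered = sorted(glyphs, key=lambda g: (-g.x, g.y))
    chars = []
    for glyph in ordered:
        char, score = identify(glyph, variant_db)
        chars.append(char if score >= threshold else unknown)
    return "".join(chars)


page = [
    Glyph(("101", "010", "101"), x=0, y=0),   # matches nothing well
    Glyph(("010", "111", "011"), x=10, y=5),  # noisy "十"
    Glyph(("000", "111", "000"), x=10, y=0),  # clean "一"
]
print(transcribe(page, VARIANT_DB))
```

Replacing low-scoring matches with a placeholder rather than a best guess reflects the workflow idea in the abstract, where automatic processing handles the confident cases and manual work covers the rest.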
© 2016 Japan Science and Technology Agency