Host: The Institute of Image Electronics Engineers of Japan
Name : Proceedings of the 49th Annual Conference of the Institute of Image Electronics Engineers of Japan 2021
Number : 49
Location : [in Japanese]
Date : June 24, 2021 - June 26, 2021
In recent years, with the spread of paperless workflows and optical character recognition (OCR), binarizing digitized document images has become increasingly important. However, when a document is degraded by stains or strike-throughs, or when its text is blurred or partly missing, binarization suffers from over-extraction or loss of text regions, and good binary results are not obtained. In this study, we propose a binarization method that reduces over-extraction and region loss, and evaluate it quantitatively, with the main aim of verifying the morphological operations applied to such degraded document images. In binarization experiments on the image datasets provided by DIBCO, the effectiveness of the method was confirmed using precision, recall, the F-measure derived from precision and recall, and RMSE. In addition, the threshold value of the Canny edge detector used in the method was confirmed to be an important factor.
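
The evaluation measures named in the abstract can be illustrated with a minimal sketch. It assumes that images are flattened to sequences of 0/1 values with 1 marking a text pixel, that precision (sometimes rendered as "matching rate") and recall are computed over text pixels, and that RMSE is taken over the raw 0/1 pixel values. The abstract does not give the exact definitions used in the paper, so the function name `binarization_scores` and these conventions are illustrative assumptions, not the authors' implementation.

```python
import math


def binarization_scores(pred, truth):
    """Compare a binarized image with its ground truth.

    pred, truth: equal-length sequences of 0/1 values (hypothetical
    convention: 1 = text pixel, 0 = background pixel).
    """
    if len(pred) != len(truth) or len(pred) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    # Count text pixels that were found, wrongly added, or missed.
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)

    # Precision drops with over-extraction; recall drops with region loss.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0

    # F-measure: harmonic mean of precision and recall.
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0

    # RMSE over pixel values (assumed here to be 0/1).
    rmse = math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(pred))

    return {"precision": precision, "recall": recall,
            "f_measure": f_measure, "rmse": rmse}


if __name__ == "__main__":
    predicted = [1, 1, 0, 0, 1, 0]
    reference = [1, 0, 0, 1, 1, 0]
    print(binarization_scores(predicted, reference))
```

Under these assumptions, over-extraction shows up as false positives and lowers precision, while lost text regions show up as false negatives and lower recall, which is why the two are combined in the F-measure.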