Abstract
In this paper, we propose a method for extracting characters from scene images based on identification of a local target area and adaptive thresholding. The method proceeds as follows. A scene image is decomposed into a lightness image and a saturation image using an HSL transform. Vertical and horizontal edge images are generated from these two images, then binarized and thinned. Corresponding prominent features in the saturation and lightness images are detected using the Hough transform. A region between straight vertical lines is then extracted as a signboard region candidate with reference to the edge histogram. Each character region within the signboard region candidate is binarized using a threshold determined adaptively for that region. The resulting binary image is then analyzed, and the linear region containing the most character strings is identified as the character string region. The technique was applied to 100 scene images to verify the reliability of character extraction. Of the 450 characters in these images, 438 were extracted correctly, an extraction rate of 97.3%. Character strings were extracted correctly in 98 of the 100 strings examined.
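
As a rough illustration of two steps named above, the following sketch splits an RGB image into lightness and saturation planes and binarizes a rectangular region with a threshold computed from that region alone. It is a minimal sketch under stated assumptions, not the implementation evaluated here: the abstract does not specify the thresholding rule, so Otsu's method is swapped in as a stand-in, and the function names (`lightness_saturation`, `otsu_threshold`, `binarize_region`) and the list-of-tuples image layout are hypothetical.

```python
import colorsys


def lightness_saturation(rgb_image):
    """Split an RGB image (rows of (r, g, b) tuples, 0-255) into
    lightness and saturation planes with values in [0, 1]."""
    light, sat = [], []
    for row in rgb_image:
        l_row, s_row = [], []
        for r, g, b in row:
            # colorsys returns (hue, lightness, saturation)
            _, l, s = colorsys.rgb_to_hls(r / 255, g / 255, b / 255)
            l_row.append(l)
            s_row.append(s)
        light.append(l_row)
        sat.append(s_row)
    return light, sat


def otsu_threshold(values, bins=256):
    """Assumed stand-in for the paper's adaptive rule: Otsu's method.
    Returns a threshold in [0, 1]; values >= threshold are the bright class."""
    hist = [0] * bins
    for v in values:
        hist[min(int(v * bins), bins - 1)] += 1
    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(bins):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        # Between-class variance (up to a constant factor)
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return (best_t + 1) / bins


def binarize_region(plane, top, left, bottom, right):
    """Binarize plane[top:bottom][left:right] with a threshold
    computed from that region only (1 = bright, 0 = dark)."""
    values = [plane[y][x] for y in range(top, bottom) for x in range(left, right)]
    t = otsu_threshold(values)
    return [[1 if plane[y][x] >= t else 0 for x in range(left, right)]
            for y in range(top, bottom)]
```

Computing the threshold per region rather than once per image is what makes the binarization adaptive in this sketch: a signboard with uneven illumination can receive a different threshold for each character region. Edge extraction, thinning, and the Hough transform are omitted.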