Pages 7-14
This paper describes an OCR system that can handle scientific documents containing mathematical expressions. After extracting the lines from a scanned image, we segment each line into Japanese text and mathematical parts, using the candidates and scores produced by character recognition. To analyze the structure of the mathematical parts, we apply a top-down recognition algorithm for mathematical expressions, in which the normalized sizes of letters and symbols play an important role. Our algorithms work reliably on nearly noise-free images. After this image-to-TeX conversion, a braille document is produced with our LaTeX-to-braille translation system.