In this study, to promote the translation and digitization of historical documents, we attempted to recognize Japanese classical kuzushiji characters by using the dataset released by the Center for Open Data in the Humanities (CODH). Using deep learning, which has undergone remarkable development in the field of image classification, we analyzed how successfully deep learning could classify more than 1,500-class kuzushiji characters through experiments, and what made it difficult to classify kuzushiji characters. In addition, we introduced a method to automatically eliminate characters that were difficult to classify or characters that were not used during training. In an actual translation work, those ambiguous characters can be left as unknown characters as “ 〓 (geta)” and pass on to the expert to make a decision. Finally, our experiments showed that the classification rate was improved from 72.10% to over 90% by performing the data augmentation and classifying only characters with high confidence and confirmed the effectiveness of our proposed method.
抄録全体を表示