IEICE Transactions on Information and Systems
Online ISSN : 1745-1361
Print ISSN : 0916-8532
Handwritten Character Image Generation for Effective Data Augmentation
Chee Siang LEOW, Tomoki KITAGAWA, Hideaki YAJIMA, Hiromitsu NISHIZAKI
JOURNAL FREE ACCESS Advance online publication

Article ID: 2024EDP7201

Abstract

This study introduces data augmentation techniques to enhance training datasets for a Japanese handwritten character classification model, addressing the high cost of collecting extensive handwritten character data. A novel method is proposed to automatically generate a large-scale dataset of handwritten characters from a smaller dataset using a style transformation approach, specifically Adaptive Instance Normalization (AdaIN). Additionally, the study presents a technique that improves character structural information by integrating features from the Contrastive Language-Image Pre-training (CLIP) text encoder. This approach enables the creation of diverse handwritten character images, including Kanji, by merging content and style elements. The effectiveness of the approach is demonstrated by evaluating a handwritten character classification model trained on an expanded dataset that includes Japanese hiragana, katakana, and Kanji from the ETL Character Database. The classification model's macro F1 score improved from 0.9733 with the original dataset to 0.9861 with the dataset augmented by the proposed approach. This result indicates that the proposed character generation model can generate new character images not included in the original dataset, and that these images contribute effectively to training the handwritten character classification model.
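
For readers unfamiliar with AdaIN, the following is a minimal sketch of the operation as it is commonly defined in the style-transfer literature: the content feature map is normalized per channel and then rescaled and shifted to match the per-channel statistics of the style feature map. It is not the authors' generation model. The function name `adain`, the `(C, H, W)` array layout, and the `eps` value are assumptions made for illustration only.

```python
import numpy as np


def adain(content, style, eps=1e-5):
    """Illustrative AdaIN on feature maps of shape (C, H, W).

    Hypothetical helper, not taken from the paper: it aligns the
    per-channel mean and standard deviation of `content` with
    those of `style`.
    """
    # Per-channel statistics over the spatial dimensions.
    c_mean = content.mean(axis=(1, 2), keepdims=True)
    c_std = content.std(axis=(1, 2), keepdims=True) + eps
    s_mean = style.mean(axis=(1, 2), keepdims=True)
    s_std = style.std(axis=(1, 2), keepdims=True) + eps

    # Normalize the content features, then apply the style statistics.
    return s_std * (content - c_mean) / c_std + s_mean
```

In a generator of the kind the abstract describes, `content` would presumably come from an encoder applied to a character's structure and `style` from an encoder applied to a handwriting sample; that wiring is an assumption here and is not specified in the abstract.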

© 2025 The Institute of Electronics, Information and Communication Engineers