Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
General Paper (Peer-Reviewed)
Construction of an Error-Tagged Evaluation Corpus for Japanese Grammatical Error Correction
Aomi Koyama, Tomoshige Kiyuna, Kenji Kobayashi, Mio Arai, Masato Mita, Teruaki Oka, Mamoru Komachi
JOURNAL FREE ACCESS

2023 Volume 30 Issue 2 Pages 330-371

Abstract

This study constructed an error-tagged evaluation corpus for Japanese grammatical error correction (GEC). Evaluation corpora are essential for assessing model performance. For English GEC, the availability of various evaluation corpora has enabled comprehensive comparison between models and supported the growth of the research community. For Japanese GEC, by contrast, progress has been hindered by the lack of available evaluation corpora. We therefore constructed a new evaluation corpus for Japanese GEC and made it publicly available. To create the corpus, we used texts written by learners of Japanese in the Lang-8 corpus, a representative learner corpus in GEC. The corpus specification was adjusted to align with the representative corpora and tools of English GEC, so that GEC researchers and developers can use it easily. Finally, we evaluated representative GEC models on the corpus and reported baseline scores for future work on Japanese GEC.

© 2023 The Association for Natural Language Processing