Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
Paper
Particle Error Correction of Japanese Learners from Small Error Data
Kenji Imamura, Kuniko Saito, Kugatsu Sadamitsu, Hitoshi Nishikawa

2012 Volume 19 Issue 5 Pages 381-400

Abstract
This paper presents a method for correcting grammatical errors in Japanese particles written by non-native learners of Japanese. The method is based on discriminative sequence conversion, which corrects particle errors by substitution, insertion, or deletion. Large learner corpora are difficult to collect for this kind of error correction task. We address this problem within a discriminative learning framework using two techniques. First, language model probabilities obtained from large Japanese corpora are combined with n-gram binary features obtained from the learner corpora, so that the correctness of a Japanese sentence can be measured. Second, automatically generated pseudo-error sentences are added to the learner corpora to enlarge them directly. We further apply domain adaptation, in which the pseudo-error sentences (the source domain) are adapted to the real-error sentences (the target domain). Experimental results show that recall improves when the language model probabilities and the n-gram binary features are used together, and that using pseudo-error sentences with domain adaptation yields stable improvement.
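
The pseudo-error idea in the abstract can be illustrated with a minimal sketch. It is not the authors' generation procedure, which the abstract does not specify. The particle list `PARTICLES`, the function name `make_pseudo_error`, and the error probabilities are all assumptions chosen for illustration. The sketch takes a correct, pre-tokenized sentence and injects particle errors of the three kinds the abstract names (substitution, insertion, deletion).

```python
import random

# Hypothetical particle inventory; the abstract does not list the set used in the paper.
PARTICLES = ["は", "が", "を", "に", "で", "と", "へ", "も"]


def make_pseudo_error(tokens, rng, p_sub=0.1, p_del=0.05, p_ins=0.02):
    """Return a copy of `tokens` with particle errors injected at random.

    The probabilities are illustrative assumptions, not values from the paper.
    """
    out = []
    for tok in tokens:
        if tok in PARTICLES:
            r = rng.random()
            if r < p_sub:
                # Substitution: replace the particle with a different one.
                out.append(rng.choice([p for p in PARTICLES if p != tok]))
            elif r < p_sub + p_del:
                # Deletion: drop the particle entirely.
                continue
            else:
                out.append(tok)
        else:
            out.append(tok)
            if rng.random() < p_ins:
                # Insertion: add a spurious particle after a non-particle token.
                out.append(rng.choice(PARTICLES))
    return out


# Usage: pair each generated sentence with its correct original as training data.
rng = random.Random(0)
correct = ["私", "は", "学校", "に", "行く"]
pseudo = make_pseudo_error(correct, rng)
```

Each (pseudo-error, correct) pair could then supplement the real learner data. How such pairs are weighted against real errors is the role the abstract assigns to domain adaptation.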
© 2012 The Association for Natural Language Processing