Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
Paper
Development and Applications of an English Learner Corpus with Multiple Information Tags
Keiji Yasuda, Keisuke Kitamura, Seiichi Yamamoto, Masuzo Yanagida
JOURNAL FREE ACCESS

2009 Volume 16 Issue 4 Pages 4_47-4_63

Abstract
This paper introduces an English learner corpus built for the research and development of an e-learning system, and reports analyses of the corpus and application experiments. The corpus consists of English sentences translated from Japanese by Japanese learners of English; each learner translated 300 Japanese sentences into English. The learners' English proficiency was measured with TOEIC. Reference translations produced by bilinguals were also collected so that translation quality could be evaluated automatically. The experiments used the automatic scores BLEU, NIST, WER, PER, METEOR, and GTM. Of these, GTM showed the highest correlation between an automatic score and TOEIC score, 0.74. Adding 4 parameters (sentence length, word length of the learners' translations, etc.) in a multiple linear regression analysis improved the correlation to 0.76.
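The two correlation figures in the abstract correspond to two standard computations: a Pearson correlation between one automatic score and TOEIC, and a multiple correlation obtained by regressing TOEIC on the score plus additional parameters. The following is a minimal sketch of that comparison, not the authors' implementation. All numbers are made-up values for six hypothetical learners, the names `metric`, `avg_len`, and `toeic` are illustrative assumptions, and only one extra parameter is used instead of the four mentioned above.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_linear(features, targets):
    """Ordinary least squares with an intercept, via the normal equations."""
    rows = [[1.0] + list(f) for f in features]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve(xtx, xty)


def predict(coefs, features):
    """Apply fitted coefficients (intercept first) to each feature row."""
    return [coefs[0] + sum(c * v for c, v in zip(coefs[1:], f)) for f in features]


# Made-up values for six hypothetical learners (NOT data from the paper).
metric = [0.42, 0.55, 0.48, 0.63, 0.70, 0.58]  # automatic score per learner
avg_len = [7.1, 8.4, 7.9, 8.0, 9.6, 9.0]       # mean translation length in words
toeic = [450, 600, 520, 640, 810, 730]         # proficiency score per learner

# Correlation using the automatic score alone.
r_single = pearson(metric, toeic)

# Multiple correlation: correlate the regression's fitted values with TOEIC.
features = list(zip(metric, avg_len))
coefs = fit_linear(features, toeic)
r_multiple = pearson(predict(coefs, features), toeic)

print(f"single-score r = {r_single:.3f}, multiple R = {r_multiple:.3f}")
```

When the regression is fitted and evaluated on the same learners, as here, the multiple correlation can never be lower than the single-score correlation, so a small gain from added parameters is best confirmed on held-out data.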
© 2009 The Association for Natural Language Processing