Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
Paper
Evaluating Translation Quality with Word Order Correlations
Tsutomu Hirao, Hideki Isozaki, Katsuhito Sudoh, Kevin Duh, Hajime Tsukada, Masaaki Nagata

2014 Volume 21 Issue 3 Pages 421-444

Abstract
Automatic evaluation of Machine Translation (MT) quality is essential for developing high-quality MT systems. Various evaluation metrics have been proposed, and among them BLEU is widely used as the de facto standard. BLEU counts the N-grams common to the reference and hypothesis translations, whereas ROUGE-L counts their longest common subsequences. However, these methods have some problems. People give high scores to Rule-based MT (RBMT), but these metrics do not, because RBMT tends to use alternative words. Conventional metrics penalize word differences severely, whereas people accept such differences as long as the translation has the same meaning. Statistical MT (SMT), in turn, tends to translate “A because B” as “B because A” when translating between Japanese and English. BLEU does not take global word order into account, so this severe mistake is hardly penalized. To take global word order into account, this paper proposes a lenient automatic evaluation metric based on the rank correlation of word order. By focusing only on the words common to the two translations, the method is lenient toward the use of alternative words. Word differences are measured by word precision, whose weight is controlled by a parameter. In experiments on submissions to the Patent Translation tasks of NTCIR-7 and NTCIR-9, the proposed method outperforms conventional measures in system-level comparison.
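The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the choice of Kendall's tau as the rank correlation, its rescaling to [0, 1], the restriction to words that occur exactly once in each sentence, the multiplicative form `precision ** alpha`, the default `alpha=0.25`, and all function names are assumptions made for illustration only.

```python
from itertools import combinations


def aligned_ranks(reference, hypothesis):
    """Return the reference positions of shared words, in hypothesis order.

    Assumption: only words that occur exactly once in each sentence are
    aligned; the abstract does not say how repeated words are handled.
    """
    ref_unique = {w: i for i, w in enumerate(reference)
                  if reference.count(w) == 1}
    return [ref_unique[w] for w in hypothesis
            if w in ref_unique and hypothesis.count(w) == 1]


def normalized_kendall_tau(ranks):
    """Kendall's tau over distinct ranks, rescaled from [-1, 1] to [0, 1]."""
    n = len(ranks)
    if n < 2:
        return 0.0  # assumption: too few shared words to judge word order
    # A pair is concordant when both sentences order the two words alike.
    concordant = sum(1 for a, b in combinations(ranks, 2) if a < b)
    total_pairs = n * (n - 1) // 2
    tau = 2.0 * concordant / total_pairs - 1.0
    return (tau + 1.0) / 2.0


def word_order_score(reference, hypothesis, alpha=0.25):
    """Word-order correlation weighted by word precision.

    alpha is a hypothetical name for the parameter that controls how much
    differing words are penalized; alpha=0 ignores word differences.
    """
    if not hypothesis:
        return 0.0
    ranks = aligned_ranks(reference, hypothesis)
    precision = len(ranks) / len(hypothesis)
    return normalized_kendall_tau(ranks) * precision ** alpha


reference = "he caught a cold because it rained".split()
swapped = "it rained because he caught a cold".split()
alternative = "he caught a cold since it rained".split()

swapped_score = word_order_score(reference, swapped)
alternative_score = word_order_score(reference, alternative)
```

Under these assumptions, the clause-swapped hypothesis keeps every word but scores only 1/3, because just 7 of its 21 word pairs preserve the reference order. The hypothesis that replaces "because" with "since" keeps the order of the six shared words and loses only a small amount through the precision term, which is the lenient behavior the abstract describes.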
© 2014 The Association for Natural Language Processing