Journal of Information Processing
Online ISSN : 1882-6652
ISSN-L : 1882-6652
A Comprehensive Empirical Comparison of Domain Adaptation Methods for Neural Machine Translation
Chenhui Chu, Raj Dabre, Sadao Kurohashi

2018, Vol. 26, pp. 529-538

Abstract

Neural machine translation (NMT) has shown very promising results when large parallel corpora are available. However, for low-resource domains, vanilla NMT cannot give satisfactory performance because it overfits the small parallel corpora. Two categories of domain adaptation approaches have been proposed for low-resource NMT: adaptation using out-of-domain parallel corpora, and adaptation using in-domain monolingual corpora. In this paper, we conduct a comprehensive empirical comparison of the methods in both categories. For domain adaptation using out-of-domain parallel corpora, we further propose a novel domain adaptation method named mixed fine tuning, which combines two existing methods, namely fine tuning and multi-domain NMT. For domain adaptation using in-domain monolingual corpora, we compare two existing methods, namely language model fusion and synthetic data generation. In addition, we propose a method that combines these two categories. We empirically compare all the methods and discuss their benefits and shortcomings. To the best of our knowledge, this is the first work on a comprehensive empirical comparison of domain adaptation methods for NMT.
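The mixed fine tuning idea described above can be illustrated with a small data-preparation sketch: prepend a domain token to each sentence pair (as in multi-domain NMT), oversample the small in-domain corpus so it is not swamped by the out-of-domain data, then mix the two corpora for the fine-tuning stage. The function name, the domain tokens `<2in>`/`<2out>`, and the oversampling heuristic below are illustrative assumptions, not the paper's exact procedure.

```python
# Hypothetical sketch of mixed fine tuning data preparation:
# tag sentence pairs by domain, oversample the in-domain corpus,
# and shuffle the combined corpus for the fine-tuning stage.
import math
import random

def prepare_mixed_corpus(in_domain, out_domain, seed=0):
    """Return a shuffled list of (domain_tag, src, tgt) triples in which
    the in-domain pairs are oversampled to roughly match the
    out-of-domain corpus size (an assumed balancing heuristic)."""
    tagged_out = [("<2out>", src, tgt) for src, tgt in out_domain]
    # Oversampling factor so both domains contribute comparable amounts.
    factor = max(1, math.ceil(len(out_domain) / max(1, len(in_domain))))
    tagged_in = [("<2in>", src, tgt) for src, tgt in in_domain] * factor
    mixed = tagged_out + tagged_in
    random.Random(seed).shuffle(mixed)
    return mixed
```

The resulting mixed corpus would be used to continue training a model already trained on the out-of-domain data, which is what distinguishes mixed fine tuning from plain fine tuning on in-domain data alone.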

© 2018 by the Information Processing Society of Japan