Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
Paper
Multi-domain Adaptation for Statistical Machine Translation Based on Feature Augmentation
Kenji Imamura, Eiichiro Sumita
JOURNAL FREE ACCESS

2017 Volume 24 Issue 4 Pages 597-618

Abstract

Domain adaptation is a major challenge when machine translation is applied to practical tasks. In this study, we present domain adaptation methods for machine translation that assume multiple domains. The proposed methods combine two types of models: a corpus-concatenated model covering multiple domains and single-domain models that are accurate but sparse in specific domains. We combine the advantages of both models using feature augmentation, a domain adaptation technique from machine learning; in contrast, the conventional method of feature augmentation for machine translation uses a single model. Our experimental results show that the translation quality of the proposed method matched or exceeded that of the single-domain models. The proposed method is extremely effective in low-resource domains. Even in domains with a million bilingual sentences, translation quality was at least preserved and, in some domains, improved. These results demonstrate that state-of-the-art domain adaptation can be achieved with appropriate model selection and appropriate settings, even when standard log-linear models are used.
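
As an illustration of the general idea of feature augmentation, the following minimal Python sketch builds an augmented feature vector with one common block and one block per domain. It is not the authors' implementation: the function names `augment` and `log_linear_score`, the domain labels, and the choice to fill the common block from a corpus-concatenated model and the active domain block from a single-domain model are assumptions made for illustration, based only on the description in the abstract.

```python
from typing import List, Sequence


def augment(common_feats: Sequence[float],
            domain_feats: Sequence[float],
            domain: str,
            domains: Sequence[str]) -> List[float]:
    """Build an augmented feature vector: [common | domain_1 | ... | domain_K].

    common_feats: scores assumed to come from a corpus-concatenated model.
    domain_feats: scores assumed to come from the single-domain model of `domain`.
    Blocks belonging to every other domain are filled with zeros.
    """
    if domain not in domains:
        raise ValueError(f"unknown domain: {domain}")
    zeros = [0.0] * len(domain_feats)
    augmented = list(common_feats)  # common block, shared by all domains
    for d in domains:
        # Only the block of the sentence's own domain carries values.
        augmented.extend(domain_feats if d == domain else zeros)
    return augmented


def log_linear_score(weights: Sequence[float], feats: Sequence[float]) -> float:
    """Standard log-linear score: dot product of weights and features."""
    if len(weights) != len(feats):
        raise ValueError("weights and features must have the same length")
    return sum(w * x for w, x in zip(weights, feats))


# Hypothetical example with two domains and two features per block.
vec = augment([-1.0, -2.0], [-0.5, -1.5], "medical", ["news", "medical"])
print(vec)  # [-1.0, -2.0, 0.0, 0.0, -0.5, -1.5]
```

Because each domain has its own block, a single weight vector can hold both shared weights (for the common block) and domain-specific weights, and the zero-filled blocks mean that data from one domain does not affect another domain's specific weights.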

© 2017 The Association for Natural Language Processing