Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
Paper
Hierarchical Sub-sentential Alignment with IBM Models for Statistical Phrase-based Machine Translation
Hao Wang, Yves Lepage
JOURNAL FREE ACCESS

2017 Volume 24 Issue 4 Pages 619-646

Abstract

In this paper, we describe a novel method for joint word alignment and symmetrization. Starting from initial parameters estimated with simple IBM models, we synchronously parse each parallel sentence pair under bracketing transduction grammar (BTG) constraints. Our two-phase method runs in nearly the same time as fast_align while delivering better alignments on distantly related language pairs such as English–Japanese. We show how to integrate this method into a standard phrase-based SMT pipeline. Although the alignment quality results are mixed, forcing all words to be aligned (1-to-many or many-to-1) significantly reduces the phrase table size with no difference in translation quality, and our method even outperforms fast_align in some end-to-end translation experiments.
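As an illustration of the idea summarized in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes a matrix w of positive word-to-word association scores (for example, lexical translation probabilities from an IBM model, combined across both directions) and recursively bisects it in either straight or inverted order, as a BTG derivation would. The split criterion used here, a normalized-cut style score, is an assumption chosen for illustration; the names btg_align, block_sum and ncut are hypothetical. Recursion stops when a block has a single source or target word, so every word ends up aligned (1-to-many or many-to-1).

```python
from typing import List, Optional, Tuple

Matrix = List[List[float]]


def block_sum(w: Matrix, i0: int, i1: int, j0: int, j1: int) -> float:
    """Sum of scores in rows [i0, i1) and columns [j0, j1)."""
    return sum(w[i][j] for i in range(i0, i1) for j in range(j0, j1))


def ncut(keep_a: float, keep_b: float, cut: float) -> float:
    """Normalized-cut style score (assumed criterion): lower is better."""
    eps = 1e-12  # guards against division by zero on all-zero blocks
    return cut / (cut + 2 * keep_a + eps) + cut / (cut + 2 * keep_b + eps)


def btg_align(w: Matrix, i0: int = 0, i1: Optional[int] = None,
              j0: int = 0, j1: Optional[int] = None) -> List[Tuple[int, int]]:
    """Return alignment links (source index, target index) for block w[i0:i1][j0:j1]."""
    if i1 is None:
        i1 = len(w)
    if j1 is None:
        j1 = len(w[0])
    # Base case: one source word or one target word left, so link everything
    # in the block (this yields 1-to-many / many-to-1 links, never unaligned words).
    if i1 - i0 == 1 or j1 - j0 == 1:
        return [(i, j) for i in range(i0, i1) for j in range(j0, j1)]

    best = None  # (score, split row, split column, inverted?)
    for i in range(i0 + 1, i1):
        for j in range(j0 + 1, j1):
            tl = block_sum(w, i0, i, j0, j)  # top-left
            tr = block_sum(w, i0, i, j, j1)  # top-right
            bl = block_sum(w, i, i1, j0, j)  # bottom-left
            br = block_sum(w, i, i1, j, j1)  # bottom-right
            candidates = (
                (ncut(tl, br, tr + bl), False),  # straight: keep TL and BR
                (ncut(tr, bl, tl + br), True),   # inverted: keep TR and BL
            )
            for score, inverted in candidates:
                if best is None or score < best[0]:
                    best = (score, i, j, inverted)

    _, i, j, inverted = best
    if inverted:
        return btg_align(w, i0, i, j, j1) + btg_align(w, i, i1, j0, j)
    return btg_align(w, i0, i, j0, j) + btg_align(w, i, i1, j, j1)
```

Calling btg_align on a score matrix returns a list of (source, target) index pairs. Because this sketch recomputes block sums at every candidate split, it is far slower than a practical implementation would need to be; it is meant only to show the recursive straight/inverted bisection.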

© 2017 The Association for Natural Language Processing