Improving Word Alignment Using a Translation Model Obtained by Retraining

山田 節夫; 永田 昌明; 山田 賢治

doi:10.5715/jnlp.12.2_175

Abstract

A statistical machine translation model has two components: a language model and a translation model. This paper describes how to improve the translation model, measured by word alignment quality, by using the common word pairs extracted by two asymmetric learning approaches. One set of word pairs is extracted by Viterbi alignment using a translation model; the other set is extracted by Viterbi alignment using a second translation model trained with the source and target languages reversed. The common word pairs are those that appear in both sets. We conducted experiments using English and Japanese. Our method improves the quality of an original translation model by 5.7%, independent of the training domain and the translation model. We also show that common word pairs are almost as useful as regular dictionary entries for training purposes. Moreover, we describe the effects of the common word pairs when the learning process is iterated and when the amount of training data is varied.
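The extraction of common word pairs described in the abstract can be illustrated with a minimal sketch. It assumes each Viterbi alignment is available as a set of token-index pairs for one sentence pair; the function name, the data layout, and the toy sentence are hypothetical and are not taken from the paper.

```python
def common_word_pairs(en_tokens, ja_tokens, en_to_ja_links, ja_to_en_links):
    """Return word pairs linked in both alignment directions for one sentence pair.

    en_to_ja_links: set of (en_index, ja_index) links from one translation model
    ja_to_en_links: set of (ja_index, en_index) links from the reversed model
    (Both representations are assumptions made for this illustration.)
    """
    # Flip the reversed-direction links so both sets use (en_index, ja_index).
    flipped = {(e, j) for (j, e) in ja_to_en_links}
    # Keep only the links that both directions agree on.
    shared = en_to_ja_links & flipped
    return [(en_tokens[e], ja_tokens[j]) for (e, j) in sorted(shared)]


# Toy example with hand-written alignments (not real model output).
en = ["I", "read", "books"]
ja = ["私", "は", "本", "を", "読む"]
en_to_ja = {(0, 0), (1, 4), (2, 2), (2, 3)}
ja_to_en = {(0, 0), (1, 0), (2, 2), (4, 1)}

pairs = common_word_pairs(en, ja, en_to_ja, ja_to_en)
print(pairs)
```

In this sketch, links proposed by only one direction, such as "books" to "を", are dropped. The surviving pairs are the kind of entries that, according to the abstract, can serve in training much like regular dictionary entries.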
