Information and Media Technologies
Online ISSN : 1881-0896
ISSN-L : 1881-0896
Media (processing) and Interaction
Unlabeled Dependency Parsing Based Pre-reordering for Chinese-to-Japanese SMT
Dan Han, Pascual Martínez-Gómez, Yusuke Miyao, Katsuhito Sudoh, Masaaki Nagata
Journal article, free access

2014, Volume 9, Issue 3, pp. 272-301

Abstract
In statistical machine translation, Chinese and Japanese form a well-known long-distance language pair that poses difficulties for word alignment techniques. Pre-reordering methods have proven efficient and effective; however, they need reliable parsers to extract the syntactic structure of the source sentences. First, we propose a framework that uses only part-of-speech (POS) tags and unlabeled dependency parse trees, in order to minimize the influence of parse errors, and that encodes linguistic knowledge of structural differences in the form of reordering rules. We show significant improvements in translation quality on news-domain sentences over state-of-the-art reordering methods. Second, we explore the relationship between dependency parsing and our pre-reordering method from two aspects: POS tags and dependencies. We observe the effects of different parse errors on reordering performance by combining empirical and descriptive approaches. In the empirical approach, we quantify the distribution of general parse errors along with reordering quality. In the descriptive approach, we extract seven influential error patterns and examine their correlations with reordering errors.
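To make the idea of rule-based pre-reordering over POS tags and unlabeled dependencies concrete, the following is a minimal sketch under stated assumptions, not the rule set of the paper. It assumes a projective tree with a single root, uses a hypothetical `Token` type and `reorder` function, and applies one toy rule: a token tagged `VV` is placed after all of its dependents, which moves a Chinese SVO clause toward Japanese-like SOV order. The choice of `VV` as the only verb tag is also an assumption made for illustration.

```python
from dataclasses import dataclass

# Assumption: only the tag "VV" triggers the toy head-final rule.
VERB_TAGS = {"VV"}


@dataclass
class Token:
    form: str  # surface word
    pos: str   # part-of-speech tag
    head: int  # index of the head token, -1 for the root (no dependency label)


def reorder(tokens):
    """Return the word forms in a toy pre-reordered (verb-final) order."""
    children = {i: [] for i in range(len(tokens))}
    root = None
    for i, tok in enumerate(tokens):
        if tok.head == -1:
            root = i
        else:
            children[tok.head].append(i)  # stays sorted by position
    if root is None:
        raise ValueError("no root token (head == -1) found")

    def linearize(i):
        left = [c for c in children[i] if c < i]
        right = [c for c in children[i] if c > i]
        order = []
        for c in left:
            order.extend(linearize(c))
        if tokens[i].pos in VERB_TAGS:
            # Toy rule: emit right dependents first, then the verb itself.
            for c in right:
                order.extend(linearize(c))
            order.append(i)
        else:
            # All other heads keep their original position.
            order.append(i)
            for c in right:
                order.extend(linearize(c))
        return order

    return [tokens[i].form for i in linearize(root)]


# "我 吃 苹果" (I eat apple): both nouns depend on the verb.
sentence = [
    Token("我", "PN", 1),
    Token("吃", "VV", -1),
    Token("苹果", "NN", 1),
]
reordered = reorder(sentence)  # verb moves to the end of the clause
```

Because the sketch consults only POS tags and head indices, an error in a dependency label could not affect it, whereas a wrong head index or a wrong POS tag would change the output order directly. This toy rule also moves every right dependent of a verb, including punctuation, so a usable rule set would need more exceptions than shown here.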
© 2014 The Association for Natural Language Processing