2020 Volume 27 Issue 2 Pages 281-298
In this paper, we propose a novel model for Transformer neural machine translation (NMT) that incorporates syntactic distances between pairs of source words into the relative position representations of the self-attention mechanism. Specifically, the proposed model encodes pair-wise relative depths on a source dependency tree, i.e., the differences between the depths of two source words, in the encoder’s self-attention. Experiments show that the proposed model outperformed non-syntactic Transformer NMT baselines on the Asian Scientific Paper Excerpt Corpus (ASPEC) Japanese-to-English and English-to-Japanese translation tasks. In particular, it achieved a 0.37-point BLEU gain on the Japanese-to-English task.
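To illustrate the idea described in the abstract, the sketch below shows one way pairwise dependency-tree depth differences could be injected into encoder self-attention as relative position representations, in the style of Shaw et al. (2018). This is a minimal illustration under assumed names (token_depths, depth_diff_matrix, SyntaxRelativeAttention, max_diff), not the authors' implementation or exact formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def token_depths(heads):
    """Depth of each token in a dependency tree, given head indices
    (heads[i] = index of i's parent, -1 for the root)."""
    depths = []
    for i in range(len(heads)):
        d, j = 0, i
        while heads[j] != -1:
            j = heads[j]
            d += 1
        depths.append(d)
    return depths


def depth_diff_matrix(heads, max_diff=4):
    """Pairwise depth differences depth(i) - depth(j), clipped to
    [-max_diff, max_diff] and shifted to non-negative embedding indices."""
    d = torch.tensor(token_depths(heads))
    diff = d.unsqueeze(1) - d.unsqueeze(0)  # (len, len)
    return diff.clamp(-max_diff, max_diff) + max_diff


class SyntaxRelativeAttention(nn.Module):
    """Single-head self-attention whose logits receive a learned bias
    indexed by the pairwise relative tree depth (a sketch only)."""

    def __init__(self, d_model, max_diff=4):
        super().__init__()
        self.q = nn.Linear(d_model, d_model)
        self.k = nn.Linear(d_model, d_model)
        self.v = nn.Linear(d_model, d_model)
        # One learned scalar bias per clipped relative depth value.
        self.rel_depth = nn.Embedding(2 * max_diff + 1, 1)
        self.scale = d_model ** -0.5

    def forward(self, x, depth_idx):
        # x: (len, d_model); depth_idx: (len, len) from depth_diff_matrix
        q, k, v = self.q(x), self.k(x), self.v(x)
        logits = q @ k.transpose(0, 1) * self.scale
        logits = logits + self.rel_depth(depth_idx).squeeze(-1)
        return F.softmax(logits, dim=-1) @ v


# Toy usage: four tokens whose heads all attach to the verb at index 1 (root).
heads = [1, -1, 3, 1]
x = torch.randn(4, 16)
attn = SyntaxRelativeAttention(d_model=16)
out = attn(x, depth_diff_matrix(heads))
print(out.shape)  # torch.Size([4, 16])
```

Clipping the depth differences keeps the embedding table small and lets rare, very deep attachments share a representation, mirroring the distance clipping commonly used for relative position representations.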