IEICE Transactions on Information and Systems
Online ISSN : 1745-1361
Print ISSN : 0916-8532
Regular Section
Exploiting Multi-Level Data Uncertainty for Japanese-Chinese Neural Machine Translation
Zezhong LI, Jianjun MA, Fuji REN

2025 Volume E108.D Issue 5 Pages 440-443

Abstract

The performance of Neural Machine Translation (NMT) depends heavily on the degree of data uncertainty present in the training examples. In terms of its causes, data uncertainty can be categorized into intrinsic and extrinsic uncertainty, both of which increase the learning difficulty in NMT and degrade translation performance. To cope with this challenge, we propose a simple yet effective method that estimates the data uncertainty and incorporates it into the adaptive training of NMT, mitigating the harm caused by uncertain data. Our method consists of two modules: 1) an mBERT-based model, trained with multi-task learning, that jointly estimates token-level and sentence-level data uncertainties; 2) heterogeneous ways of incorporating the derived multi-level uncertainties into NMT training through soft-masked embeddings and a weighted loss. Extensive experiments on Japanese↔Chinese translation show that our proposed methods substantially outperform strong baselines in terms of BLEU scores and verify the effectiveness of modeling data uncertainty in NMT.
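
The abstract names two incorporation strategies, soft-masked embeddings at the token level and a weighted loss at the sentence level, without giving their formulas. Below is a minimal PyTorch sketch of one plausible reading of these two ideas; the mixing formula, the weighting scheme, the function names, and all tensor shapes are illustrative assumptions, not the paper's exact equations.

# Hedged sketch: soft-masked embeddings and uncertainty-weighted loss.
# All formulas and names below are assumptions for illustration only.
import torch
import torch.nn.functional as F


def soft_masked_embeddings(token_emb, mask_emb, token_uncertainty):
    """Blend each token embedding toward a learned mask embedding in
    proportion to its estimated uncertainty (assumed to lie in [0, 1]).

    token_emb:          (batch, seq_len, d_model) source token embeddings
    mask_emb:           (d_model,) embedding of a special mask token
    token_uncertainty:  (batch, seq_len) token-level uncertainty scores
    """
    u = token_uncertainty.unsqueeze(-1)             # (batch, seq_len, 1)
    return (1.0 - u) * token_emb + u * mask_emb     # higher u -> closer to mask


def uncertainty_weighted_loss(logits, targets, sent_uncertainty, pad_id=0):
    """Cross-entropy loss in which each sentence pair is down-weighted by its
    sentence-level uncertainty (again assumed to lie in [0, 1]).

    logits:            (batch, seq_len, vocab) decoder outputs
    targets:           (batch, seq_len) reference token ids
    sent_uncertainty:  (batch,) sentence-level uncertainty scores
    """
    token_loss = F.cross_entropy(
        logits.transpose(1, 2), targets, ignore_index=pad_id, reduction="none"
    )                                               # (batch, seq_len)
    mask = (targets != pad_id).float()
    sent_loss = (token_loss * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1)
    weights = 1.0 - sent_uncertainty                # trust cleaner pairs more
    return (weights * sent_loss).mean()

In this reading, highly uncertain tokens contribute less of their own embedding to the encoder input, and highly uncertain sentence pairs contribute less to the training loss; the actual adaptive-training formulation used in the paper may differ.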

© 2025 The Institute of Electronics, Information and Communication Engineers