Detection of Untranslated Content in Neural Machine Translation

後藤 功雄; 田中 英輝

doi:10.5715/jnlp.25.577

Abstract

Despite its promise, neural machine translation (NMT) has a serious problem: source content may be mistakenly left untranslated. The ability to detect untranslated content is important for the practical use of NMT. We evaluated two types of probability for identifying untranslated content: the cumulative attention probability and the back-translation probability from the target sentence to the source sentence. Experiments were conducted on detecting missing content in Japanese-to-English patent translations. The results showed that each type of probability was effective, that back translation was more effective than attention, and that combining the two yielded further improvement. Furthermore, we confirmed that detecting untranslated content was effective for selecting sentences for human post-editing of machine translation output.
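
As a rough illustration of the cumulative attention idea mentioned in the abstract, the sketch below sums the attention each source token receives over all target steps and flags tokens whose total stays low. This is only a minimal sketch under stated assumptions, not the authors' implementation: the function names, the list-of-lists attention layout, and the threshold value of 0.3 are all hypothetical, and the abstract does not specify how the scores are thresholded or combined with the back-translation probability.

```python
def cumulative_attention(attention, source_len):
    """Sum attention weights over all target steps for each source position.

    attention[t][s] is assumed to be the attention weight placed on
    source token s when generating target token t (hypothetical layout).
    """
    totals = [0.0] * source_len
    for row in attention:
        if len(row) != source_len:
            raise ValueError("each attention row must cover every source token")
        for s, weight in enumerate(row):
            totals[s] += weight
    return totals


def flag_untranslated(source_tokens, attention, threshold=0.3):
    """Return source tokens whose cumulative attention is below the threshold.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    totals = cumulative_attention(attention, len(source_tokens))
    return [tok for tok, total in zip(source_tokens, totals) if total < threshold]


# Toy example: two target steps attending over three source tokens.
source_tokens = ["A", "B", "C"]
attention = [
    [0.70, 0.25, 0.05],
    [0.20, 0.75, 0.05],
]
flagged = flag_untranslated(source_tokens, attention)
```

In this toy input, the third source token receives little attention at either target step, so it is the one a detector of this kind would report as possibly untranslated.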

Licensed under CC BY 4.0
https://creativecommons.org/licenses/by/4.0/
