2018 Volume 25 Issue 5 Pages 577-597
Despite its promise, neural machine translation (NMT) has a serious problem: source content may be mistakenly left untranslated. The ability to detect untranslated content is important for the practical use of NMT. We evaluated two types of probability for identifying untranslated content: the cumulative attention probability and the back-translation probability of the source sentence given the target sentence. Experiments were conducted on detecting missing content in Japanese-to-English patent translations. The results showed that each type of probability was effective on its own, that back translation was more effective than attention, and that combining the two yielded further improvements. Furthermore, we confirmed that detecting untranslated content was effective for selecting sentences for human post-editing of machine translation output.
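The cumulative attention idea can be illustrated with a minimal sketch. It assumes that the cumulative attention for a source token is the sum of the attention weights it receives over all target steps, and that a token whose total falls below a threshold is a candidate for untranslated content. The function names, the threshold value, and the toy attention matrix are hypothetical choices made for illustration; the abstract does not specify the exact formulation used in the paper.

```python
def cumulative_attention(attention, n_src):
    """Sum attention weights per source token over all target steps.

    attention[t][s] is the weight placed on source token s while
    generating target token t (an assumed layout, not the paper's).
    """
    totals = [0.0] * n_src
    for row in attention:
        if len(row) != n_src:
            raise ValueError("each attention row must have n_src weights")
        for s, weight in enumerate(row):
            totals[s] += weight
    return totals


def flag_untranslated(source_tokens, attention, threshold=0.5):
    """Return source tokens whose cumulative attention is below threshold.

    The threshold of 0.5 is an arbitrary illustrative value.
    """
    totals = cumulative_attention(attention, len(source_tokens))
    return [tok for tok, total in zip(source_tokens, totals) if total < threshold]


# Toy example: three source tokens, two target steps (hypothetical values).
source_tokens = ["A", "B", "C"]
attention = [
    [0.90, 0.05, 0.05],
    [0.80, 0.10, 0.10],
]
flagged = flag_untranslated(source_tokens, attention)
```

In this toy case the first token receives almost all of the attention mass, so the other two tokens are flagged. A back-translation score would need a trained target-to-source model and is therefore not sketched here.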