IEICE Transactions on Information and Systems
Online ISSN : 1745-1361
Print ISSN : 0916-8532
Interpreting Attention Mechanisms of NMT with Linguistic Features
Guanghui CAI, Junguo ZHU
Advance online publication (journal, free access)

Article ID: 2024EDP7292

Abstract

Deep learning has transformed Neural Machine Translation (NMT), but the complexity of these models makes them hard to interpret, limiting further improvements in translation quality. This study examines the widely used Transformer model, drawing on linguistic features to clarify its inner workings. By incorporating three linguistic features (part-of-speech tags, dependency relations, and syntax trees), we show how the model's attention mechanism interacts with each feature during translation. We further improve translation quality by masking attention nodes identified as having negative effects. Our approach connects the opaque internals of NMT with explicit linguistic knowledge, offering a more intuitive account of the model's translation process.
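The abstract describes two steps: aligning attention weights with linguistic annotations, then masking attention nodes found to be harmful. The following is a minimal Python/PyTorch sketch of that workflow, not the authors' code; the toy attention tensor, the POS tags, and the "bad head" index are hypothetical placeholders standing in for values that would come from a trained NMT model and an external parser.

import torch

torch.manual_seed(0)

# Toy setup: 4 attention heads, a 6-token source sentence, 5 target positions.
num_heads, tgt_len, src_len = 4, 5, 6
pos_tags = ["DET", "NOUN", "VERB", "DET", "ADJ", "NOUN"]  # hypothetical tags

# Stand-in for cross-attention weights taken from a trained NMT model:
# shape (num_heads, tgt_len, src_len), each row summing to 1.
attn = torch.softmax(torch.randn(num_heads, tgt_len, src_len), dim=-1)

# Step 1 (interpretation): average attention mass each head assigns to each POS tag.
for tag in sorted(set(pos_tags)):
    cols = [i for i, t in enumerate(pos_tags) if t == tag]
    per_head = attn[:, :, cols].mean(dim=(1, 2))  # one score per head
    print(f"{tag:5s} mean attention per head:", [round(v, 3) for v in per_head.tolist()])

# Step 2 (intervention): zero out a head judged harmful, mimicking the
# masking of negative-effect nodes described in the abstract.
bad_head = 2  # hypothetical index of a harmful head
head_mask = torch.ones(num_heads, 1, 1)
head_mask[bad_head] = 0.0
masked_attn = attn * head_mask  # the masked head now contributes nothing

In a real pipeline, the tags would presumably come from a parser such as spaCy or Stanza, dependency relations and syntax trees would be aligned to source tokens the same way, and harmful nodes would be identified by ablating them one at a time and measuring translation quality on held-out data.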

© 2025 The Institute of Electronics, Information and Communication Engineers