Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
General Paper (Peer-Reviewed)
Neural Machine Translation with Synchronous Latent Phrase Structure
Shintaro Harada, Taro Watanabe

2022 Volume 29 Issue 2 Pages 587-610

Abstract

It has been reported that grammatical information is useful for machine translation (MT). However, annotating grammatical information incurs significant human cost. Moreover, adapting grammatical information to MT is not trivial: grammatical annotation usually follows tokenization standards that may not capture the relation between the two languages, while subword tokenization such as byte-pair encoding, which is used to alleviate out-of-vocabulary problems, may not be compatible with those annotations. In this work, we introduce two methods to incorporate grammatical information without explicit annotation supervision: first, a latent phrase structure is induced in an unsupervised fashion from the attention mechanism; second, the latent phrase structures induced in the encoder and decoder are synchronized through constraints during training so that they remain compatible with each other. We demonstrate that our approach performs better on two tasks, translation and word alignment, without extra resources. An analysis of the induced phrase and alignment structures shows that the synchronization constraint improves alignment precision.
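The abstract does not give the exact formulation, so the following Python (PyTorch) sketch is only a rough, hypothetical illustration of the general idea of a synchronization-style penalty: encoder-side and decoder-side phrase-boundary signals, here derived heuristically from attention weights, are encouraged to agree during training. The function names (phrase_boundary_probs, synchronization_penalty), the cosine-similarity boundary proxy, and the weight lambda_sync are assumptions made for illustration and are not taken from the paper.

import torch
import torch.nn.functional as F

def phrase_boundary_probs(attn_weights):
    # attn_weights: (batch, seq_len, context_len) attention distributions.
    # Toy proxy for a latent phrase structure: a drop in attention
    # similarity between adjacent positions is read as a soft phrase
    # boundary. Illustrative only; not the paper's induction method.
    sim = F.cosine_similarity(attn_weights[:, 1:, :],
                              attn_weights[:, :-1, :], dim=-1)
    return torch.sigmoid(-sim)  # low similarity -> likely boundary

def synchronization_penalty(enc_attn, dec_attn):
    # Encourage encoder- and decoder-side boundary distributions to agree,
    # loosely mirroring a synchronization constraint between the two sides.
    p_enc = phrase_boundary_probs(enc_attn)
    p_dec = phrase_boundary_probs(dec_attn)
    # Pool both sides to a common length so they are comparable in this sketch.
    k = min(p_enc.size(1), p_dec.size(1))
    p_enc = F.adaptive_avg_pool1d(p_enc.unsqueeze(1), k).squeeze(1)
    p_dec = F.adaptive_avg_pool1d(p_dec.unsqueeze(1), k).squeeze(1)
    return F.mse_loss(p_enc, p_dec)

# Hypothetical usage during training:
#   total_loss = translation_loss + lambda_sync * synchronization_penalty(enc_attn, dec_attn)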

© 2022 The Association for Natural Language Processing