Journal of Natural Language Processing (自然言語処理)
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
General Paper (Peer-Reviewed)
Subset Retrieval Nearest Neighbor Machine Translation
Hiroyuki Deguchi, Taro Watanabe, Yusuke Matsui, Masao Utiyama, Hideki Tanaka, Eiichiro Sumita
Journal, free access

2024, Volume 31, Issue 2, pp. 374-406

Abstract

k-nearest-neighbor machine translation (kNN-MT) (Khandelwal et al. 2021) boosts the translation quality of trained neural machine translation (NMT) models by incorporating an example search into the decoding algorithm. However, decoding is very slow, roughly 100 to 1,000 times slower than standard NMT, because neighbor tokens are retrieved from all the target tokens of the parallel data at each timestep. In this paper, we propose “Subset kNN-MT”, which improves the decoding speed of kNN-MT using two methods: (1) retrieving neighbor target tokens from a subset consisting of the neighbor sentences of the input sentence, rather than from all sentences, and (2) an efficient distance computation technique, suited to subset neighbor search, that uses a look-up table. Subset kNN-MT achieved a speed-up of up to 134.2 times and a BLEU score improvement of up to 1.6 over kNN-MT on the WMT’19 De-En translation task, domain adaptation tasks in De-En and En-Ja translation, and the Flores101 multilingual translation task.
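The two methods named in the abstract can be illustrated with a minimal, pure-Python sketch. It is not the authors' implementation. It assumes that the look-up table is the kind used in product quantization (each stored key is a tuple of centroid codes, and per-query distances to the centroids are precomputed once), which the abstract does not state. All function names, the toy codebooks, and the toy datastore are hypothetical. The final step uses the standard kNN-MT weighting, in which each retrieved token receives weight exp(-distance / temperature).

```python
import math
from collections import defaultdict


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_sentences(src_vec, sentence_vecs, n):
    """Method (1): indices of the n stored sentences closest to the input."""
    ranked = sorted(range(len(sentence_vecs)),
                    key=lambda i: sq_dist(src_vec, sentence_vecs[i]))
    return ranked[:n]


def build_lookup_table(query, codebooks):
    """Method (2): table[m][c] is the squared distance between the m-th
    sub-vector of the query and centroid c of sub-space m.
    Product quantization is an assumption of this sketch."""
    sub_dim = len(query) // len(codebooks)
    return [
        [sq_dist(query[m * sub_dim:(m + 1) * sub_dim], centroid)
         for centroid in book]
        for m, book in enumerate(codebooks)
    ]


def subset_knn(query, entries, codebooks, k):
    """Score only the subset entries by summing table look-ups; keep the k nearest."""
    table = build_lookup_table(query, codebooks)
    scored = [(sum(table[m][c] for m, c in enumerate(codes)), token)
              for codes, token in entries]
    return sorted(scored, key=lambda pair: pair[0])[:k]


def knn_probs(neighbors, temperature=1.0):
    """Turn (distance, token) neighbors into a normalized token distribution."""
    weights = defaultdict(float)
    for dist, token in neighbors:
        weights[token] += math.exp(-dist / temperature)
    total = sum(weights.values())
    return {token: w / total for token, w in weights.items()}


# Hypothetical toy data: 2 sub-spaces with 2 one-dimensional centroids each.
codebooks = [[[0.0], [1.0]], [[0.0], [1.0]]]
sentence_vecs = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
# Sentence index -> list of (quantized key codes, target token).
datastore = {
    0: [((0, 0), "the"), ((1, 0), "cat")],
    1: [((0, 1), "a"), ((1, 1), "dog")],
    2: [((1, 1), "fish")],
}

subset_ids = nearest_sentences([0.2, 0.1], sentence_vecs, n=2)
entries = [entry for i in subset_ids for entry in datastore[i]]
neighbors = subset_knn([0.9, 0.2], entries, codebooks, k=2)
probs = knn_probs(neighbors)
print(subset_ids, neighbors, probs)
```

In this toy run, sentence 2 falls outside the sentence-level subset, so its token is never scored at all. Each remaining entry costs only one table look-up per sub-space rather than a full vector distance, which is the intuition behind the reported speed-up.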

© 2024 The Association for Natural Language Processing