We propose a novel pipeline method for translating signed Japanese sentences into written Japanese. Sign languages often omit function words such as particles, and most words are not morphologically inflected as they are in spoken languages. Our method explicitly contrasts the two languages and divides the translation process into two tasks: first, it translates glosses into lemmatized Japanese words or phrases; second, it supplies the missing particles and conjugates predicates such as verbs, auxiliary verbs, and adjectives. The method is especially effective when the parallel corpus is very small and costly to obtain but monolingual corpora for the target language are plentiful. Specifically, it first uses phrase-based statistical machine translation (PBSMT) to map sign glosses to the corresponding Japanese words or phrases, and then employs a transformer-based neural machine translation (NMT) model, trained on a monolingual corpus, to refine the output of the first stage. Experimental results show that our pipeline method outperforms both direct PBSMT and competitive NMT models with data augmentation, including back-translation and transfer learning, in a low-resource setting with a corpus size on the order of 10^4 words.