One of the problems in spoken language translation is the enormous variety of expressions that are not found in text translation. This variety can lead to sparse translation coverage. To tackle this problem, we propose a machine translation model in which an input is translated through both source-language and target-language paraphrasing processes. In this paper, we discuss the source-language paraphrasing process, the language transfer process, and the design of our translation model. In source-language paraphrasing, we take the practical approach of untangling slight variations in the source language before transferring a source expression to its target. We discuss how effectively our paraphrasing process reduces the variety found in spoken language, focusing on how many source-language patterns are eliminated by paraphrasing. In the translation model, we propose an interaction model between the source-language paraphraser and the transfer process, in contrast to the conventional assembly-line process flow. Our evaluation shows that over 70% of input utterances are expected to be altered in some way by paraphrasing. As a result, one-fifth of all skeleton expressions can be merged into other skeletons, which increases the chance of obtaining a correct translation. Furthermore, we observe that our interaction model with the paraphraser increases translation capability by 20-40 percentage points, regardless of the size of the transfer knowledge.