Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
A Method for Retrieving a Similar Sentence and Its Application to Speech Translation
MITSUO SHIMOHATA, EIICHIRO SUMITA, YUJI MATSUMOTO

2004 Volume 11 Issue 4 Pages 105-126

Abstract
When spoken-language input sentences are fed to a machine translation system, we sometimes cannot get proper translations because of the characteristics of spoken language. In this paper, we propose a method for recovering proper translations by combining similar sentence retrieval with machine translation when it is difficult to get a proper translation of the input sentence. If a given input sentence is found to be difficult to translate properly, a sentence similar to the input sentence is retrieved from a corpus of translatable sentences. The similarity between a candidate and the input sentence is determined from the ratio of N-gram overlap. We also apply two additional conditions to improve retrieval performance: excluding candidate sentences that contain a content word absent from the input sentence, and decreasing the weight of function words. In a retrieval experiment in Japanese, our method output retrieved sentences for 87.2% of all input sentences, and 60.4% of those retrieved sentences were similar to the input. In an experiment combining our method with machine translation, in which untranslatable input sentences were replaced with similar sentences from a translatable corpus, our method recovered proper translations for 25.9% of the untranslatable input sentences.
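The retrieval step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the exact similarity formula, so the normalization by the candidate's N-grams, the maximum N of 2, the weight of 0.5, the threshold, and the English toy function-word list are all assumptions chosen for illustration (the paper's experiments are in Japanese). All function names here are hypothetical.

```python
from collections import Counter

# Assumption: a toy English function-word list. The paper's actual
# (Japanese) list is not given in the abstract.
FUNCTION_WORDS = {"i", "would", "like", "to", "a", "the", "well", "please"}


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_weight(gram, function_weight):
    """Down-weight n-grams that consist only of function words (assumed scheme)."""
    if all(tok in FUNCTION_WORDS for tok in gram):
        return function_weight
    return 1.0


def similarity(input_tokens, cand_tokens, max_n=2, function_weight=0.5):
    """Weighted share of the candidate's n-grams that also occur in the input."""
    overlap = 0.0
    total = 0.0
    for n in range(1, max_n + 1):
        in_counts = Counter(ngrams(input_tokens, n))
        cand_counts = Counter(ngrams(cand_tokens, n))
        for gram, count in cand_counts.items():
            w = ngram_weight(gram, function_weight)
            total += w * count
            overlap += w * min(count, in_counts[gram])
    return overlap / total if total else 0.0


def has_extra_content_word(input_tokens, cand_tokens):
    """True if the candidate contains a content word absent from the input."""
    input_set = set(input_tokens)
    return any(
        tok not in FUNCTION_WORDS and tok not in input_set
        for tok in cand_tokens
    )


def retrieve(input_tokens, corpus, threshold=0.5):
    """Return the most similar admissible candidate, or None if none qualifies."""
    best, best_score = None, None
    for cand in corpus:
        if has_extra_content_word(input_tokens, cand):
            continue  # first condition: no content word missing from the input
        score = similarity(input_tokens, cand)
        if score >= threshold and (best is None or score > best_score):
            best, best_score = cand, score
    return best


corpus = [
    "i would like to reserve a room".split(),
    "i would like to reserve a table".split(),
    "please reserve a room".split(),
]
query = "well i would like to reserve a room".split()
print(retrieve(query, corpus))
```

In this toy run, the candidate containing "table" is excluded by the content-word condition, and the remaining candidates are ranked by weighted N-gram overlap. Returning `None` when no candidate passes the threshold corresponds to the case where the method outputs no retrieved sentence.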
© The Association for Natural Language Processing