Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
Compositional Translation Estimation of Technical Terms Using a Domain/Topic-Specific Corpus Collected from the Web
MASATSUGU TONOIKE, TAKEHITO UTSURO, SATOSHI SATO
JOURNAL FREE ACCESS

2007 Volume 14 Issue 2 Pages 33-68

Abstract
This paper studies how to compile a bilingual lexicon of technical terms using the Web. When estimating bilingual correspondences for technical terms, it is usually difficult to find an existing corpus for the domain to which the terms belong. We therefore collect a corpus for that domain from the Web. To estimate translations, we employ a compositional translation estimation technique, in which translation candidates for a term are generated by concatenating the translations of the term's constituents. The generated candidates are then validated against the domain/topic-specific corpus collected from the Web. We further compare the proposed approach quantitatively with an alternative approach that validates translation candidates directly through a search engine. We show that the domain/topic-specific corpus collected from the Web contributes to higher precision in translation candidate validation.
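The two steps described in the abstract, compositional candidate generation followed by corpus-based validation, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the toy lexicon, the example term, and the simple substring-count validation are all assumptions made for illustration, since the abstract does not specify how candidates are scored.

```python
from itertools import product


def generate_candidates(constituents, lexicon):
    """Concatenate every combination of constituent translations.

    `lexicon` is a hypothetical dict mapping a source-language constituent
    to a list of target-language translations. If any constituent has no
    entry, no candidate is generated.
    """
    options = [lexicon.get(c, []) for c in constituents]
    return [" ".join(combo) for combo in product(*options)]


def validate(candidates, corpus_text):
    """Keep candidates attested in the corpus, most frequent first.

    Case-insensitive substring counting is a simplification assumed here;
    a real system would tokenize and score more carefully.
    """
    text = corpus_text.lower()
    counts = {c: text.count(c.lower()) for c in candidates}
    attested = [c for c in candidates if counts[c] > 0]
    return sorted(attested, key=lambda c: -counts[c])


# Toy data (illustrative only, not taken from the paper).
toy_lexicon = {
    "応用": ["applied", "application"],
    "行動": ["behavior", "action"],
    "分析": ["analysis"],
}
toy_corpus = (
    "Applied behavior analysis studies how behavior changes. "
    "Courses in applied behavior analysis cover learning principles."
)

candidates = generate_candidates(["応用", "行動", "分析"], toy_lexicon)
print(candidates)
print(validate(candidates, toy_corpus))
```

In this sketch the generation step deliberately over-produces (four candidates for a three-constituent term), and the domain corpus is what filters out the combinations that are never actually used.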
© The Association for Natural Language Processing