This paper studies how to compile a bilingual lexicon of technical terms using the Web. When estimating bilingual correspondences for technical terms, an existing corpus for the relevant domain is usually hard to find, so we instead collect a domain-specific corpus from the Web. To estimate translations, we employ compositional translation estimation: translation candidates for a term are generated by concatenating the translations of its constituents. The generated candidates are then validated against the domain/topic-specific corpus collected from the Web. We further compare this approach quantitatively with an alternative that validates translation candidates directly through a search engine. We show that the domain/topic-specific corpus collected from the Web contributes to higher precision in translation candidate validation.
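The generate-then-validate idea described above can be illustrated with a minimal sketch. It is not the paper's implementation. The toy dictionary, the romanized target-language strings, the function names `generate_candidates` and `validate`, and the use of raw substring counts as the validation score are all assumptions made for illustration.

```python
from itertools import product

# Hypothetical toy dictionary: source constituent -> possible translations.
TOY_DICT = {
    "neural": ["shinkei", "nyuuraru"],
    "network": ["mou", "nettowaaku"],
}

# Hypothetical stand-in for a domain-specific corpus collected from the Web.
TOY_CORPUS = (
    "nyuuraru nettowaaku no gakushuu . "
    "nyuuraru nettowaaku wa shinkei kairo mou tomo iu ."
)


def generate_candidates(term, dictionary, joiner=" "):
    """Build candidates by concatenating translations of each constituent."""
    constituents = term.split()
    options = [dictionary.get(c, []) for c in constituents]
    # If any constituent has no known translation, nothing can be composed.
    if any(not opts for opts in options):
        return []
    return [joiner.join(combo) for combo in product(*options)]


def validate(candidates, corpus_text):
    """Keep candidates that occur in the corpus, most frequent first."""
    counts = {c: corpus_text.count(c) for c in candidates}
    attested = [c for c in candidates if counts[c] > 0]
    return sorted(attested, key=lambda c: -counts[c])


candidates = generate_candidates("neural network", TOY_DICT)
validated = validate(candidates, TOY_CORPUS)
```

In this sketch, candidates that never occur in the corpus are discarded and the rest are ranked by frequency. Validating through a search engine instead would amount to replacing the corpus lookup in `validate` with hit counts from a query.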