2005 Volume 12 Issue 5 Pages 91-110
This paper proposes a method of extracting a bilingual pair consisting of a syntactically ambiguous named entity and its counterpart from a sentence-aligned English-Japanese parallel corpus. The method computes the semantic similarity and the phonetic similarity between an English named entity and its translation candidate, and calculates the overall score of the pair as the weighted sum of the two scores. In doing so, it avoids extracting English named entities with a wrong prepositional phrase attachment and/or a wrong scope of coordination. In an experiment using a parallel corpus of Yomiuri Shimbun and The Daily Yomiuri, the proposed method achieved an F-value of 0.678, surpassing the 0.583 obtained by a baseline method.
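The weighted-sum scoring described above can be illustrated with a minimal sketch. The abstract gives neither the weights nor the way the two similarities are computed, so everything here is an assumption for illustration: the names `overall_score` and `best_candidate`, the `weight` parameter, the restriction to weights that sum to 1, and the similarity values in the usage note are all hypothetical.

```python
from typing import Dict, Tuple


def overall_score(semantic: float, phonetic: float, weight: float = 0.5) -> float:
    """Combine two similarity scores as a weighted sum.

    `weight` is a hypothetical parameter applied to the semantic score;
    the phonetic score receives (1 - weight). The abstract does not
    state the actual weights.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * semantic + (1.0 - weight) * phonetic


def best_candidate(
    candidates: Dict[str, Tuple[float, float]], weight: float = 0.5
) -> str:
    """Return the candidate with the highest overall score.

    `candidates` maps each candidate English named entity to an assumed
    (semantic, phonetic) similarity pair against the Japanese counterpart.
    """
    if not candidates:
        raise ValueError("no candidates given")
    return max(candidates, key=lambda c: overall_score(*candidates[c], weight))
```

For example, given two hypothetical readings of an ambiguous phrase, `best_candidate({"Bank of Japan": (0.9, 0.7), "Bank": (0.5, 0.2)})` selects the reading whose combined score is larger. Choosing among competing spans by score is one way such a method could reject a candidate with a wrong attachment or coordination scope.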