Abstract
In this paper, we propose a hierarchical phrase alignment method that aims to acquire translation knowledge. Previous methods determine the parsing result first and then use the correspondence of sub-trees between the bilingual parse trees. The method described in this paper instead combines partial tree candidates and selects the best sequence of partial trees. A structural similarity measure (called a `phrase score') is used for evaluation. A forward DP backward A* search algorithm is applied to combine the partial trees. Using this method, about twice as many equivalent phrases were extracted in experiments, and almost no deterioration was observed.
This method employs word alignment. The accuracy of phrase alignment increases when word correspondences are considered not only between content words but also between function words. In addition, we found that a word alignment method with a high recall rate is suitable for this method.