The authors propose a model for analyzing English sentences that include coordinate conjunctions such as “and”, “or”, “but”, and equivalent words. The syntactic analysis of English coordinate sentences is one of the most difficult problems in machine translation (MT) systems. The problem is to select, from all possible candidates, the correct syntactic structure formed by an individual coordinate conjunction, i.e., to determine which constituents the conjunction coordinates. Typically, so many candidate structures are produced that MT systems cannot select the correct one, even when the grammar formalism allows the rules to be written in simple notation. This paper presents an English coordinate structure analysis model that provides top-down scope information about the correct syntactic structure by taking advantage of the symmetric patterns of parallelism. The model is based on a balance-matching operation over two lists of feature sets. It has four effects: a reduction in analysis costs, a reduced need for word disambiguation, the interpretation of ellipses, and robust analysis. The model was implemented and incorporated into an English-Japanese MT system, where it achieved about 70% accuracy on 3,215 Wall Street Journal sentences.