Abstract
Although large-scale grammars are a prerequisite for parsing a great variety of sentences, they are difficult to build by hand. It is possible, however, to derive a context-free grammar (CFG) automatically from an existing large-scale, syntactically annotated corpus. Although this seems a simple task, CFGs derived in this fashion have seldom been applied to existing systems. This is probably due to the great number of possible parse results they produce (i.e., high ambiguity). In this paper, we analyze some causes of high ambiguity and propose a policy for building a large-scale Japanese CFG for syntactic parsing that is capable of decreasing ambiguity. We also provide an experimental evaluation of the obtained CFG, showing a reduction in the number of parse results (reduced ambiguity) and improved parsing accuracy.