Development of a Large-Scale Japanese Grammar

野呂 智哉; 橋本 泰一; 徳永 健伸; 田中 穂積

doi:10.5715/jnlp.12.3

Abstract

Although large-scale grammars are a prerequisite for parsing a wide variety of sentences, they are difficult to build by hand. It is possible, however, to derive a context-free grammar (CFG) automatically from an existing large-scale, syntactically annotated corpus. Although this seems a simple task, CFGs derived in this way have seldom been applied to existing systems, probably because they produce a very large number of possible parse results (i.e., high ambiguity). In this paper, we analyze some causes of this high ambiguity and propose a policy for building a large-scale Japanese CFG for syntactic parsing that decreases ambiguity. We also provide an experimental evaluation of the resulting CFG, showing a reduction in the number of parse results (reduced ambiguity) and improved parsing accuracy.
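To make the two ideas in the abstract concrete, the following is a minimal Python sketch, not the authors' method or data. It reads CFG rules off a tiny hand-made treebank and counts how many parses the resulting grammar assigns to a sentence. The toy trees, the English words (used only for readability), the labels, and the function names `extract_rules` and `count_parses` are all assumptions introduced for illustration. The counter also assumes the grammar has no empty rules and no unary cycles.

```python
from collections import Counter
from functools import lru_cache

# Hypothetical toy treebank: each node is (label, child, ...); leaves are words.
# The two trees annotate the same word string with different PP attachments.
PP = ("PP", ("P", "with"), ("NP", "telescopes"))
TREEBANK = [
    ("S", ("NP", "she"), ("VP", ("V", "saw"), ("NP", ("NP", "stars"), PP))),
    ("S", ("NP", "she"), ("VP", ("VP", ("V", "saw"), ("NP", "stars")), PP)),
]


def extract_rules(tree, rules):
    """Record one rule (lhs, rhs) per internal node, with its frequency."""
    label, *children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    rules[(label, rhs)] += 1
    for child in children:
        if not isinstance(child, str):
            extract_rules(child, rules)
    return rules


def count_parses(rules, words, start="S"):
    """Count distinct parse trees for `words` rooted in `start`.

    Assumes no empty right-hand sides and no unary cycles (A -> ... -> A).
    """
    words = tuple(words)
    by_lhs = {}
    for lhs, rhs in rules:
        by_lhs.setdefault(lhs, []).append(rhs)

    @lru_cache(maxsize=None)
    def count_symbol(sym, i, j):
        # A symbol with no rule of its own is treated as a word.
        if sym not in by_lhs:
            return 1 if j == i + 1 and words[i] == sym else 0
        return sum(count_seq(rhs, i, j) for rhs in by_lhs[sym])

    @lru_cache(maxsize=None)
    def count_seq(rhs, i, j):
        if len(rhs) == 1:
            return count_symbol(rhs[0], i, j)
        # The first symbol covers words[i:k]; the rest cover words[k:j],
        # leaving at least one word for each remaining symbol.
        return sum(
            count_symbol(rhs[0], i, k) * count_seq(rhs[1:], k, j)
            for k in range(i + 1, j - len(rhs) + 2)
        )

    return count_symbol(start, 0, len(words))


grammar = Counter()
for tree in TREEBANK:
    extract_rules(tree, grammar)

short = "she saw stars with telescopes".split()
longer = "she saw stars with telescopes with telescopes".split()
print(len(grammar))                   # 10 distinct rules
print(count_parses(grammar, short))   # 2
print(count_parses(grammar, longer))  # 5
```

Even in this toy setting, adding one more prepositional phrase raises the parse count from 2 to 5, which hints at why a grammar read directly off a large corpus can be highly ambiguous.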
