IEICE Transactions on Information and Systems
Online ISSN : 1745-1361
Print ISSN : 0916-8532
Special Section on Info-Plosion
The Effect of Corpus Size on Case Frame Acquisition for Predicate-Argument Structure Analysis
Ryohei SASANODaisuke KAWAHARASadao KUROHASHI
Author information
JOURNAL FREE ACCESS

2010 Volume E93.D Issue 6 Pages 1361-1368

Details
Abstract

This paper reports the effect of corpus size on case frame acquisition for predicate-argument structure analysis in Japanese. For this study, we collect a Japanese corpus consisting of up to 100 billion words, and construct case frames from corpora of six different sizes. Then, we apply these case frames to syntactic and case structure analysis, and zero anaphora resolution, in order to investigate the relationship between the corpus size for case frame acquisition and the performance of predicate-argument structure analysis. We obtained better analyses by using case frames constructed from larger corpora; the performance was not saturated even with a corpus size of 100 billion words.

Content from these authors
© 2010 The Institute of Electronics, Information and Communication Engineers
Previous article Next article
feedback
Top