Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
Development and Evaluation of Japanese Clause Boundaries Annotation Program
TAKEHIKO MARUYAMA, HIDEKI KASHIOKA, TADASHI KUMANO, HIDEKI TANAKA

2004 Volume 11 Issue 3 Pages 39-68

Abstract

Sentences in monologues tend to be long and complicated, which causes problems for parsing and translation. It is therefore desirable to define a shorter unit for processing monologues efficiently. We developed “CBAP” (Clause Boundaries Annotation Program), which detects and labels every clause boundary in Japanese text. CBAP accepts a series of morphemes with part-of-speech information and detects the final boundary of every clause with more than 97% accuracy. It also inserts labels, of 147 kinds, that represent the types of the boundaries. Since clauses are syntactically and semantically sufficient constituents, the annotated labels can be used for effective and flexible sentence segmentation. In this paper, we describe the method for annotating Japanese clause boundaries and present the results of experiments that examine the performance of CBAP.
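
To make the input and output described in the abstract concrete, the following is a minimal sketch of boundary labeling over a morpheme sequence with part-of-speech tags. It is not CBAP's implementation: the rule table, the part-of-speech tag names, the label names, and the function `annotate` are all assumptions made for illustration, and the three toy rules stand in for the 147 label types only in spirit.

```python
from typing import Dict, List, Tuple

# A morpheme is a (surface form, part-of-speech tag) pair.
# The tag names used here are hypothetical, not CBAP's tagset.
Morpheme = Tuple[str, str]

# Hypothetical rule table: a clause-final morpheme mapped to a boundary label.
# These label names are illustrative only.
RULES: Dict[Morpheme, str] = {
    ("ので", "conjunctive-particle"): "REASON-node",
    ("が", "conjunctive-particle"): "PARALLEL-ga",
    ("。", "period"): "SENTENCE-END",
}


def annotate(morphemes: List[Morpheme]) -> List[str]:
    """Return surface forms with a /LABEL/ token after each matched boundary."""
    out: List[str] = []
    for surface, pos in morphemes:
        out.append(surface)
        label = RULES.get((surface, pos))
        if label is not None:
            out.append(f"/{label}/")
    return out


# "Because it rained, the match was cancelled."
sample: List[Morpheme] = [
    ("雨", "noun"), ("が", "case-particle"), ("降っ", "verb"),
    ("た", "auxiliary"), ("ので", "conjunctive-particle"), ("、", "comma"),
    ("試合", "noun"), ("は", "topic-particle"), ("中止", "noun"),
    ("に", "case-particle"), ("なっ", "verb"), ("た", "auxiliary"),
    ("。", "period"),
]
print(" ".join(annotate(sample)))
```

In this sketch the part-of-speech tag is what separates the case particle が, which does not close a clause, from the conjunctive particle が, which does. That is one reason such a program would take morphemes with part-of-speech information rather than raw text.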

© The Association for Natural Language Processing