Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
Transformation into Meaningful Chunks by Dividing or Connecting Utterance Units
TOSHIYUKI TAKEZAWA, TSUYOSHI MORIMOTO
JOURNAL FREE ACCESS

1999 Volume 6 Issue 2 Pages 83-95

Abstract
The utterance units that serve as input to speech translation or spoken dialogue systems handling spontaneous speech are not always sentences, yet the processing units of language translation are sentences. Because the notion of a sentence in spoken language is not yet well understood, we use the term “meaningful chunks” instead of sentences. First, using conventionally interpreted dialogue data, we show that utterance units sometimes need to be divided into several meaningful chunks and sometimes need to be connected to form a single meaningful chunk. Next, we propose a method for transforming utterance units into meaningful chunks based on pause information and N-grams of fine-grained part-of-speech subcategories. Our experiments confirm that the method yields good results.
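The abstract gives no algorithmic detail, so the following is only an illustrative sketch of how pause information and a part-of-speech bigram could be combined to place chunk boundaries; it should not be read as the authors' actual method. Everything here is an assumption made for illustration: the function names `estimate_boundary_probs` and `to_chunks`, the toy tag names, the relative-frequency estimate, the rule that a boundary needs both a long pause and a high boundary probability, and the threshold values.

```python
from typing import Dict, List, Tuple

# (surface form, POS subcategory, pause after the token in seconds); hypothetical format
Token = Tuple[str, str, float]
Bigram = Tuple[str, str]


def estimate_boundary_probs(training_chunks: List[List[str]]) -> Dict[Bigram, float]:
    """Relative-frequency estimate of P(boundary | left POS, right POS).

    Input is a sequence of hand-chunked POS sequences (assumed training data).
    """
    boundary: Dict[Bigram, int] = {}
    total: Dict[Bigram, int] = {}
    for i, chunk in enumerate(training_chunks):
        # Adjacent tags inside a chunk count as "no boundary".
        for pair in zip(chunk, chunk[1:]):
            total[pair] = total.get(pair, 0) + 1
        # The last tag of a chunk and the first tag of the next count as "boundary".
        if i + 1 < len(training_chunks) and chunk and training_chunks[i + 1]:
            pair = (chunk[-1], training_chunks[i + 1][0])
            total[pair] = total.get(pair, 0) + 1
            boundary[pair] = boundary.get(pair, 0) + 1
    return {pair: boundary.get(pair, 0) / n for pair, n in total.items()}


def to_chunks(
    utterance_units: List[List[Token]],
    boundary_probs: Dict[Bigram, float],
    pause_threshold: float = 0.3,  # assumed value
    prob_threshold: float = 0.5,   # assumed value
) -> List[List[str]]:
    """Re-segment utterance units, ignoring the original unit boundaries."""
    tokens = [tok for unit in utterance_units for tok in unit]
    chunks: List[List[str]] = []
    current: List[str] = []
    for i, (surface, pos, pause) in enumerate(tokens):
        current.append(surface)
        if i + 1 == len(tokens):
            break
        prob = boundary_probs.get((pos, tokens[i + 1][1]), 0.0)
        # Assumed rule: a boundary needs both a long pause and a likely POS bigram.
        if pause >= pause_threshold and prob >= prob_threshold:
            chunks.append(current)
            current = []
    if current:
        chunks.append(current)
    return chunks


# Toy training data and input; tags and words are invented placeholders.
training = [
    ["interjection"],
    ["noun", "particle", "verb-final"],
    ["interjection"],
    ["noun", "particle", "verb-final"],
    ["noun", "particle", "verb-final"],
]
probs = estimate_boundary_probs(training)

units = [
    [("yes", "interjection", 0.4), ("room", "noun", 0.0), ("wo", "particle", 0.5)],
    [("reserve", "verb-final", 0.6)],
]
result = to_chunks(units, probs)
print(result)
```

In this toy input the first utterance unit is divided after "yes", while its remainder is connected to the second unit even though a pause separates them, because a particle followed by a verb was never a boundary in the toy training data.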
© The Association for Natural Language Processing