Abstract
The utterance units that serve as input to speech translation or spoken dialogue systems handling spontaneous speech are not always sentences, yet the processing units of language translation are sentences. Because too little is known about what constitutes a sentence in spoken language, we use the term “meaningful chunks” instead of sentences. First, using conventionally interpreted dialogue data, we show that an utterance unit sometimes needs to be divided into several meaningful chunks, and that several utterance units sometimes need to be connected to form a single meaningful chunk. Next, we propose a method for transforming utterance units into meaningful chunks based on pause information and N-grams of fine-grained part-of-speech subcategories. Experiments confirm that the method yields good results.