Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
Automatic Transformation of Lecture Transcription into Document Style using Statistical Framework
KAZUYA SHITAOKA, HIROAKI NANJO, TATSUYA KAWAHARA
JOURNAL FREE ACCESS

2004 Volume 11 Issue 2 Pages 67-83

Abstract

Transcriptions and speech recognition results of lectures include many expressions peculiar to spoken language, so they must be transformed into document style before they can be put to practical use. We apply the statistical approach used in machine translation to the automatic transformation of spoken language into document-style sentences. We deal with the deletion of fillers, insertion of periods, insertion of particles, conversion to written expressions, and unification of the end-of-sentence style. A beam search is introduced to apply these processes in an integrated manner. Experimental evaluation using real lecture transcriptions confirms that the statistical transformation framework works well, and we achieved high recall and precision rates for period and particle insertion.
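
As an illustration only, the integrated search described in the abstract can be sketched as a small beam-search decoder that scores each candidate output by a transformation probability multiplied by a document-style language model probability. This is a minimal sketch under stated assumptions, not the authors' implementation: the English toy vocabulary, the filler list, every probability value, and the names `FILLERS`, `BIGRAM`, `UNSEEN`, `lm_logp`, `extend` and `transform` are hypothetical. Only filler deletion and period insertion are covered; particle insertion, conversion to written expressions and end-of-sentence unification would be further candidate generators of the same kind.

```python
import math

# All values below are hypothetical and chosen only to make the sketch runnable.
FILLERS = {"um", "uh"}                # assumed filler inventory
LOG_P_DELETE_FILLER = math.log(0.9)   # assumed P(a filler is dropped)
LOG_P_KEEP_FILLER = math.log(0.1)     # assumed P(a filler is kept)
BIGRAM = {                            # assumed document-style bigram probabilities
    ("lectures", "."): 0.6,
    ("them", "."): 0.6,
    (".", "we"): 0.5,
    (".", "</s>"): 0.9,
}
UNSEEN = 0.05                         # flat back-off for unlisted bigrams


def lm_logp(prev, word):
    """Log probability of `word` following `prev` under the toy bigram model."""
    return math.log(BIGRAM.get((prev, word), UNSEEN))


def extend(score, tokens, new_tokens):
    """Append tokens to a hypothesis, adding the language model score of each."""
    for tok in new_tokens:
        prev = tokens[-1] if tokens else "<s>"
        score += lm_logp(prev, tok)
        tokens = tokens + (tok,)
    return score, tokens


def transform(spoken, beam_width=4):
    """Beam search over per-word edits: delete or keep fillers, insert periods."""
    beam = [(0.0, ())]                # each hypothesis is (log score, output tokens)
    for word in spoken.split():
        if word in FILLERS:
            options = [((), LOG_P_DELETE_FILLER), ((word,), LOG_P_KEEP_FILLER)]
        else:
            options = [((word,), 0.0)]
        expanded = []
        for score, tokens in beam:
            for emitted, logp in options:
                s, t = extend(score + logp, tokens, emitted)
                expanded.append((s, t))
                if emitted:           # a period may only follow an emitted word
                    expanded.append(extend(s, t, (".",)))
        # Prune to the best `beam_width` hypotheses before reading the next word.
        beam = sorted(expanded, key=lambda h: h[0], reverse=True)[:beam_width]
    # Score the end-of-text transition so that a final period is rewarded.
    finished = [(s + lm_logp(t[-1] if t else "<s>", "</s>"), t) for s, t in beam]
    return " ".join(max(finished, key=lambda h: h[0])[1])
```

With these toy probabilities, `transform("um we record lectures uh we transcribe them")` yields `we record lectures . we transcribe them .`. Because all edit types compete inside one search, a period decision can depend on whether the neighbouring filler was deleted, which is the practical benefit of applying the processes in an integrated manner rather than one after another.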

© The Association for Natural Language Processing