Sequential Linefeed Insertion for Real-Time Caption Generation of Lectures

大野 誠寛; 村田 匡輝; 松原 茂樹

doi:10.1541/ieejeiss.133.418

Abstract

To generate readable captions in real time for Japanese spoken monologues such as lectures, the caption text must be displayed sequentially with linefeeds inserted at appropriate positions. This paper proposes a technique that sequentially inserts linefeeds into a lecture transcript each time a bunsetsu is identified. A bunsetsu is a Japanese linguistic unit shorter than a sentence that roughly corresponds to a basic phrase in English. Under the assumption that linefeeds are inserted only at bunsetsu boundaries, this approach minimizes the captioning delay. For each bunsetsu boundary, the technique statistically judges whether a linefeed should be inserted, using only the information available at that point in time. We conducted linefeed insertion experiments on a Japanese lecture corpus. The results confirmed that our bunsetsu-based method achieves almost the same accuracy as a sentence-based linefeed insertion method. We also compared our method with four baseline methods. The comparison confirmed that our method inserts linefeeds more appropriately than simple methods that are considered to have the same delay as ours.
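The sequential decision described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the statistical model or its features, so `toy_linefeed_probability`, `MAX_LINE_CHARS`, and the 0.5 threshold below are hypothetical stand-ins chosen only to show the control flow. The one property taken from the abstract is that each decision is made as soon as a bunsetsu arrives, using only what has been seen so far.

```python
from typing import Callable, Iterable, Iterator, List

MAX_LINE_CHARS = 20  # hypothetical display width, not taken from the paper


def toy_linefeed_probability(line: List[str]) -> float:
    """Hypothetical stand-in for the paper's statistical model.

    Scores the boundary right after the last bunsetsu in `line`,
    looking only at bunsetsus that have already been identified.
    """
    if sum(len(b) for b in line) >= MAX_LINE_CHARS:
        return 1.0  # the line has reached the assumed display width
    if line[-1].endswith(("、", "。")):
        return 0.7  # punctuation is treated as a plausible break point
    return 0.1


def insert_linefeeds(
    bunsetsus: Iterable[str],
    probability: Callable[[List[str]], float] = toy_linefeed_probability,
    threshold: float = 0.5,
) -> Iterator[str]:
    """Yield caption lines one at a time as bunsetsus arrive."""
    line: List[str] = []
    for bunsetsu in bunsetsus:
        line.append(bunsetsu)
        # Decide immediately at the boundary after this bunsetsu,
        # without waiting for the rest of the sentence.
        if probability(line) >= threshold:
            yield "".join(line)
            line = []
    if line:
        yield "".join(line)  # flush whatever remains at the end of input


if __name__ == "__main__":
    stream = ["今日は、", "字幕の", "改行に", "ついて", "話します。"]
    for caption_line in insert_linefeeds(stream):
        print(caption_line)
```

Because `insert_linefeeds` is a generator, a caption line can be displayed as soon as its closing boundary is accepted, which is the source of the low delay that a sentence-based method cannot match. A trained classifier could be passed in through the `probability` parameter in place of the toy function.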
