Integration of Audio-Visual Information for Speaker Location Estimation in Automatic Lecture Video Shooting

西口 敏司; 東 和秀; 亀田 能成; 角所 考; 美濃 導彦

doi:10.1541/ieejeiss.124.729

Abstract

Estimating the location of a speaker in a lecture room is useful for automatic video shooting of lectures. The captured videos are used in distance learning and lecture archiving systems. To estimate the location of a speaker in a large lecture room, multiple cameras and multiple microphones are used. However, it is difficult to estimate the precise location of a speaker using only visual or only acoustic sensors because of calibration problems, noise, and other interference. Therefore, we propose a method that integrates audio and visual information about a speaker in the lecture room. A lecturer’s cell and a student’s cell are introduced as the units of speaker location estimation. We defined 120 cells in a real lecture room and applied our multimodal method to them. With the integrated method, the speaker's location is estimated accurately enough for automatic video shooting of the speaker in a lecture room.
