Sound Source Localization Based on Probabilistic Fusion of Audiovisual Information

Bin Chen; Mitsuhiko Meguro; Masahide Kaneko

doi:10.11485/itetaikai.2003s.0.107.0

Abstract

This paper proposes a method for estimating the location of a sound source by fusing auditory and visual information with a Bayesian network. Because different auditory categories correspond to different visual features, the sound signal is first classified as speech or non-speech; these two categories correlate with skin-color features and with other color features distributed in the image, respectively. After modeling the skin-color feature with a Gaussian mixture model, we use the Bayesian network to infer whether each pixel in the image corresponds to the sound source. Finally, experimental results are presented to show the effectiveness of the proposed method.
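
As a rough illustration of the kind of per-pixel inference the abstract describes, the Python sketch below combines a skin-color Gaussian mixture likelihood with the output of a speech/non-speech classifier. The abstract does not give the network structure, its parameters, or the color space, so everything here is an assumption rather than the authors' implementation: the function names `gmm_pdf` and `source_posterior`, the conditional probability table `cpt`, the skin prior, the uniform non-skin density, and the use of two-dimensional chromaticity values as pixel features.

```python
import numpy as np


def gmm_pdf(x, weights, means, covs):
    """Density of a Gaussian mixture evaluated at points x of shape (N, D)."""
    x = np.atleast_2d(np.asarray(x, dtype=float))
    d = x.shape[1]
    total = np.zeros(x.shape[0])
    for w, mu, cov in zip(weights, means, covs):
        diff = x - np.asarray(mu, dtype=float)
        cov = np.asarray(cov, dtype=float)
        inv = np.linalg.inv(cov)
        norm = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(cov))
        # Squared Mahalanobis distance of every point to this component.
        maha = np.einsum("ni,ij,nj->n", diff, inv, diff)
        total += w * np.exp(-0.5 * maha) / norm
    return total


def source_posterior(colors, p_speech, gmm, skin_prior=0.3,
                     nonskin_density=1.0, cpt=None):
    """P(pixel is the sound source | pixel color, audio classification).

    Assumed toy network: Category -> Source <- Skin, with Skin observed
    only indirectly through the pixel color.
    cpt[c, s] = P(source | category c, skin state s), where
    c: 0 = speech, 1 = non-speech; s: 0 = skin, 1 = not skin.
    All default values are hypothetical placeholders.
    """
    if cpt is None:
        cpt = [[0.9, 0.1], [0.1, 0.9]]
    cpt = np.asarray(cpt, dtype=float)

    # Posterior that each pixel is skin, by Bayes' rule with a GMM skin
    # likelihood and an assumed uniform density for non-skin colors.
    lik_skin = gmm_pdf(colors, *gmm)
    num = skin_prior * lik_skin
    p_skin = num / (num + (1.0 - skin_prior) * nonskin_density)

    # Marginalize over the audio category and the skin state.
    p_cat = np.array([p_speech, 1.0 - p_speech])
    p_given_cat = np.outer(p_skin, cpt[:, 0]) + np.outer(1.0 - p_skin, cpt[:, 1])
    return p_given_cat @ p_cat


# Hypothetical one-component skin model in a 2-D chromaticity space.
skin_gmm = ([1.0], [[0.45, 0.30]], [np.eye(2) * 0.002])
pixels = np.array([[0.45, 0.30],   # close to the assumed skin mean
                   [0.20, 0.60]])  # far from it

print(source_posterior(pixels, p_speech=1.0, gmm=skin_gmm))
print(source_posterior(pixels, p_speech=0.0, gmm=skin_gmm))
```

With these placeholder values, the skin-colored pixel receives the higher source probability when the sound is classified as speech, and the ordering reverses for non-speech. Reshaping the returned vector to the image dimensions would give a per-pixel map whose peak could be read as the estimated source location.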
