Proceedings of the Institute of Image Information and Television Engineers (ITE) Convention
Proceedings of the ITE Annual Convention 2003
Session ID: 14-4

Sound Source Localization by Probabilistic Integration of Audio-Visual Information
*陳 彬, 目黒 光彦, 金子 正秀

Abstract
This paper proposes a method for estimating sound source location by fusing auditory and visual information with a Bayesian network. Because different auditory categories correspond to different visual features, the sound signal is first classified as speech or non-speech; these categories correlate with skin-color features and with other color features distributed in the image, respectively. After modeling the skin-color feature with a Gaussian mixture model, we use the Bayesian network to infer whether each pixel in the image corresponds to the sound source. Finally, experimental results are presented to show the effectiveness of the proposed method.
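The pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no network structure or parameters, so the two-component skin-color mixture `SKIN_GMM`, the flat density `FLAT_DENSITY` used for non-skin colors, the prior `prior_source`, and the function `source_posterior` are all hypothetical. The sketch assumes a single binary variable per pixel (source or not) and marginalizes over the audio category (speech or non-speech) before applying Bayes' rule.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a Gaussian with diagonal covariance at point x."""
    norm = 1.0
    expo = 0.0
    for xi, mi, vi in zip(x, mean, var):
        norm *= math.sqrt(2.0 * math.pi * vi)
        expo += (xi - mi) ** 2 / (2.0 * vi)
    return math.exp(-expo) / norm


def gmm_pdf(x, components):
    """Mixture density; components is a list of (weight, mean, var)."""
    return sum(w * gaussian_pdf(x, m, v) for w, m, v in components)


# Hypothetical skin-color model over normalized (r, g) chromaticity.
# These parameters are illustrative placeholders, not values from the paper.
SKIN_GMM = [
    (0.6, (0.45, 0.31), (0.002, 0.001)),
    (0.4, (0.40, 0.33), (0.003, 0.002)),
]

# Placeholder flat density standing in for "other color features".
FLAT_DENSITY = 1.0


def source_posterior(color, p_speech, prior_source=0.1):
    """P(pixel is the sound source | pixel color, audio evidence).

    p_speech is the classifier's probability that the sound is speech.
    If the pixel is the source, its color follows the skin model when the
    sound is speech and the flat model otherwise (an assumption).
    """
    like_source = (p_speech * gmm_pdf(color, SKIN_GMM)
                   + (1.0 - p_speech) * FLAT_DENSITY)
    like_background = FLAT_DENSITY
    numerator = like_source * prior_source
    return numerator / (numerator + like_background * (1.0 - prior_source))


if __name__ == "__main__":
    skin_pixel = (0.45, 0.31)
    other_pixel = (0.20, 0.60)
    print(source_posterior(skin_pixel, p_speech=0.9))
    print(source_posterior(other_pixel, p_speech=0.9))
```

Under these assumptions, a skin-colored pixel receives a higher posterior as the speech probability rises, while a pixel far from the skin model is pushed below the prior. When the sound is classified as non-speech with certainty, the color carries no evidence in this toy model and the posterior equals the prior.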
© 2003 The Institute of Image Information and Television Engineers