IEEJ Transactions on Electronics, Information and Systems
Online ISSN : 1348-8155
Print ISSN : 0385-4221
ISSN-L : 0385-4221
<Sound and Image Processing and Recognition>
Sound Source Separation with Two Spectrograms by Image Processing
Hiroaki Higuchi, Kensaku Asahi, Yuji Sagawa, Noboru Sugie

2004 Volume 124 Issue 12 Pages 2439-2445

Abstract

We propose a method for separating speech signals using two spectrograms. First, two spectrograms are generated from voices recorded with a pair of microphones. The onsets and offsets of the frequency components are extracted as features using image processing techniques. Then the correspondences between the features in the two spectrograms are determined, and the inter-microphone time differences are calculated. Frequency components that share common onset/offset times and a common time difference are grouped together as originating from one of the speech signals. A set of band-pass filters is generated for each group of frequency components. Finally, each separated speech signal is extracted by applying the corresponding set of band-pass filters to the voice signal recorded by one of the microphones. Experiments were conducted on a mixture of a male speech sound and a female speech sound, consisting of Japanese vowels and containing consonants. The evaluation results demonstrated that the proposed method separated the speech signals reasonably well.
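
The grouping step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the onsets have already been extracted from each spectrogram (the image processing step is not shown), and the names Onset, match_onsets, group_components and band_mask, as well as all tolerance values, are hypothetical choices made for illustration. A binary band pattern stands in for the set of band-pass filters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Onset:
    band: int    # frequency-band index (a row of the spectrogram)
    time: float  # onset time in seconds


def match_onsets(mic1, mic2, max_delay=0.001):
    """Pair each mic-1 onset with the closest mic-2 onset in the same band.

    Returns (band, onset_time, time_difference) tuples, where
    time_difference is the mic-2 time minus the mic-1 time.
    max_delay is an assumed bound on the inter-microphone delay.
    """
    pairs = []
    for a in mic1:
        candidates = [b for b in mic2
                      if b.band == a.band and abs(b.time - a.time) <= max_delay]
        if candidates:
            best = min(candidates, key=lambda b: abs(b.time - a.time))
            pairs.append((a.band, a.time, best.time - a.time))
    return pairs


def group_components(pairs, onset_tol=0.01, delay_tol=0.0001):
    """Group bands whose onset time and time difference both agree with
    the first member of a group, within the assumed tolerances."""
    groups = []
    for band, onset, delay in sorted(pairs, key=lambda p: (p[1], p[2])):
        for g in groups:
            if (abs(g["onset"] - onset) <= onset_tol
                    and abs(g["delay"] - delay) <= delay_tol):
                g["bands"].append(band)
                break
        else:
            # No existing group matches, so this band starts a new one.
            groups.append({"onset": onset, "delay": delay, "bands": [band]})
    return groups


def band_mask(group, n_bands):
    """Binary pass/stop pattern over bands, standing in for a filter bank."""
    selected = set(group["bands"])
    return [1 if i in selected else 0 for i in range(n_bands)]
```

Given onset lists for the two microphones, match_onsets yields one time difference per band, group_components collects the bands attributed to each talker, and band_mask marks which bands would be passed when reconstructing that talker from a single microphone signal. Only onsets are used here; offsets could be handled the same way.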

© 2004 by the Institute of Electrical Engineers of Japan