Journal of Signal Processing
Online ISSN : 1880-1013
Print ISSN : 1342-6230
ISSN-L : 1342-6230
Study on Robust Voice Activity Detection Using Empirical Mode Decomposition and Modulation Spectrum Analysis
Yasuaki Kanai, Masashi Unoki

2012 Volume 16 Issue 4 Pages 315-318

Abstract
Voice activity detection (VAD) detects speech and non-speech periods in observed signals and is an important technique for many speech signal processing applications. However, the accuracy of speech-period detection drops drastically when current VAD techniques are applied to noisy speech or to mixtures of speech and non-speech sounds such as music and animal sounds. VAD therefore needs to be robust enough to detect speech periods accurately in these situations. This paper proposes a robust VAD method that uses empirical mode decomposition (EMD) and modulation spectrum analysis (MSA) to resolve these problems. The proposed method first reduces noise by using EMD and then determines speech/non-speech periods by using MSA. Five experiments on VAD in real environments were conducted to evaluate the proposed method by comparing it with traditional methods (Otsu's method, the G.729 method, and the power envelope thresholding method). The results demonstrated that the proposed method could detect speech periods more accurately than the traditional methods.
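
For orientation, the kind of traditional baseline mentioned above can be sketched in a few lines. The following is a minimal illustration, not the configuration used in the paper: it computes a frame-wise power envelope in dB and picks the speech/non-speech threshold with Otsu's between-class variance criterion. The function names, the 160-sample frame length, the 64-bin histogram, and the synthetic test signal are all assumptions made for this sketch. It does not implement the proposed EMD and MSA method.

```python
import math
import random


def frame_power_db(signal, frame_len):
    """Mean power of each non-overlapping frame, in dB (hypothetical helper)."""
    powers = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        p = sum(x * x for x in frame) / frame_len
        powers.append(10.0 * math.log10(p + 1e-12))  # small floor avoids log(0)
    return powers


def otsu_threshold(values, bins=64):
    """Otsu's criterion: choose the histogram split that maximizes
    the between-class variance of the two resulting classes."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo
    width = (hi - lo) / bins
    hist = [0] * bins
    for v in values:
        hist[min(int((v - lo) / width), bins - 1)] += 1
    centers = [lo + (i + 0.5) * width for i in range(bins)]
    total = len(values)
    sum_all = sum(c * h for c, h in zip(centers, hist))
    best_var, best_t = -1.0, lo
    w0, sum0 = 0, 0.0
    for i in range(bins - 1):
        w0 += hist[i]
        sum0 += centers[i] * hist[i]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, lo + (i + 1) * width
    return best_t


def detect_speech(signal, frame_len=160):
    """Return one boolean per frame: True where frame power exceeds the threshold."""
    powers = frame_power_db(signal, frame_len)
    threshold = otsu_threshold(powers)
    return [p > threshold for p in powers]


def make_demo_signal(fs=8000, seed=0):
    """Synthetic stand-in for speech: 1 s noise, 1 s tone plus noise, 1 s noise."""
    rng = random.Random(seed)
    signal = []
    for n in range(3 * fs):
        x = rng.gauss(0.0, 0.01)
        if fs <= n < 2 * fs:
            x += 0.5 * math.sin(2.0 * math.pi * 200.0 * n / fs)
        signal.append(x)
    return signal
```

Calling `detect_speech(make_demo_signal())` yields 150 frame decisions in which only the middle second is flagged as active. A simple energy criterion of this kind cannot distinguish speech from other loud sounds such as music, which is the limitation the abstract attributes to traditional methods.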
© 2012 Research Institute of Signal Processing, Japan