Acoustical Science and Technology
Online ISSN : 1347-5177
Print ISSN : 1346-3969
ISSN-L : 0369-4232
PAPERS
Formant estimation of high-pitched noisy speech using homomorphic deconvolution of higher-order group delay spectrum
Husne Ara Chowdhury, Mohammad Shahidur Rahman
JOURNAL FREE ACCESS

2023 Volume 44 Issue 2 Pages 84-92

Abstract

Estimating the formant frequencies of high-pitched speech is essential in many speech processing applications. Unfortunately, most existing methods cannot estimate formant frequencies accurately from high-pitched speech, and the available formant estimators are not robust to noise. In this paper, we propose a higher-order group delay (GD) spectrum-based deconvolution method that estimates the formants of high-pitched noisy speech with higher accuracy. Although the cepstrum is known to provide source-filter separation to some extent, it is affected by ambient noise. We apply the spectral-root deconvolution technique to the third-order GD spectrum, which yields a noise-robust cepstrum. The resulting cepstrum produces a significant improvement in formant frequency estimation. We evaluated the proposed method on five synthetic vowels, by calculating the estimation error of the formant frequencies, and on natural vowels spoken by male and female speakers, using standard F2–F1 plots. An utterance from the Texas Instruments and Massachusetts Institute of Technology (TIMIT) database was used to plot the formant contours on the corresponding spectrogram. We compared the results with three state-of-the-art methods. The proposed technique outperforms all three, particularly for high-pitched speech in a noisy environment.
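
For readers unfamiliar with the two building blocks named in the abstract, the following is a minimal sketch of their textbook forms: the ordinary (first-order) group delay of a frame, and a spectral-root cepstrum in which a power-law root replaces the logarithm of conventional homomorphic deconvolution. It is not the authors' third-order GD formulation, which the abstract does not specify. The function names, `n_fft`, and the root exponent `gamma` are illustrative assumptions.

```python
import numpy as np

def group_delay(frame, n_fft=1024):
    """Textbook group delay of a frame, computed without phase unwrapping.

    Uses tau(w) = (X_R*Y_R + X_I*Y_I) / |X|^2, where X = FFT{x[n]} and
    Y = FFT{n*x[n]}. This is the standard first-order definition only.
    """
    frame = np.asarray(frame, dtype=float)
    n = np.arange(len(frame))
    X = np.fft.rfft(frame, n_fft)
    Y = np.fft.rfft(n * frame, n_fft)
    power = np.abs(X) ** 2
    # Floor the denominator to avoid division by zero at spectral nulls.
    return (X.real * Y.real + X.imag * Y.imag) / np.maximum(power, 1e-12)

def spectral_root_cepstrum(spectrum, gamma=0.5):
    """Spectral-root cepstrum: inverse FFT of |S|^gamma instead of log|S|.

    `gamma` is an assumed exponent chosen for illustration.
    """
    rooted = np.abs(np.asarray(spectrum)) ** gamma
    return np.fft.irfft(rooted)

# Usage: a unit impulse delayed by 5 samples has a constant group delay of 5.
x = np.zeros(16)
x[5] = 1.0
tau = group_delay(x, n_fft=64)
cep = spectral_root_cepstrum(tau, gamma=0.5)
```

In a formant estimator built along these lines, the low-quefrency part of such a cepstrum would typically be retained and transformed back to obtain a smoothed envelope whose peaks are taken as formant candidates; that step is omitted here.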

© 2023 by The Acoustical Society of Japan