Acoustical Science and Technology
Online ISSN : 1347-5177
Print ISSN : 1346-3969
ISSN-L : 0369-4232
PAPERS
Formant estimation of high-pitched noisy speech using homomorphic deconvolution of higher-order group delay spectrum
Husne Ara Chowdhury, Mohammad Shahidur Rahman
Journal, free access

2023, Volume 44, Issue 2, pp. 84-92

Abstract

Estimating the formant frequencies of high-pitched speech is essential in many speech processing applications. However, most existing methods cannot estimate formant frequencies accurately from high-pitched speech, and the available formant estimators are not robust to noise. In this paper, we propose a deconvolution method based on the higher-order group delay (GD) spectrum that estimates the formants of high-pitched noisy speech more accurately. Although the cepstrum is known to provide source-filter separation to some extent, it is affected by ambient noise. We apply the spectral-root deconvolution technique to the third-order GD spectrum, which yields a noise-robust cepstrum. The resulting cepstrum significantly improves formant frequency estimation. We evaluated the proposed method on five synthetic vowels, by calculating the estimation error of the formant frequencies, and on natural vowels spoken by male and female speakers, using standard F2–F1 plots. An utterance from the Texas Instruments and Massachusetts Institute of Technology (TIMIT) database was used to plot formant contours on the corresponding spectrogram. We compared the results with three state-of-the-art methods. The proposed technique outperforms all of them, particularly for high-pitched speech in noisy environments.
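As a rough illustration of the spectral-root deconvolution idea mentioned in the abstract, the sketch below applies it to an ordinary magnitude spectrum rather than to the third-order GD spectrum, whose definition is not given here. It is not the authors' method. The toy vowel synthesizer, the function names, and the parameter values (`gamma`, `n_lifter`, `nfft`, the formant and bandwidth values) are all assumptions chosen for illustration.

```python
import numpy as np

def synth_vowel(formants, bandwidths, f0, fs, dur=0.04):
    """Toy vowel (assumption): impulse train at pitch f0 convolved with damped sinusoids."""
    n = int(fs * dur)
    t = np.arange(n) / fs
    h = sum(np.exp(-np.pi * bw * t) * np.sin(2 * np.pi * fc * t)
            for fc, bw in zip(formants, bandwidths))
    excitation = np.zeros(n)
    excitation[::int(round(fs / f0))] = 1.0
    return np.convolve(excitation, h)[:n]

def root_cepstrum_envelope(frame, nfft=1024, gamma=0.3, n_lifter=20):
    """Smoothed |X|**gamma envelope obtained by keeping only low quefrencies."""
    windowed = frame * np.hamming(len(frame))
    # Spectral root: the power gamma replaces the logarithm of the usual cepstrum.
    root_spec = np.abs(np.fft.fft(windowed, nfft)) ** gamma
    ceps = np.fft.ifft(root_spec).real
    lifter = np.zeros(nfft)
    lifter[:n_lifter] = 1.0                 # low quefrencies (vocal-tract part)
    lifter[nfft - n_lifter + 1:] = 1.0      # their mirror image
    envelope = np.fft.fft(ceps * lifter).real
    return envelope[: nfft // 2 + 1]

def pick_peaks(envelope, fs, nfft=1024):
    """Frequencies in Hz of the local maxima of the envelope (formant candidates)."""
    e = envelope
    idx = np.where((e[1:-1] > e[:-2]) & (e[1:-1] >= e[2:]))[0] + 1
    return idx * fs / nfft

fs = 8000
# High-pitched toy voice: the pitch period is fs / f0 = 32 samples.
frame = synth_vowel([700, 1200, 2500], [80, 90, 120], f0=250, fs=fs)
print(pick_peaks(root_cepstrum_envelope(frame), fs))
```

The lifter length has to stay below the pitch period in samples, which becomes short at high pitch; that leaves few quefrency samples to describe the vocal tract and is one reason high-pitched speech is hard for cepstral methods.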

© 2023 by The Acoustical Society of Japan