Pitch dependent phone modelling for HMM-based speech recognition

Harald Singer; Shigeki Sagayama

doi:10.1250/ast.15.77

Abstract

This paper proposes a novel method of incorporating pitch information into an HMM speech recognition system by exploiting the correlation between pitch and spectral parameters such as the cepstrum. Pitch patterns are not used explicitly; instead, spectral parameters are normalized framewise according to the pitch value. Evidence is given to show that the use of pitch information consistently improves recognition performance. Experiments with 24 phoneme labels showed that the phoneme error rate for fast continuous speech could be reduced by about 10%. Using these pitch-normalized phone models in an HMM-LR speech recognition system improved the phrase recognition accuracy for the top 5 candidates from 96% to 97.5%, i.e. the error rate was nearly halved.
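
The abstract does not state the normalization rule itself. As a rough illustration of what framewise normalization driven by a pitch/cepstrum correlation could look like, the sketch below fits a per-coefficient linear regression of cepstral values on log pitch and subtracts the pitch-dependent part from each voiced frame. The linear model, the use of log F0, the convention that a pitch of 0 marks an unvoiced frame, and the names `fit_pitch_regression` and `normalize_frames` are all assumptions made for this example; none of them is taken from the paper.

```python
import math


def fit_pitch_regression(frames, pitches):
    """Fit c_k ~ a_k + b_k * log(f0) per cepstral coefficient (assumed model).

    frames:  list of cepstral vectors (lists of floats), one per frame
    pitches: list of F0 values in Hz; 0 marks an unvoiced frame (assumption)
    Returns (mean_log_f0, slopes), where slopes[k] is b_k.
    """
    voiced = [(math.log(f0), c) for c, f0 in zip(frames, pitches) if f0 > 0]
    if len(voiced) < 2:
        raise ValueError("need at least two voiced frames")
    n = len(voiced)
    mean_x = sum(x for x, _ in voiced) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in voiced)
    dim = len(voiced[0][1])
    slopes = []
    for k in range(dim):
        mean_c = sum(c[k] for _, c in voiced) / n
        cov = sum((x - mean_x) * (c[k] - mean_c) for x, c in voiced)
        # With no pitch variation there is nothing to regress on.
        slopes.append(cov / var_x if var_x > 0 else 0.0)
    return mean_x, slopes


def normalize_frames(frames, pitches, mean_x, slopes):
    """Remove the fitted pitch-dependent component from each voiced frame.

    Unvoiced frames (pitch 0) are copied through unchanged.
    """
    normalized = []
    for c, f0 in zip(frames, pitches):
        if f0 > 0:
            dx = math.log(f0) - mean_x
            normalized.append([ck - b * dx for ck, b in zip(c, slopes)])
        else:
            normalized.append(list(c))
    return normalized


# Toy data: two cepstral coefficients that depend linearly on log F0.
pitches = [100.0, 200.0, 0.0, 400.0]
frames = [
    [1 + 2 * math.log(f0), 3 - math.log(f0)] if f0 > 0 else [9.0, 9.0]
    for f0 in pitches
]
mean_x, slopes = fit_pitch_regression(frames, pitches)
normalized = normalize_frames(frames, pitches, mean_x, slopes)
```

In this toy case the voiced frames collapse onto a single vector after normalization, because their only variation was the pitch-dependent term. Real cepstra would keep their phone-dependent variation, which is what the HMM phone models would then be trained on.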


Successor journal

Acoustical Science and Technology
