Experimental evaluation of the effect of phoneme time stretching on speaker embedding

Taichi Fukawa; Kenya Jin'no

doi:10.1587/nolta.13.277

Special Section on Nonlinear Science Workshop on the Journal

Keywords: voice conversion, speaker embedding, phoneme, spectrogram, CNN, SVM

Journal, free access

2022, Volume 13, Issue 2, pp. 277-281

DOI https://doi.org/10.1587/nolta.13.277

Abstract

Phoneme spectrogram sequences vary in length. We experimentally evaluated two methods of converting such a sequence to a fixed length in order to obtain a speaker embedding: padding and time stretching. We confirmed that both methods maintain extraction performance. We also confirmed that the choice of fixed frame length does not affect the results.
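
The two fixed-length conversions named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `pad_to_length` and `stretch_to_length`, the `(n_bins, n_frames)` array layout, zero-valued padding, and the use of linear interpolation for time stretching are all assumptions made for illustration.

```python
import numpy as np

def pad_to_length(spec, target_frames, pad_value=0.0):
    """Pad (or truncate) a (n_bins, n_frames) spectrogram along the time axis.

    Assumption: padding is appended at the end and filled with pad_value.
    """
    n_bins, n_frames = spec.shape
    if n_frames >= target_frames:
        return spec[:, :target_frames]
    out = np.full((n_bins, target_frames), pad_value, dtype=spec.dtype)
    out[:, :n_frames] = spec
    return out

def stretch_to_length(spec, target_frames):
    """Resample a (n_bins, n_frames) spectrogram to target_frames frames.

    Assumption: each frequency bin is linearly interpolated over time;
    the paper may use a different stretching method.
    """
    n_bins, n_frames = spec.shape
    src = np.linspace(0.0, n_frames - 1, num=n_frames)       # original frame positions
    dst = np.linspace(0.0, n_frames - 1, num=target_frames)  # resampled positions
    return np.stack([np.interp(dst, src, row) for row in spec])

# Hypothetical phoneme spectrogram: 3 frequency bins, 4 frames.
spec = np.arange(12, dtype=float).reshape(3, 4)
padded = pad_to_length(spec, 6)         # shape (3, 6), last two frames are zeros
stretched = stretch_to_length(spec, 7)  # shape (3, 7), endpoints preserved
```

Padding keeps the original frames untouched and fills the remainder with a constant, whereas stretching changes the time scale of the phoneme so that it spans the whole fixed-length window.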
