IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences
Online ISSN : 1745-1337
Print ISSN : 0916-8508
Regular Section
A Novel Iterative Speaker Model Alignment Method from Non-Parallel Speech for Voice Conversion
Peng SONG, Wenming ZHENG, Xinran ZHANG, Yun JIN, Cheng ZHA, Minghai XIN

2015 Volume E98.A Issue 10 Pages 2178-2181


Most current voice conversion methods rely on parallel speech, which is not easily obtained in practice. In this letter, a novel iterative speaker model alignment (ISMA) method is proposed to address this problem. First, the source and target speaker models are each trained from the background model using the maximum a posteriori (MAP) algorithm. Then, the ISMA method is presented for the alignment and transformation of spectral features. Finally, the proposed ISMA approach is combined with a Gaussian mixture model (GMM) to improve the conversion performance. A series of objective and subjective experiments are carried out on the CMU ARCTIC dataset, and the results demonstrate that the proposed method significantly outperforms the state-of-the-art approach.
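
To illustrate the MAP step mentioned in the abstract, the following is a minimal sketch of how a speaker model might be derived from a background GMM. It assumes the common mean-only adaptation with diagonal covariances and a fixed relevance factor; the abstract does not state which parameters are adapted or how, so the function name `map_adapt_means`, the `relevance` value, and the array layout are all hypothetical and are not taken from the letter.

```python
import numpy as np

def map_adapt_means(X, weights, means, variances, relevance=16.0):
    """Mean-only MAP adaptation of a diagonal-covariance GMM (illustrative).

    X: (N, D) speaker frames; weights: (K,); means, variances: (K, D).
    `relevance` is an assumed relevance factor, not a value from the letter.
    Returns the adapted means with shape (K, D).
    """
    # Log-density of every frame under every Gaussian component.
    diff = X[:, None, :] - means[None, :, :]                      # (N, K, D)
    log_gauss = -0.5 * (
        np.sum(diff ** 2 / variances[None, :, :], axis=2)
        + np.sum(np.log(2.0 * np.pi * variances), axis=1)[None, :]
    )
    # Posterior probability of each component for each frame.
    log_post = np.log(weights)[None, :] + log_gauss
    log_post -= log_post.max(axis=1, keepdims=True)               # stability
    post = np.exp(log_post)
    post /= post.sum(axis=1, keepdims=True)
    # Sufficient statistics: soft counts and posterior-weighted frame means.
    n_k = post.sum(axis=0)                                        # (K,)
    e_k = (post.T @ X) / np.maximum(n_k, 1e-10)[:, None]          # (K, D)
    # Interpolate between the data estimate and the background means.
    alpha = (n_k / (n_k + relevance))[:, None]
    return alpha * e_k + (1.0 - alpha) * means

# Usage: adapt a two-component background model to toy speaker data.
ubm_weights = np.array([0.5, 0.5])
ubm_means = np.array([[0.0], [10.0]])
ubm_vars = np.ones((2, 1))
frames = np.full((16, 1), 1.0)
speaker_means = map_adapt_means(frames, ubm_weights, ubm_means, ubm_vars)
```

In this sketch, components that see little speaker data stay close to the background model, while well-observed components move toward the speaker's frames. Running it once on source data and once on target data would give two models that share the same component indexing, which is the kind of correspondence a model alignment method could exploit.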

© 2015 The Institute of Electronics, Information and Communication Engineers