Mapping Articulatory-Features to Vocal-Tract Parameters for Voice Conversion

Narpendyah Wisjnu ARIWARDHANI; Masashi KIMURA; Yurie IRIBE; Kouichi KATSURADA; Tsuneo NITTA

doi:10.1587/transinf.E97.D.911

Regular Section

Keywords: voice conversion, articulatory feature, neural network, arbitrary speaker

2014 Volume E97.D Issue 4 Pages 911-918

Abstract

In this paper, we propose a voice conversion (VC) method based on mapping articulatory features (AF) to vocal-tract parameters (VTP). An artificial neural network (ANN) is applied to map AF to VTP and thereby convert a speaker's voice to a target-speaker's voice. The proposed system is text-independent, in that it does not need parallel utterances from the source and target-speakers, and it can also be used with an arbitrary source-speaker; that is, our approach does not require source-speaker data to build the VC model. We also focus on the case where only a small amount of target-speaker training data is available. For comparison, a baseline system based on a Gaussian mixture model (GMM) approach is also evaluated. Experimental results with a small amount of training data show that the voice converted by our approach is intelligible and has the speaker individuality of the target-speaker.
