Proceedings of the Annual Conference of JSAI
Online ISSN : 2758-7347
34th (2020)
Session ID : 2K4-ES-2-03
Conference information

Many-to-many Voice Conversion based on a CycleGAN using a Radial Loss
*Iskandar SALAMA
CONFERENCE PROCEEDINGS FREE ACCESS

Abstract

Voice conversion (VC) is a technique that allows a person to speak with the voice of another person. It is an application of voice processing that relies on both signal processing and machine learning. In this paper we propose a many-to-many voice conversion method based on a CycleGAN, which we call the Radial CycleGAN. In this method, the generators consist of a general encoder (ENC-0) and a general decoder (DEC-0) for a standard voice sample (TTS voice), together with an encoder-decoder pair for each new voice sample. In addition to the commonly used cycle and identity losses, we define a radial loss between encoders and decoders to train the generators and discriminators. For each new user, a new (encoder, decoder) pair is trained against the standard TTS pair, which makes it possible to convert voices directly using the trained encoder-decoder pairs. This method contributes to building real-time systems that can convert robustly among the voices of pretrained speakers and to which new users can easily be added by collecting a small dataset from each of them.
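
The hub-and-spoke arrangement described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the linear maps stand in for neural encoders and decoders, the names `pairs`, `convert`, `cycle_loss`, and `identity_loss` are hypothetical, and the assumption that speaker A is converted to speaker B by composing A's encoder with B's decoder is one possible reading of the text. The radial loss is not defined in the abstract, so it is deliberately left out; only the standard CycleGAN cycle and identity losses are shown.

```python
import numpy as np

rng = np.random.default_rng(0)
FEAT, LATENT = 8, 4  # hypothetical acoustic-feature and latent dimensions


def make_pair():
    """Toy (encoder, decoder) pair; linear maps stand in for neural networks."""
    enc = rng.normal(size=(LATENT, FEAT))
    dec = rng.normal(size=(FEAT, LATENT))
    return enc, dec


# One general pair for the standard TTS voice (ENC-0, DEC-0 in the abstract)
# plus one pair per registered speaker. Speaker names are made up.
pairs = {"tts": make_pair(), "alice": make_pair(), "bob": make_pair()}


def convert(x, src, dst):
    """Assumed conversion path: encode with src's encoder, decode with dst's decoder."""
    enc, _ = pairs[src]
    _, dec = pairs[dst]
    return dec @ (enc @ x)


def l1(a, b):
    """Mean absolute error, the distance usually used in CycleGAN losses."""
    return float(np.mean(np.abs(a - b)))


def cycle_loss(x, spk, hub="tts"):
    """Cycle consistency: speaker -> hub voice -> speaker should reproduce x."""
    return l1(convert(convert(x, spk, hub), hub, spk), x)


def identity_loss(y, src, dst):
    """Identity mapping: the src -> dst generator should leave a dst sample unchanged."""
    return l1(convert(y, src, dst), y)
```

Under this reading, adding a speaker means adding one entry to `pairs` and training it against the `"tts"` pair, so the number of networks grows linearly with the number of speakers rather than with the number of speaker pairs.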

© 2020 The Japanese Society for Artificial Intelligence