Proceedings of the Annual Conference of JSAI
Online ISSN : 2758-7347
34th (2020)
Session ID : 1G3-ES-5-01

Proposing system for generating audio influenced by audience evaluation using interactive GA
*Maho TANIGUCHI, Kense TODO, Shoya YASUDA, Masayuki YAMAMURA
CONFERENCE PROCEEDINGS FREE ACCESS

Abstract

When generating or selecting music or sound effects, it is necessary to search large audio databases to find audio appropriate for a scene in an animation or other video clip. However, sound effects or background music produced by individual human experts may sometimes fail to make the audience feel that the audio matches the scene well. Therefore, an approach that generates audio while taking listeners’ preferences into account is required. In this work, we propose a way to generate suitable audio for a scene using feedback from an audience. Specifically, we used SpecGAN, a kind of GAN that generates a wide variety of audio from a latent space, and an interactive GA, an optimization algorithm that uses human preferences for evaluation. The following steps were repeated: SpecGAN generated audio from latent variables, a group of listeners ranked the audio, and the best-ranked latent variables were crossed over to create the next latent variables. As a result, we succeeded in controlling the direction of audio generation for individual scenes. We hope that audio generated by our method will be as meaningful as audio created by human experts.
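
The repeated loop described in the abstract (generate, rank, cross over) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the latent dimension, population size, number of parents, and the use of uniform crossover are all assumptions, and `generate_audio` and `rank_audio` are hypothetical callables standing in for a trained SpecGAN generator and for the audience's ranking.

```python
import random

LATENT_DIM = 100  # assumption: the abstract does not state the latent size


def random_latent(rng, dim=LATENT_DIM):
    """Sample one latent vector uniformly from [-1, 1]."""
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def crossover(parent_a, parent_b, rng):
    """Uniform crossover (an assumed operator): each gene comes from one parent."""
    return [a if rng.random() < 0.5 else b for a, b in zip(parent_a, parent_b)]


def next_generation(population, ranking, n_parents, rng):
    """Build a new population from the top-ranked latent vectors.

    `ranking` lists indices into `population`, best first, as an
    audience would supply them. Requires n_parents >= 2.
    """
    parents = [population[i] for i in ranking[:n_parents]]
    children = []
    while len(children) < len(population):
        a, b = rng.sample(parents, 2)  # two distinct parents
        children.append(crossover(a, b, rng))
    return children


def run_interactive_ga(generate_audio, rank_audio,
                       pop_size=8, n_parents=4, generations=5, seed=0):
    """Repeat generate -> rank -> crossover for a fixed number of rounds."""
    rng = random.Random(seed)
    population = [random_latent(rng) for _ in range(pop_size)]
    for _ in range(generations):
        clips = [generate_audio(z) for z in population]  # generator stand-in
        ranking = rank_audio(clips)                      # human ranking stand-in
        population = next_generation(population, ranking, n_parents, rng)
    return population


if __name__ == "__main__":
    # Placeholder stand-ins: the "audio" is the latent vector itself, and the
    # "audience" prefers clips with a larger sum. Both are purely hypothetical.
    fake_generator = lambda z: z
    fake_audience = lambda clips: sorted(
        range(len(clips)), key=lambda i: -sum(clips[i]))
    final = run_interactive_ga(fake_generator, fake_audience)
    print(len(final), len(final[0]))
```

In an actual interactive setting, `rank_audio` would block until the listeners have submitted their ranking for the current batch of clips, which is why the population and the number of generations are kept small in this sketch.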

© 2020 The Japanese Society for Artificial Intelligence