Toward Emotion-Controllable Music Generation Using Diffusion Models and Musical Attributes

川邉 もゆ; 小林 一郎

doi:10.14864/fss.41.0_248

Abstract

In music generation with diffusion models, few methods have been reported that generate music controlled by emotion. One reason is the difficulty of controlling complex musical attributes. In this study, we therefore aim to develop a MIDI-format music generation method that allows diverse control over generation by using a diffusion language model, which generates discrete sequence data. We also attempt to reflect subtle changes in emotion by expressing the emotion given to the model as coordinate values on Russell's circumplex model. We developed two types of diffusion models: one that uses a classifier in the reverse diffusion process to control "emotion-correlated music attributes," and another that conditions on "emotion-correlated music attributes" without a classifier. We compare the music generated by the two methods. For each method, we generate music for multiple inputs and analyze the differences among the generated pieces and the degree to which the emotion is reflected, in order to evaluate the musicality and emotion control of each model.
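As an illustration of what expressing an emotion as coordinates on Russell's circumplex model can look like, the following is a minimal sketch and not the authors' implementation. It assumes valence and arousal are each normalized to [-1, 1]; the function name `describe_emotion`, the returned keys, and the quadrant labels are hypothetical choices made for this example only.

```python
import math


def describe_emotion(valence: float, arousal: float) -> dict:
    """Summarize a point on the valence-arousal plane.

    Assumption for this sketch: both axes are normalized to [-1, 1].
    """
    if not (-1.0 <= valence <= 1.0 and -1.0 <= arousal <= 1.0):
        raise ValueError("valence and arousal are assumed to lie in [-1, 1]")

    # Angle around the circumplex, measured counterclockwise from the
    # positive-valence axis, in degrees within [0, 360).
    angle_deg = math.degrees(math.atan2(arousal, valence)) % 360.0

    # Distance from the neutral origin, used here as a rough intensity.
    intensity = math.hypot(valence, arousal)

    # Descriptive quadrant label (hypothetical wording).
    arousal_part = "high arousal" if arousal >= 0 else "low arousal"
    valence_part = "positive valence" if valence >= 0 else "negative valence"

    return {
        "angle_deg": angle_deg,
        "intensity": intensity,
        "quadrant": f"{arousal_part}, {valence_part}",
    }


if __name__ == "__main__":
    # Two nearby points differ only slightly in angle and intensity,
    # which is the kind of subtle change continuous coordinates can encode.
    print(describe_emotion(0.60, 0.80))
    print(describe_emotion(0.55, 0.85))
```

Because the input is a continuous pair rather than a discrete label, neighboring emotions map to neighboring conditioning values, which is one plausible reason to prefer coordinates over categorical emotion tags.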
