Applying Self-Play with Deep Reinforcement Learning to Puyo-Puyo

福地 昂大; 三宅 陽一郎

doi:10.11517/pjsai.JSAI2023.0_2M5GS1001

Abstract

In recent years, self-play has been used successfully to acquire strategies in video games as well as board games. In this study, we report on strategy learning in the single-player and competitive versions of the falling-puzzle game Puyo-Puyo using self-play and deep reinforcement learning. Self-play is a method in which agents learn by playing against each other. For the experiments, we built a puzzle game environment with Unity and ML-Agents and trained agents with the deep reinforcement learning algorithm SAC. Single-player Puyo-Puyo was evaluated on cumulative reward and maximum chain count. Although performance improved temporarily, the final result was slightly worse. Competitive Puyo-Puyo was evaluated on Elo rating and maximum chain count. The Elo rating rose from 1200 to 3100 and was still on an upward trend, so future studies may produce a stronger agent.
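
The abstract does not say how the Elo rating was computed. As a rough illustration of the metric only, the following is a minimal sketch of the standard Elo update for a single match. The function names and the K-factor of 16 are assumptions made for this example and are not taken from the paper.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B under the standard Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, score_a: float, k: float = 16.0):
    """Return the updated (rating_a, rating_b) after one match.

    score_a is 1.0 if A won, 0.5 for a draw, and 0.0 if A lost.
    k is the K-factor; 16.0 is an assumed value, not one from the paper.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    # The update is zero-sum: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta


if __name__ == "__main__":
    # Two agents at the starting rating of 1200; A wins one match.
    print(update_elo(1200.0, 1200.0, 1.0))
```

With these assumed settings, a win between two agents both rated 1200 moves them to 1208 and 1192. A rating that keeps rising under self-play, as reported above, means the current agent keeps beating the opponents it is matched against.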
