Host : The Japanese Society for Artificial Intelligence
Name : The 37th Annual Conference of the Japanese Society for Artificial Intelligence
Number : 37
Location : [in Japanese]
Date : June 06, 2023 - June 09, 2023
In recent years, self-play has been used successfully to acquire strategies in video games as well as board games. In this research, we report on a study of strategy learning in the single-player and competitive versions of the falling-puzzle game Puyo-Puyo using self-play and deep reinforcement learning. Self-play is a method in which agents learn by playing against each other. In this experiment, we created a puzzle game environment using Unity and ML-Agents and trained agents with the deep reinforcement learning algorithm Soft Actor-Critic (SAC). The single-player Puyo-Puyo agent was evaluated on cumulative reward and maximum number of chains. Performance improved temporarily, but the final result was slightly worse. The competitive Puyo-Puyo agent was evaluated on Elo rating and maximum number of chains. Its Elo rating increased from 1200 to 3100 and was still on an upward trend, which suggests that further study could make the agent stronger.
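For readers unfamiliar with the Elo rating used as the competitive metric, the following is a minimal sketch of the standard Elo update after one match. It is not taken from the paper or from ML-Agents: the function names and the K-factor of 32 are assumptions chosen for illustration, and only the starting rating of 1200 comes from the abstract.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score of player A against player B (0 to 1)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return the new (rating_a, rating_b) after one match.

    score_a is 1.0 for a win by A, 0.5 for a draw, 0.0 for a loss.
    k is the K-factor; 32 is an assumed value, not one reported in the paper.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    # The update is zero-sum: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta


if __name__ == "__main__":
    # Two agents starting at 1200; agent A wins one match.
    print(update_elo(1200.0, 1200.0, 1.0))
```

Under this rule, a rating rises only when an agent scores better than its current rating predicts, so a sustained climb indicates repeated wins against opponents drawn from earlier snapshots or peers.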