Puyo Puyo AI via Deep Reinforcement Learning with Parallel Actors and Prioritized Experience Replay

森 春介; 越野 亮

doi:10.3156/jsoft.37.1_501

Abstract

This study applies deep reinforcement learning to the puzzle game Puyo Puyo. Traditional rule-based methods and those utilizing relevance matrices have struggled to construct large chains comparable to those created by top human players. Furthermore, previous studies using deep reinforcement learning have found it difficult to learn complex strategies and have not demonstrated sufficient performance. This study aims to improve the performance of Puyo Puyo AI through deep reinforcement learning, employing parallel actors and prioritized experience replay. Experiments were conducted using a custom-built Puyo Puyo environment to evaluate the proposed method. The results showed that the proposed approach achieved an average maximum chain length of 6.243 and an average score of 33,114, surpassing the performance of previous deep reinforcement learning studies.
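The abstract names prioritized experience replay but gives no implementation details. As a rough illustration of the general technique only, the following is a minimal sketch of a proportional prioritized replay buffer. The class name `PrioritizedReplayBuffer`, the parameters `alpha`, `beta`, and `eps`, and their default values are assumptions made for this example; they are not taken from the paper and may differ from what the authors used.

```python
import random


class PrioritizedReplayBuffer:
    """Proportional prioritized replay (illustrative sketch, not the paper's code)."""

    def __init__(self, capacity, alpha=0.6, eps=1e-3):
        self.capacity = capacity    # maximum number of stored transitions
        self.alpha = alpha          # assumed exponent: 0 = uniform, 1 = fully prioritized
        self.eps = eps              # assumed constant that keeps every priority > 0
        self.items = []
        self.priorities = []
        self.next_index = 0         # slot to overwrite once the buffer is full

    def add(self, transition):
        # New transitions receive the current maximum priority,
        # so they are likely to be replayed soon after insertion.
        priority = max(self.priorities, default=1.0)
        if len(self.items) < self.capacity:
            self.items.append(transition)
            self.priorities.append(priority)
        else:
            self.items[self.next_index] = transition
            self.priorities[self.next_index] = priority
        self.next_index = (self.next_index + 1) % self.capacity

    def sample(self, batch_size, beta=0.4, rng=random):
        # Sampling probability is proportional to priority ** alpha.
        weights = [p ** self.alpha for p in self.priorities]
        total = sum(weights)
        n = len(self.items)
        indices = rng.choices(range(n), weights=weights, k=batch_size)
        # Importance-sampling weights offset the non-uniform sampling;
        # they are normalized by the largest weight in the batch.
        is_weights = [(n * weights[i] / total) ** (-beta) for i in indices]
        max_w = max(is_weights)
        is_weights = [w / max_w for w in is_weights]
        return indices, [self.items[i] for i in indices], is_weights

    def update(self, indices, td_errors):
        # After a learning step, the priority becomes |TD error| + eps.
        for i, err in zip(indices, td_errors):
            self.priorities[i] = abs(err) + self.eps


# Usage: in a parallel-actor setup, several actors would call add()
# while a single learner calls sample() and update().
buffer = PrioritizedReplayBuffer(capacity=1000)
for step in range(10):
    buffer.add(("state", "action", float(step), "next_state"))
idx, batch, is_w = buffer.sample(batch_size=4)
buffer.update(idx, [0.5, 1.0, 0.1, 2.0])
```

A production buffer would typically use a sum tree so that sampling costs O(log n) rather than O(n); the list-based version above favors readability.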
