Proceedings of the Annual Conference of JSAI
Online ISSN : 2758-7347
37th (2023)
Session ID : 2D4-GS-2-02

Using Search Results in Self-play Deep Reinforcement Learning
*Kazuya KAGOSHIMA, Itsuki NODA, Satoshi OYAMA
CONFERENCE PROCEEDINGS FREE ACCESS

Abstract

We propose a new method for generating training data in self-play deep reinforcement learning, which is widely used in game AI such as AlphaGo Zero and AlphaZero. Self-play learning generally does not use most of the search results generated during self-play, and few studies have tried to make use of them. The proposed method converts a search result into training data by estimating the final win/loss reward and the policy for it. Experiments with various training hyperparameters suggest that the proposed method helps the policy to be learned effectively and stabilizes training.
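
The idea of turning a search result into a training sample can be illustrated with a minimal sketch. The abstract does not specify how the reward and policy are estimated, so everything below is an assumption made for illustration: the `SearchNode` type, its fields, the `node_to_sample` helper, the use of normalized visit counts as the policy target, and the use of the mean backed-up value as the win/loss estimate. It is not the authors' implementation.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class SearchNode:
    """Hypothetical record of one position explored during tree search."""
    state: str                    # encoded position (placeholder type)
    visit_counts: Dict[str, int]  # visits per child action
    value_sum: float              # sum of backed-up values, each in [-1, 1]
    visits: int                   # number of visits to this node


def node_to_sample(
    node: SearchNode, min_visits: int = 1
) -> Optional[Tuple[str, Dict[str, float], float]]:
    """Convert a searched node into a (state, policy, value) sample.

    Assumption: the policy target is the normalized child visit counts and
    the value target is the mean backed-up value. Nodes that were searched
    too little are skipped by returning None.
    """
    total = sum(node.visit_counts.values())
    if node.visits <= 0 or node.visits < min_visits or total == 0:
        return None
    policy = {action: count / total for action, count in node.visit_counts.items()}
    value = node.value_sum / node.visits
    return node.state, policy, value


# Usage: a node visited 4 times whose children were visited 3 and 1 times.
sample = node_to_sample(SearchNode("s0", {"a": 3, "b": 1}, 2.0, 4))
```

The `min_visits` threshold is one possible way to filter out weakly searched positions, whose estimates would be noisy targets.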

© 2023 The Japanese Society for Artificial Intelligence