Transactions of the Society of Instrument and Control Engineers
Online ISSN : 1883-8189
Print ISSN : 0453-4654
ISSN-L : 0453-4654
Reinforcement Learning Based on Statistical Value Function and Its Application to a Board Game
Ikuko NISHIKAWA, Tomoyuki NAKANISHI

2003 Volume 39 Issue 7 Pages 670-678

Abstract

A statistical method is proposed to cope with a large number of discrete states in a given state space in reinforcement learning. As a coarse-graining of the large number of states, a smaller number of state sets are defined, each as a group of neighboring states. The state sets partly overlap each other, so that one state is included in multiple sets. Learning is based on an action-value function for each state set, and the action-value function for an individual state is derived, at the time of action selection, as a statistical average of the value functions of the multiple state sets containing it. The proposed method is applied to the board game Dots-and-Boxes. The state sets are defined as subspace templates of the whole board state of dots and lines, taking geometric symmetry into consideration. The reward is given as the number of acquired boxes minus the number of lost boxes. Computer simulations show successful learning through training games against a mini-max opponent with search depth 2 to 5, and the winning rate against a depth-3 mini-max reaches about 80%. An action-value function derived by a weighted average, with weights given by the variance of rewards, shows an advantage over one derived by a simple average.
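The averaging scheme described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the data structure, function name, and the tiny example values are all hypothetical, assuming each overlapping state set keeps a learned action value and a variance of observed rewards, and that a state's value is the inverse-variance weighted average over the sets that match it.

```python
def weighted_action_value(matching_sets, action, eps=1e-6):
    """Combine per-set Q estimates for one action, weighting each
    estimate by the inverse of its reward variance (assumed scheme)."""
    num, den = 0.0, 0.0
    for s in matching_sets:
        q = s["Q"][action]       # action value learned for this state set
        var = s["var"][action]   # variance of rewards observed for this set
        w = 1.0 / (var + eps)    # low-variance (reliable) sets weigh more
        num += w * q
        den += w
    return num / den

# Illustrative only: two overlapping templates match the current board.
sets_for_state = [
    {"Q": {"a": 1.0}, "var": {"a": 0.1}},   # reliable estimate
    {"Q": {"a": 3.0}, "var": {"a": 0.9}},   # noisy estimate
]
q = weighted_action_value(sets_for_state, "a")
```

With these toy numbers the weighted value lands near 1.2, pulled toward the low-variance estimate, whereas a simple average of the two estimates would give 2.0; this is the kind of difference the abstract reports as advantageous.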

© The Society of Instrument and Control Engineers (SICE)