A Master Equation Formulation of the Reinforcement Scheme of Stochastic Learning Automata

Fei Qian; Hironori Hirata

doi:10.5687/sss.1998.273

Abstract

To assess the convergence properties of reinforcement learning algorithms, we formulate the learning scheme as a discrete Markov process and transform its equation into a continuous-time master equation. Treating the small learning parameter as a perturbation, we derive a small-perturbation expansion of the master equation and obtain a Fokker-Planck equation approximation at low order in the learning parameter. We show that the global features of the reinforcement scheme of learning automata can be described within this approximation because the deterministic term of the dynamics has a globally asymptotically stable fixed point.
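
As a rough illustration of the kind of dynamics the abstract describes, the following sketch simulates a two-action automaton as a discrete Markov process and compares its long-run behaviour with the fixed point of the deterministic drift. The abstract does not name a specific scheme, so the linear reward-penalty update used here is an assumption, as are the names `simulate_lrp` and `drift`, the learning parameter `a`, and the penalty probabilities `c1` and `c2`. For this assumed scheme the expected change of the action probability `p` per step is `a * (c2 - (c1 + c2) * p)`, which has the single stable fixed point `c2 / (c1 + c2)`.

```python
import random


def drift(p, c1, c2):
    """Deterministic drift term (per unit of learning parameter) for the
    assumed two-action linear reward-penalty scheme."""
    return c2 - (c1 + c2) * p


def simulate_lrp(a, c1, c2, steps, p0=0.5, seed=0):
    """Simulate p = Prob(action 1) under an assumed linear reward-penalty
    scheme with learning parameter a and penalty probabilities c1, c2.
    This is an illustrative stand-in, not the scheme analysed in the paper."""
    rng = random.Random(seed)
    p = p0
    path = []
    for _ in range(steps):
        chose_first = rng.random() < p
        penalty_prob = c1 if chose_first else c2
        rewarded = rng.random() >= penalty_prob
        # p moves toward 1 when action 1 is rewarded or action 2 is penalized,
        # and toward 0 otherwise.
        if chose_first == rewarded:
            p += a * (1.0 - p)
        else:
            p -= a * p
        path.append(p)
    return path


if __name__ == "__main__":
    a, c1, c2 = 0.01, 0.2, 0.6
    path = simulate_lrp(a, c1, c2, steps=20000)
    tail = path[len(path) // 2:]
    print("time-averaged p:", sum(tail) / len(tail))
    print("drift fixed point:", c2 / (c1 + c2))
```

With a small `a`, the simulated probability fluctuates in a narrow band around the drift fixed point, which is the regime where a low-order expansion in the learning parameter would be expected to apply.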
