2006 Volume 42 Issue 11 Pages 1244-1251
In typical reinforcement learning algorithms, a single agent learns to achieve a goal over many episodes. If the learning problem is complicated, obtaining the optimal policy may require a large amount of computation time. Meanwhile, for optimization problems, multi-agent search methods such as genetic algorithms and particle swarm optimization are known to rapidly find a globally optimal solution for multimodal functions with wide solution spaces. This paper proposes a swarm reinforcement learning algorithm (SWARLA) in which multiple agents learn while exchanging information with one another. Furthermore, this paper proposes three strategies for exchanging the information: the best action-value strategy, the average action-value strategy, and the particle swarm strategy. The proposed algorithm is applied to a shortest path problem, and its performance is demonstrated through numerical experiments.
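
As an illustration of the idea of agents exchanging action values, the following is a minimal Python sketch and not the paper's implementation. It assumes tabular Q-learning agents on a hypothetical six-state corridor (a toy shortest path task), an exchange after every episode, and the episode length as the measure of which agent is "best". The function names, the task, and all parameter values are assumptions made for illustration. The particle swarm strategy is omitted because the abstract does not specify its update rule.

```python
import random

# Hypothetical toy task: a corridor of states 0..5 with the goal at state 5.
N_STATES, GOAL = 6, 5
ACTIONS = (-1, +1)                      # action 0 moves left, action 1 moves right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1   # assumed learning parameters


def run_episode(q, rng, max_steps=1000):
    """Run one tabular Q-learning episode in place; return the step count."""
    s, steps = 0, 0
    while s != GOAL and steps < max_steps:
        if rng.random() < EPSILON:
            a = rng.randrange(2)                       # explore
        else:
            best = max(q[s])                           # greedy, random tie-break
            a = rng.choice([i for i in range(2) if q[s][i] == best])
        s2 = min(max(s + ACTIONS[a], 0), GOAL)
        # Reward of -1 per step, so shorter paths have higher value.
        target = -1.0 + (0.0 if s2 == GOAL else GAMMA * max(q[s2]))
        q[s][a] += ALPHA * (target - q[s][a])
        s, steps = s2, steps + 1
    return steps


def exchange_best(q_tables, scores):
    """Assumed 'best' rule: all agents copy the table of the lowest-score agent."""
    best = min(range(len(q_tables)), key=lambda i: scores[i])
    return [[row[:] for row in q_tables[best]] for _ in q_tables]


def exchange_average(q_tables):
    """Assumed 'average' rule: all agents adopt the element-wise mean table."""
    n = len(q_tables)
    mean = [[sum(q[s][a] for q in q_tables) / n
             for a in range(len(q_tables[0][s]))]
            for s in range(len(q_tables[0]))]
    return [[row[:] for row in mean] for _ in q_tables]


def train(strategy, n_agents=4, n_episodes=200, seed=0):
    """Train several agents, exchanging Q-values after every episode."""
    rng = random.Random(seed)
    q_tables = [[[0.0, 0.0] for _ in range(N_STATES)] for _ in range(n_agents)]
    for _ in range(n_episodes):
        scores = [run_episode(q, rng) for q in q_tables]
        if strategy == "best":
            q_tables = exchange_best(q_tables, scores)
        elif strategy == "average":
            q_tables = exchange_average(q_tables)
        else:
            raise ValueError(f"unknown strategy: {strategy}")
    return q_tables


def greedy_path_length(q, max_steps=100):
    """Follow the greedy policy from state 0 (ties go right); return steps used."""
    s, steps = 0, 0
    while s != GOAL and steps < max_steps:
        a = 0 if q[s][0] > q[s][1] else 1
        s, steps = min(max(s + ACTIONS[a], 0), GOAL), steps + 1
    return steps


if __name__ == "__main__":
    for name in ("best", "average"):
        print(name, greedy_path_length(train(name)[0]))
```

In this sketch the exchange step is the only coupling between agents; everything else is ordinary single-agent Q-learning, which makes it easy to swap in a different exchange rule for comparison.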