Abstract
In this paper, a new learning algorithm is developed for hierarchical structure learning automata operating in an S-model nonstationary random environment at each level, by extending the relative reward strength algorithm proposed by Simha and Kurose. The learning property of our algorithm is analyzed theoretically, and it is proved that the optimal path probability can be made as close to 1 as desired by using our algorithm. In numerical simulations, the number of iterations required by our algorithm is compared with that required by the hierarchical structure learning algorithm proposed by Thathachar and Ramakrishnan (the T & R algorithm), and it is shown that our algorithm finds the optimal path in fewer iterations than the T & R algorithm.