Abstract
In this paper, a new learning algorithm for variable hierarchical structure learning automata operating in a P-model stationary random environment is constructed by extending the estimator algorithm proposed by Thathachar and Sastry. The learning property of our algorithm is analyzed theoretically, and it is proved that the probability of selecting the optimal path can be made as close to 1 as desired by using our algorithm. In numerical simulations, the average number of iterations required by our algorithm is compared with that of the variable hierarchical structure learning algorithm of LR-I type proposed by Mogami and Baba, and it is shown that our algorithm finds the optimal path in fewer iterations than the LR-I type algorithm.