Host: Japan Society for Fuzzy Theory and Intelligent Informatics (SOFT)
Name : 34th Fuzzy System Symposium
Number : 34
Location : [in Japanese]
Date : September 03, 2018 - September 05, 2018
A typical fusion of fuzzy inference and reinforcement learning uses a value-based method such as Q-learning, assuming a Markov decision process. In contrast, we have proposed a fusion of fuzzy inference and the policy gradient method, which is policy-based and, unlike a value-based method, learns a policy directly. The fusion uses a stochastic policy defined by a Boltzmann distribution whose objective function consists of the product-sum operation over membership functions and rule weights. We have also proposed another objective function that uses defuzzification based on a stochastically weighted center-of-gravity model, together with a constraint condition on the vibration of the output. In this study, we applied the fusion to simulations of automobile speed control and compared the objective functions. The results showed that the policies learned by our method with the center-of-gravity model and the constraint condition tended to suppress vibration of the speed and to accomplish the control task in a small number of steps.
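As a rough illustration of the first objective function described above, the sketch below builds a Boltzmann (softmax) policy whose per-action score is a product-sum of membership grades and rule weights. Everything concrete here is an assumption made for illustration and is not taken from the paper: the triangular membership functions, the three actions, the weight values, the temperature parameter, and all function names. The gradient function is the generic log-softmax derivative used in REINFORCE-style updates, not necessarily the authors' update rule, and the center-of-gravity objective and the vibration constraint are not modeled.

```python
import math

# Hypothetical fuzzy sets over x = current speed minus target speed (km/h),
# each given as (left, peak, right) of a triangular membership function.
FUZZY_SETS = [(-20.0, -10.0, 0.0), (-10.0, 0.0, 10.0), (0.0, 10.0, 20.0)]

# Hypothetical rule weights, one row per action: brake, hold, accelerate.
ACTION_WEIGHTS = [
    [-1.0, 0.0, 1.0],   # brake: favored when the car is too fast
    [0.0, 1.0, 0.0],    # hold: favored when the speed error is near zero
    [1.0, 0.0, -1.0],   # accelerate: favored when the car is too slow
]


def triangular(x, left, peak, right):
    """Triangular membership grade in [0, 1]."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def memberships(x, fuzzy_sets):
    """Membership grade of x in every fuzzy set."""
    return [triangular(x, *fs) for fs in fuzzy_sets]


def boltzmann_policy(x, action_weights, fuzzy_sets, temperature=1.0):
    """Action probabilities from product-sum scores passed through a softmax."""
    mu = memberships(x, fuzzy_sets)
    # Product-sum: score of an action = sum_i weight_i * membership_i(x).
    scores = [sum(w * m for w, m in zip(row, mu)) / temperature
              for row in action_weights]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def log_policy_gradient(x, action, action_weights, fuzzy_sets, temperature=1.0):
    """Gradient of log pi(action | x) with respect to every rule weight."""
    mu = memberships(x, fuzzy_sets)
    probs = boltzmann_policy(x, action_weights, fuzzy_sets, temperature)
    return [[((1.0 if b == action else 0.0) - probs[b]) * m / temperature
             for m in mu]
            for b in range(len(action_weights))]
```

With these assumed weights, calling `boltzmann_policy(10.0, ACTION_WEIGHTS, FUZZY_SETS)` gives the braking action the largest probability, since only the "too fast" fuzzy set fires. A learning loop would scale `log_policy_gradient` by the observed return and add it to the weights.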