Proceedings of the Fuzzy System Symposium
34th Fuzzy System Symposium
Session ID : WD1-4

Policy Gradient Reinforcement Learning with a Fuzzy Controller for Policy: Using a Center of Gravity Model and a Constraint Condition
*Seiji ISHIHARA, Harukazu IGARASHI
CONFERENCE PROCEEDINGS FREE ACCESS

Abstract

A typical fusion of fuzzy inference and reinforcement learning uses a value-based method such as Q-learning and assumes a Markov decision process. In contrast, we have proposed a fusion of fuzzy inference and the policy gradient method, which is policy-based and, unlike a value-based method, learns a policy directly. The fusion uses a stochastic policy defined by a Boltzmann distribution whose objective function consists of the product-sum operation over membership functions and rule weights. We have also proposed another objective function that uses defuzzification based on a stochastically weighted center of gravity model, together with a constraint condition on the vibration of the output. In this study, we applied the fusion to simulations of automobile speed control and compared the objective functions. The results showed that the policies learned by our method, which uses the center of gravity model and the constraint condition, tended to suppress vibration of the speed and to accomplish the control task in a small number of steps.
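
As a rough illustration of the first objective function mentioned above, the following sketch builds a Boltzmann policy from a product-sum of rule membership degrees and rule weights, and applies one REINFORCE-style weight update. It is a minimal sketch under stated assumptions, not the authors' implementation: the triangular membership shape, the sign convention (a larger objective value means a higher action probability), the temperature, the learning rate, and every function name are hypothetical. The center of gravity model and the constraint condition are not sketched, because the abstract does not specify them.

```python
import numpy as np


def triangular(x, centers, width):
    """Membership degree of x for each rule (triangular shape is an assumption)."""
    return np.maximum(0.0, 1.0 - np.abs(x - centers) / width)


def objective(x, centers, width, weights):
    """Product-sum objective: E[a] = sum_i mu_i(x) * weights[i, a]."""
    mu = triangular(x, centers, width)  # shape (n_rules,)
    return mu @ weights                 # shape (n_actions,)


def boltzmann_policy(x, centers, width, weights, temperature=1.0):
    """Action probabilities proportional to exp(E[a] / T) (assumed sign convention)."""
    e = objective(x, centers, width, weights) / temperature
    e = e - e.max()                     # shift for numerical stability
    p = np.exp(e)
    return p / p.sum()


def reinforce_update(weights, x, action, reward, centers, width,
                     temperature=1.0, lr=0.1):
    """One policy gradient step: w += lr * reward * d log pi(action | x) / dw."""
    mu = triangular(x, centers, width)
    pi = boltzmann_policy(x, centers, width, weights, temperature)
    onehot = np.zeros_like(pi)
    onehot[action] = 1.0
    # d log pi(a|x) / d weights[i, b] = mu_i * (1[a == b] - pi_b) / T
    grad = np.outer(mu, onehot - pi) / temperature
    return weights + lr * reward * grad


# Hypothetical setup: two rules over a scalar input, three discrete actions.
centers = np.array([0.0, 1.0])
width = 1.0
weights = np.zeros((2, 3))

p_before = boltzmann_policy(0.5, centers, width, weights)
weights_after = reinforce_update(weights, 0.5, action=0, reward=1.0,
                                 centers=centers, width=width)
p_after = boltzmann_policy(0.5, centers, width, weights_after)
```

With all rule weights at zero the policy is uniform, and a positive reward for action 0 raises the probability of that action at the same input on the next evaluation.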

© 2018 Japan Society for Fuzzy Theory and Intelligent Informatics