Proceedings of the Fuzzy System Symposium
40th Fuzzy System Symposium
Session ID : 1G1-2
A Study on Action Selection Probability Model Considering Selection Bias in Reinforcement Learning-Based Behavior Modeling
*Yuki Murayama, Keiichi Horio, Ryosuke Kubota
CONFERENCE PROCEEDINGS FREE ACCESS

Abstract

In this study, we propose an action selection probability model for reinforcement learning-based behavior modeling that accounts for a bias toward a particular action. The model assumes that factors other than reward influence action selection, and it represents this by shifting the softmax function horizontally when computing the action selection probability. Specifically, the shift is achieved by adding a constant bias to the difference between the action values in each state. The bias is estimated by maximum likelihood together with the learning rate and the inverse temperature, the parameters of conventional reinforcement learning models. To confirm the effectiveness of the proposed method, we used a two-armed bandit problem, a standard benchmark task, to artificially generate data in which a certain action is likely to be taken independently of the reward, and we compared the likelihoods of the conventional and proposed models on this data. The results showed that the likelihood of the proposed method was significantly higher than that of the conventional method.
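The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the Q-learning update, the exact form of the shifted softmax (a logistic function of `beta * (Q1 - Q0 + bias)`), the parameter grids, the reward probabilities, and all function names are assumptions made for illustration, and a coarse grid search stands in for whatever maximum likelihood optimizer the study used.

```python
import math
import random


def p_choose_arm1(q, beta, bias):
    """Probability of arm 1 under a two-action softmax shifted by `bias`.

    Assumed form: the bias is added to the action-value difference
    before scaling by the inverse temperature `beta`.
    """
    x = beta * (q[1] - q[0] + bias)
    return 1.0 / (1.0 + math.exp(-x))


def simulate(n_trials, alpha, beta, bias, reward_probs, seed=0):
    """Generate choices and rewards from a biased Q-learning agent."""
    rng = random.Random(seed)
    q = [0.0, 0.0]
    choices, rewards = [], []
    for _ in range(n_trials):
        a = 1 if rng.random() < p_choose_arm1(q, beta, bias) else 0
        r = 1.0 if rng.random() < reward_probs[a] else 0.0
        q[a] += alpha * (r - q[a])  # update only the chosen arm
        choices.append(a)
        rewards.append(r)
    return choices, rewards


def neg_log_likelihood(choices, rewards, alpha, beta, bias=0.0):
    """Negative log-likelihood of the observed choices; bias=0 is the conventional model."""
    q = [0.0, 0.0]
    nll = 0.0
    for a, r in zip(choices, rewards):
        p1 = p_choose_arm1(q, beta, bias)
        p = p1 if a == 1 else 1.0 - p1
        nll -= math.log(max(p, 1e-12))  # guard against log(0)
        q[a] += alpha * (r - q[a])
    return nll


def grid_fit(choices, rewards, biases):
    """Coarse grid-search stand-in for maximum likelihood estimation."""
    alphas = [0.1, 0.3, 0.5, 0.7, 0.9]  # hypothetical grid
    betas = [0.5, 1.0, 2.0, 4.0, 8.0]   # hypothetical grid
    best = None
    for alpha in alphas:
        for beta in betas:
            for bias in biases:
                nll = neg_log_likelihood(choices, rewards, alpha, beta, bias)
                if best is None or nll < best[0]:
                    best = (nll, alpha, beta, bias)
    return best


# Synthetic data: arm 0 pays more often, but the agent is biased toward arm 1.
choices, rewards = simulate(500, alpha=0.3, beta=2.0, bias=0.5,
                            reward_probs=(0.7, 0.3))
conventional = grid_fit(choices, rewards, biases=[0.0])
proposed = grid_fit(choices, rewards, biases=[-1.0, -0.5, 0.0, 0.5, 1.0])
print("conventional NLL:", conventional[0])
print("proposed NLL:", proposed[0])
```

Because the bias grid includes zero, the conventional model is nested inside the biased one, so the fitted negative log-likelihood of the biased model can never be worse in this sketch. A fair model comparison would therefore also need to account for the extra parameter, for example with an information criterion or held-out data.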

© 2024 Japan Society for Fuzzy Theory and Intelligent Informatics