Proceedings of the Fuzzy System Symposium
41st Fuzzy System Symposium
Session ID : 1D1-4

Building the Appropriate Causal Graph to Improve Reward Estimation Models
*Mariko Sugimura, Ichiro Kobayashi
CONFERENCE PROCEEDINGS FREE ACCESS

Abstract

In off-policy evaluation, methods have been proposed that use reward estimation models learned from data to predict rewards in unobserved domains. However, the training data depend on the action selection probabilities of the policy that collected them, so the model's prediction accuracy may deteriorate because of selection bias. This happens because variables that influence the policy's action selection also influence the outcome; these confounders induce spurious correlations that the prediction model then absorbs. This study therefore aims to construct a reward estimation model based on causal relationships rather than on correlations alone. As a first step, we constructed a causal graph from real data using the Peter-Clark (PC) algorithm, a causal discovery method. We then analyzed the constructed causal graph and explored ways of applying it to reward estimation models.
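
The selection-bias problem described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' method or data: it assumes a single binary confounder x, a hypothetical logging policy that picks action 1 far more often when x = 1, and a synthetic reward with a known action effect (TRUE_EFFECT). It compares a purely correlational estimate with one that adjusts for x by stratification (a standard backdoor adjustment, used here in place of a learned causal graph). All names and numeric settings are illustrative assumptions.

```python
import random
from statistics import mean

# Assumed ground truth for the synthetic reward: taking action 1 adds 1.0.
TRUE_EFFECT = 1.0


def simulate_logged_data(n, seed=0):
    """Generate (x, a, r) rows from a hypothetical confounded logging policy."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        # Confounder: influences both the action choice and the reward.
        x = 1 if rng.random() < 0.5 else 0
        # Logging policy (assumed): strongly prefers action 1 when x == 1.
        p_action_1 = 0.9 if x == 1 else 0.1
        a = 1 if rng.random() < p_action_1 else 0
        # Reward depends on the action, the confounder, and small noise.
        r = TRUE_EFFECT * a + 2.0 * x + rng.gauss(0.0, 0.1)
        rows.append((x, a, r))
    return rows


def naive_effect(rows):
    """Correlational estimate: mean reward under a=1 minus mean under a=0."""
    r1 = mean(r for x, a, r in rows if a == 1)
    r0 = mean(r for x, a, r in rows if a == 0)
    return r1 - r0


def adjusted_effect(rows):
    """Stratify on the confounder x, then average the per-stratum differences
    weighted by how often each value of x occurs."""
    n = len(rows)
    total = 0.0
    for x_val in (0, 1):
        stratum = [(a, r) for x, a, r in rows if x == x_val]
        r1 = mean(r for a, r in stratum if a == 1)
        r0 = mean(r for a, r in stratum if a == 0)
        total += (len(stratum) / n) * (r1 - r0)
    return total


if __name__ == "__main__":
    data = simulate_logged_data(20000)
    print("true effect:    ", TRUE_EFFECT)
    print("naive estimate: ", round(naive_effect(data), 3))
    print("adjusted:       ", round(adjusted_effect(data), 3))
```

In this toy setting the naive estimate overstates the effect of the action because action 1 is mostly logged when x = 1, where rewards are higher anyway, while the stratified estimate stays close to TRUE_EFFECT. Knowing which variables to adjust for is exactly the information a causal graph is meant to supply.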

© 2025 Japan Society for Fuzzy Theory and Intelligent Informatics