Host: The Japanese Society for Artificial Intelligence
Name: 34th Annual Conference, 2020
Number: 34
Location: Online
Date: June 09, 2020 - June 12, 2020
Humans exhibit a decision-making tendency called satisficing: they stop exploring once they find an option that exceeds a criterion (the aspiration level). The Risk-sensitive Satisficing (RS) model is a value function that enables efficient non-random exploration and realizes satisficing in reinforcement learning (Tamatsukuri & Takahashi, 2019). To apply RS to continuous state spaces, we extended it to Linear RS (LinRS) using linear function approximation and tested its performance on contextual bandit problems. LinRS outperformed existing algorithms in probabilistic environments. We also found that the aspiration level needs to be corrected to compensate for the approximation error.
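To make the idea concrete, the following is a minimal sketch of an RS-style satisficing rule combined with linear reward estimates in a contextual bandit. The class name, the reliability term (relative selection frequency), and the update rules are illustrative assumptions, not the authors' exact LinRS formulation; the key point is that each action's value is its reliability-weighted deviation from the aspiration level, so exploration stops once a satisfactory action is found.

```python
import numpy as np

class LinRSBandit:
    """Illustrative sketch of a satisficing (RS-style) contextual bandit
    with linear function approximation. Names and updates are assumptions
    for illustration, not the paper's exact algorithm."""

    def __init__(self, n_actions, dim, aleph=0.7, lr=0.1):
        self.aleph = aleph                        # aspiration level
        self.lr = lr                              # learning rate
        self.theta = np.zeros((n_actions, dim))   # linear reward estimates
        self.counts = np.ones(n_actions)          # pseudo-counts (reliability)

    def rs_values(self, x):
        est = self.theta @ x                      # estimated reward per action
        rho = self.counts / self.counts.sum()     # relative selection frequency
        # RS value: reliability-weighted deviation from the aspiration level.
        # Below-aspiration, rarely tried actions look relatively attractive;
        # once an action's estimate exceeds aleph, it is exploited.
        return rho * (est - self.aleph)

    def select(self, x):
        return int(np.argmax(self.rs_values(x)))

    def update(self, x, action, reward):
        err = reward - self.theta[action] @ x
        self.theta[action] += self.lr * err * x
        self.counts[action] += 1


if __name__ == "__main__":
    # Toy deterministic bandit: action 0 always pays 1, action 1 pays 0.
    agent = LinRSBandit(n_actions=2, dim=1, aleph=0.7, lr=0.1)
    x = np.array([1.0])
    for _ in range(200):
        a = agent.select(x)
        agent.update(x, a, 1.0 if a == 0 else 0.0)
    print(agent.select(x))  # settles on the satisfying action
```

Note how the rule first alternates between arms while both sit below the aspiration level, then locks onto action 0 once its estimate exceeds aleph; this is the non-random, criterion-driven exploration the RS model is designed for.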