Proceedings of the Annual Conference of JSAI, 32nd (2018)
Online ISSN: 2758-7347
Session ID: 1N1-04

Analysis of cognitive satisficing value function
Guaranteed satisficing and finite regret
*Akihiro TAMATSUKURI, Tatsuji TAKAHASHI

Abstract

As the domains of reinforcement learning become more complicated and realistic, standard optimization algorithms may not work well. In this paper we introduce a simple mathematical model called RS (reference satisficing) that implements a satisficing strategy: it looks for actions whose values exceed an aspiration level. We apply RS to K-armed bandit problems. We theoretically show that, if there are actions with values above the aspiration level, RS is guaranteed to find them. Furthermore, if the aspiration level is set to an "optimal level" so that satisficing practically amounts to optimizing, we prove that the regret (the expected loss) is upper bounded by a finite value. We confirm these results by simulations and clarify the effectiveness of RS through comparison with other algorithms.
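The RS value function itself is not given in this abstract. As a rough illustration of the behaviour it describes (prefer actions whose estimated values exceed an aspiration level, keep searching otherwise), the following Python sketch implements a generic aspiration-based satisficing policy for a K-armed bandit. The class name, the exploration rule, and all parameters are illustrative assumptions, not the authors' RS model.

import random

class SatisficingBandit:
    """Minimal aspiration-based satisficing agent for a K-armed bandit.

    An illustrative sketch of the general idea in the abstract, not the
    authors' exact RS value function.
    """

    def __init__(self, k, aspiration):
        self.k = k                      # number of arms
        self.aspiration = aspiration    # aspiration (satisficing) level
        self.counts = [0] * k           # pulls per arm
        self.means = [0.0] * k          # empirical mean reward per arm

    def select(self):
        # Exploit any arm whose empirical mean already satisfies the
        # aspiration level; among those, take the best-looking one.
        satisfactory = [i for i in range(self.k)
                        if self.counts[i] > 0 and self.means[i] >= self.aspiration]
        if satisfactory:
            return max(satisfactory, key=lambda i: self.means[i])
        # Otherwise keep searching: try the least-explored arm.
        return min(range(self.k), key=lambda i: self.counts[i])

    def update(self, arm, reward):
        # Incremental update of the empirical mean for the pulled arm.
        self.counts[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]


# Usage (hypothetical setup): Bernoulli bandit with arm means 0.2, 0.5, 0.8
# and aspiration level 0.7, so only the third arm is satisfactory.
if __name__ == "__main__":
    true_means = [0.2, 0.5, 0.8]
    agent = SatisficingBandit(k=3, aspiration=0.7)
    for _ in range(1000):
        arm = agent.select()
        reward = 1.0 if random.random() < true_means[arm] else 0.0
        agent.update(arm, reward)
    print(agent.counts, [round(m, 2) for m in agent.means])

In this sketch the agent stops exploring once some arm's empirical mean clears the aspiration level, which mirrors the "guaranteed satisficing" behaviour described above; it is not intended to reproduce the paper's regret bound.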

© 2018 The Japanese Society for Artificial Intelligence