Proceedings of the Annual Conference of JSAI
Online ISSN : 2758-7347
39th (2025)
Session ID : 3Win5-07

Dual Reinforcement Learning for Satisficing Levels in Target-Oriented Exploration
*Wataru NAKAMURA, Tatsuji TAKAHASHI, Yu KONO
CONFERENCE PROCEEDINGS FREE ACCESS

Abstract

When humans begin a new endeavor, they initially focus on acquiring basic skills and progressively advance to intermediate and advanced levels. In essence, the focus is on achieving a goal rather than optimizing from the outset. Based on this idea, we decompose reinforcement learning into two processes: goal-oriented exploration and stepwise goal adjustment. Our algorithm, Risk-sensitive Satisficing (RS), quickly achieves satisficing by minimizing a subjective regret defined by the goal. RS also dynamically optimizes the goal in bandit problems, matching Thompson Sampling performance without requiring prior knowledge. While this demonstrates the usefulness of decomposing reinforcement learning into two key elements, current RS goal adjustment methods remain limited to bandit problems. In this study, we propose a general goal adjustment algorithm based on reinforcement learning for motor control. By integrating two simple reinforcement learning processes, rapid goal attainment and one-dimensional goal optimization, we successfully operationalize the concept of a goal.
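
The abstract does not state the RS selection rule itself. As a minimal sketch of goal-oriented exploration in a bandit setting, the Python below assumes the formulation commonly used in earlier RS work, in which each arm is scored by (n_i / N) * (E_i - aspiration), where n_i is the arm's pull count, N the total pull count, E_i the arm's mean reward, and the aspiration level plays the role of the goal. The function names `rs_select` and `run_rs_bandit`, the Bernoulli reward model, and the fixed aspiration level are illustrative assumptions; the goal adjustment method proposed in this paper is not shown.

```python
import random


def rs_select(counts, means, aspiration):
    """Pick the arm with the highest assumed RS value (n_i / N) * (E_i - aspiration)."""
    # Try every arm once first so that each mean estimate is defined.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    total = sum(counts)
    values = [(n / total) * (m - aspiration) for n, m in zip(counts, means)]
    # Ties are broken in favor of the lowest arm index.
    return max(range(len(values)), key=values.__getitem__)


def run_rs_bandit(probs, aspiration, steps, seed=0):
    """Run the assumed RS rule on a Bernoulli bandit (hypothetical test bed)."""
    rng = random.Random(seed)
    counts = [0] * len(probs)
    means = [0.0] * len(probs)
    for _ in range(steps):
        arm = rs_select(counts, means, aspiration)
        reward = 1.0 if rng.random() < probs[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the arm's mean reward.
        means[arm] += (reward - means[arm]) / counts[arm]
    return counts, means
```

Under this rule, an arm whose mean exceeds the aspiration level is reinforced in proportion to how often it has been tried, so the agent settles on it quickly. When every arm falls short of the aspiration level, all values are negative and the least-tried arms score highest, which drives exploration.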

© 2025 The Japanese Society for Artificial Intelligence