Host : The Japanese Society for Artificial Intelligence
Name : The 39th Annual Conference of the Japanese Society for Artificial Intelligence
Number : 39
Location : [in Japanese]
Date : May 27, 2025 - May 30, 2025
When humans take up a new endeavor, they first acquire basic skills and then advance step by step to intermediate and advanced levels. In other words, they focus on reaching a goal rather than optimizing from the outset. Based on this idea, we decompose reinforcement learning into two processes: goal-oriented exploration and stepwise goal adjustment. Our algorithm, Risk-sensitive Satisficing (RS), quickly achieves satisficing by minimizing a subjective regret defined by the goal. In bandit problems, RS also optimizes the goal dynamically and matches the performance of Thompson Sampling without requiring prior knowledge. Although this demonstrates the usefulness of decomposing reinforcement learning into these two elements, current goal adjustment methods for RS remain limited to bandit problems. In this study, we propose a general goal adjustment algorithm based on reinforcement learning for motor control. By integrating two simple reinforcement learning processes, rapid goal attainment and one-dimensional goal optimization, we successfully operationalize the concept of a goal.
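The goal-oriented exploration half of this decomposition can be illustrated with a minimal sketch on a Bernoulli bandit. The abstract does not give the RS formula, so the sketch assumes one commonly described form, in which each arm is scored by its share of trials multiplied by the gap between its mean reward and a fixed goal (aspiration level). The names `rs_values`, `run_rs_bandit`, `aspiration`, and `probs` are hypothetical, and the goal is held fixed here because the abstract does not specify how goal adjustment is performed.

```python
import random


def rs_values(counts, means, aspiration):
    """Score each arm as (share of trials) * (mean reward - aspiration).

    Assumed form of the RS value; not taken from the abstract.
    """
    total = sum(counts)
    if total == 0:
        # No data yet: all arms are tied.
        return [0.0] * len(counts)
    return [(n / total) * (m - aspiration) for n, m in zip(counts, means)]


def run_rs_bandit(probs, aspiration, steps, seed=0):
    """Run greedy selection on the assumed RS value over a Bernoulli bandit."""
    rng = random.Random(seed)
    k = len(probs)
    counts = [0] * k   # number of pulls per arm
    means = [0.0] * k  # running mean reward per arm
    for _ in range(steps):
        values = rs_values(counts, means, aspiration)
        best = max(values)
        # Break ties uniformly at random among the top-scoring arms.
        arm = rng.choice([i for i, v in enumerate(values) if v == best])
        reward = 1.0 if rng.random() < probs[arm] else 0.0
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]  # incremental mean
    return counts, means


if __name__ == "__main__":
    # Hypothetical arm probabilities; the goal sits between the two best arms.
    counts, means = run_rs_bandit([0.2, 0.5, 0.8], aspiration=0.65, steps=1000)
    print(counts, [round(m, 3) for m in means])
```

Under this assumed form, an arm whose mean falls below the goal scores lower the more often it has been tried, which pushes selection toward less-tried arms, while an arm whose mean exceeds the goal scores higher the more it is tried, so the agent settles on it once the goal is met.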