Host: Japan Society for Fuzzy Theory and Intelligent Informatics (SOFT)
Name : 41st Fuzzy System Symposium
Number : 41
Location : [in Japanese]
Date : September 03, 2025 - September 05, 2025
Collaborative reinforcement learning by multiple agents is effective in improving learning efficiency. However, when the agents are learning in several different environments, a different optimal policy must be explored for each environment. In our previous research, the effectiveness of a switching model was demonstrated for bandit problems, in which agent clustering and cluster-wise Q-learning are performed simultaneously. This research extends the previous model to handle a state-action Q-table and demonstrates its advantages.
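As a rough illustration of what cluster-wise Q-learning on a state-action Q-table could look like, the following minimal sketch keeps one shared table per cluster and lets an agent switch to the cluster whose table best explains its recent experience. The function names (`q_update`, `assign_cluster`), the learning rate and discount values, and in particular the squared TD-error switching criterion are assumptions made for this illustration; the abstract does not specify the switching rule used by the authors.

```python
import numpy as np


def q_update(q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update, applied in place.

    q is a (n_states, n_actions) array shared by all agents in one cluster.
    """
    td_target = r + gamma * np.max(q[s_next])
    q[s, a] += alpha * (td_target - q[s, a])
    return q


def assign_cluster(cluster_tables, transitions, gamma=0.9):
    """Return the index of the cluster table with the smallest mean squared
    TD error on an agent's recent transitions (s, a, r, s_next).

    This switching criterion is an assumption for illustration only.
    """
    errors = []
    for q in cluster_tables:
        td = [(r + gamma * np.max(q[s2]) - q[s, a]) ** 2
              for s, a, r, s2 in transitions]
        errors.append(np.mean(td))
    return int(np.argmin(errors))


# Two hypothetical clusters, each with a 2-state, 2-action Q-table.
tables = [np.zeros((2, 2)), np.zeros((2, 2))]
tables[1][0, 1] = 1.0  # cluster 1 already values action 1 in state 0

# One agent's recent experience: action 1 in state 0 gave reward 1.0.
recent = [(0, 1, 1.0, 1)]
k = assign_cluster(tables, recent)
q_update(tables[k], *recent[0])
```

In this sketch, agents assigned to the same cluster all write to the same table, so experience is pooled only among agents that appear to face the same environment.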