This paper presents a novel model of reinforcement learning (RL) agents. The key feature of our model is the integration of the Analytic Hierarchy Process (AHP) into the standard RL agent model, which consists of three modules: state recognition, learning, and action selection. In our model, the AHP module is designed around {\it primary knowledge}, the knowledge that humans intrinsically bring to bear while working toward a goal state. This integration aims to make exploratory actions more promising, rather than completely random as in standard RL algorithms.
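To make this integration concrete, the following minimal sketch illustrates how an AHP-derived preference weight could bias softmax action selection over learned values; it is an assumption-laden illustration, not the paper's actual implementation, and names such as \verb|AHPBiasedAgent| and \verb|ahp_weights| are hypothetical.

\begin{verbatim}
import math
import random

class AHPBiasedAgent:
    """Illustrative three-module RL agent: state recognition,
    learning, and action selection, with an AHP prior.
    All names here are hypothetical sketches, not the paper's API."""

    def __init__(self, actions, ahp_weights, temperature=1.0):
        self.actions = actions
        # ahp_weights[(state, action)]: preference derived offline from
        # an AHP pairwise-comparison over "primary knowledge" criteria.
        self.ahp_weights = ahp_weights
        self.values = {}            # learned (state, action) weights
        self.temperature = temperature

    def recognize(self, observation):
        # State recognition module: map a raw observation to a state key.
        return tuple(observation)

    def select_action(self, state):
        # Action selection module: softmax over learned value plus the
        # AHP prior, so exploration favors promising actions instead of
        # being completely random.
        scores = [self.values.get((state, a), 0.0)
                  + self.ahp_weights.get((state, a), 0.0)
                  for a in self.actions]
        m = max(scores)
        exps = [math.exp((s - m) / self.temperature) for s in scores]
        r, acc = random.random() * sum(exps), 0.0
        for action, e in zip(self.actions, exps):
            acc += e
            if r <= acc:
                return action
        return self.actions[-1]
\end{verbatim}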
Profit Sharing (PS) is adopted as the RL method for our model, since PS is known to be useful even in multi-agent environments. To evaluate our approach in such an environment, we test a PS-based RL method with our agent model on a pursuit problem in a grid world. Computational results show that our approach outperforms standard PS in learning speed during the earlier stages of learning. We also show that the learning performance of our approach is superior, or at least comparable, to that of standard PS in the final stages of learning.
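For reference, the core PS update can be sketched as follows: the reward obtained at the goal is shared backwards along the episode trace according to a credit function. A geometrically decreasing credit is a common choice (ratios below $1/L$, for $L$ available actions, are often cited as satisfying the PS rationality condition); the function name and constants below are illustrative assumptions.

\begin{verbatim}
def profit_sharing_update(values, episode, reward, decay):
    """Profit Sharing credit assignment (illustrative sketch).

    values  -- dict mapping (state, action) to a learned weight
    episode -- list of (state, action) pairs, oldest first
    reward  -- scalar reward obtained on reaching the goal
    decay   -- geometric credit ratio; decay < 1/|actions| is a
               common choice tied to the PS rationality condition
    """
    credit = reward
    # Walk the trace backwards from the goal, sharing a
    # geometrically shrinking portion of the reward.
    for state, action in reversed(episode):
        values[(state, action)] = values.get((state, action), 0.0) + credit
        credit *= decay

# Example use after one episode (trace and reward are hypothetical):
# profit_sharing_update(agent.values, trace, reward=100.0, decay=0.2)
\end{verbatim}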
By merging the characteristics of KODAMA and VPC functions, we propose a new framework for ubiquitous computing environments. It provides distributed management based on the concept of agent communities, agent communication abstracted from the physical environment, and agent collaboration through policy packages.
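As a purely conceptual illustration of these three ingredients (none of the names below come from the actual KODAMA or VPC APIs), a community can be sketched as an object that manages its member agents and mediates their messages under a policy package:

\begin{verbatim}
class PolicyPackage:
    """Hypothetical container for community-level rules;
    a conceptual sketch only, not the KODAMA/VPC interface."""

    def __init__(self, rules):
        self.rules = rules  # callables: (sender, message) -> bool

    def permits(self, sender, message):
        return all(rule(sender, message) for rule in self.rules)

class AgentCommunity:
    """A community manages its member agents and mediates their
    communication, abstracting away the physical environment."""

    def __init__(self, name, policy):
        self.name = name
        self.policy = policy
        self.members = {}  # agent_id -> message handler callable

    def join(self, agent_id, handler):
        self.members[agent_id] = handler

    def send(self, sender, recipient, message):
        # Collaboration is governed by the community's policy package.
        if self.policy.permits(sender, message):
            self.members[recipient](sender, message)
\end{verbatim}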
Using this framework, we conducted a large-scale experiment in shopping malls in Nagoya, in which advertisement e-mails were sent to users' cellular phones according to their location and attributes. The empirical results showed that the framework effectively supported sales in the shopping malls.