Reinforcement Learning (RL) addresses policy search problems, that is, searching for a mapping from the state space to the action space. However, RL is based on gradient methods and therefore cannot deal with problems that have a multimodal landscape. In contrast, although Genetic Algorithms (GAs) are promising for such problems, they seem unsuitable for policy search because of the cost of evaluation. Minimal Generation Gap (MGG), a generation-alternation model used in GAs, generates many offspring from two or more parents selected from a population. Evaluating the policies of the generated offspring therefore requires a large amount of trial and error (i.e., interaction between an agent and an environment). In this paper, we incorporate importance sampling into the MGG framework in order to reduce the cost of evaluation in policy search. The proposed techniques are applied to a Markov Decision Process (MDP) with a multimodal landscape. The experimental results show that these techniques reduce the number of interactions between an agent and an environment, and indicate that MGG and importance sampling complement each other.
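The way importance sampling can reduce the number of interactions needed to evaluate offspring policies can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes tabular stochastic policies, episodes recorded as (state, action) steps with a total return, and a self-normalized (weighted) estimator. The names `importance_weight` and `estimate_return` are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

# Assumed representation: a policy maps each state to a list of action probabilities.
Policy = Dict[int, Sequence[float]]
# Assumed representation: an episode is its (state, action) steps plus its total return.
Episode = Tuple[Sequence[Tuple[int, int]], float]


def importance_weight(steps: Sequence[Tuple[int, int]],
                      target: Policy, behavior: Policy) -> float:
    """Product over the episode of target/behavior action probabilities."""
    weight = 1.0
    for state, action in steps:
        weight *= target[state][action] / behavior[state][action]
    return weight


def estimate_return(episodes: List[Episode],
                    target: Policy, behavior: Policy) -> float:
    """Weighted importance sampling estimate of the target policy's expected
    return, reusing episodes that were collected under the behavior policy."""
    weights = [importance_weight(steps, target, behavior) for steps, _ in episodes]
    total = sum(weights)
    if total == 0.0:
        return 0.0  # no recorded episode is possible under the target policy
    return sum(w * ret for w, (_, ret) in zip(weights, episodes)) / total


# Episodes gathered once under a parent (behavior) policy ...
parent = {0: [0.5, 0.5]}
episodes = [([(0, 0)], 1.0), ([(0, 1)], 0.0)]
# ... are reused to score an offspring (target) policy without new interaction.
offspring = {0: [1.0, 0.0]}
print(estimate_return(episodes, offspring, parent))  # prints 1.0
```

In this sketch the offspring is scored purely from the parent's recorded episodes, which is the kind of saving in agent-environment interaction that the abstract describes. The estimate is only meaningful when the behavior policy assigns nonzero probability to every action the target policy can take.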
In this paper, we report the development of a design knowledge management system called DDMS (Design Documentation Management System). By composing design documents during design, DDMS encourages designers to externalize their knowledge and facilitates the sharing and reuse of this externalized design knowledge in later stages. DDMS works as a front end to KIEF (Knowledge Intensive Engineering Framework), which we have been developing for years. DDMS guides designers with design process knowledge based on a model of synthesis, in combination with KIEF, which can integrate multiple design object models and maintain consistency among them. DDMS automatically generates design documents by analyzing design log data based on the model of synthesis. We also present an example of laser lithography design to demonstrate the features of DDMS.
Scheduling has been an important research field in Artificial Intelligence. Because typical scheduling problems can be modeled as a Constraint Satisfaction Problem (CSP), several constraint satisfaction techniques have been proposed. To handle constraints with different levels of importance, solving a problem as a Weighted Maximal Constraint Satisfaction Problem (W-MaxCSP) is a promising approach. However, there are cases where unexpected events occur and sudden changes are required, i.e., cases with dynamic changes in the scheduling problem. In this paper, we describe such a dynamic scheduling problem as a Dynamic Weighted Maximal Constraint Satisfaction Problem (DW-MaxCSP), in which constraints change dynamically. Generally, it is undesirable to produce a vastly modified schedule even when re-scheduling is needed; a new schedule should be as close as possible to the current one. To obtain stable solutions, we propose a method that maintains portions of the current schedule using provisional soft constraints, which explicitly penalize changes from the current schedule. We have experimentally confirmed the efficacy of re-scheduling based on our method with provisional constraints. We also construct a nurse scheduling system to which the proposed scheduling method is applied.
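The idea of provisional soft constraints can be sketched in Python as follows. This is an illustrative toy under stated assumptions, not the authors' system: it uses exhaustive search over a three-day, two-nurse problem, and the weights, the constraint set, and the names `violation_cost`, `solve`, and `provisional_constraints` are all hypothetical.

```python
from itertools import product
from typing import Callable, Dict, List, Tuple

Assignment = Dict[str, str]
# A weighted constraint: (penalty if violated, predicate that is True when satisfied).
Constraint = Tuple[float, Callable[[Assignment], bool]]


def violation_cost(assignment: Assignment, constraints: List[Constraint]) -> float:
    """Total weight of the constraints that the assignment violates."""
    return sum(weight for weight, ok in constraints if not ok(assignment))


def solve(domains: Dict[str, List[str]],
          constraints: List[Constraint]) -> Tuple[Assignment, float]:
    """Brute-force W-MaxCSP: return the assignment with minimal violated weight."""
    names = list(domains)
    best: Assignment = {}
    best_cost = float("inf")
    for values in product(*(domains[n] for n in names)):
        candidate = dict(zip(names, values))
        cost = violation_cost(candidate, constraints)
        if cost < best_cost:
            best, best_cost = candidate, cost
    return best, best_cost


def provisional_constraints(current: Assignment, weight: float) -> List[Constraint]:
    """One soft unary constraint per variable that penalizes changing its value."""
    return [(weight, lambda a, v=var, x=val: a[v] == x)
            for var, val in current.items()]


domains = {day: ["ann", "bob"] for day in ("mon", "tue", "wed")}
current = {"mon": "ann", "tue": "bob", "wed": "ann"}
constraints: List[Constraint] = [
    (3.0, lambda a: a["mon"] != a["tue"]),  # avoid consecutive shifts
    (3.0, lambda a: a["tue"] != a["wed"]),
    (10.0, lambda a: a["wed"] != "ann"),    # new event: ann unavailable on wed
]

print(solve(domains, constraints))
print(solve(domains, constraints + provisional_constraints(current, 2.0)))
```

Without the provisional constraints, the minimal-cost schedule changes all three days. With them, the solver keeps Monday and Tuesday as they were and changes only Wednesday, accepting one violated soft constraint in exchange for stability. The provisional weight controls that trade-off.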
We propose a multi-agent system that learns intervention policies and evaluates the effect of interventions in an artificial foreign exchange market. Izumi et al. presented a system called AGEDASI TOF for simulating an artificial market, together with a support system that helps the government decide foreign exchange policies. However, that system required the amount of governmental intervention to be fixed prior to the simulation, which was not realistic. In addition, the interventions in the system did not affect the supply of and demand for currencies, so the effect of intervention could not be discussed correctly. First, we improve the system so that it places more emphasis on the weights of influential factors. We then introduce an intervention agent that plays the role of a central bank stabilizing the market. We show that the agent learned effective intervention policies through reinforcement learning, and that the exchange rate converged, to a certain extent, within the expected range. We could also estimate the amount of intervention, which shows the efficacy of signaling. In this model, in order to investigate aliasing in the perception of the intervention agent, we introduced a pseudo-agent that is assumed to be able to observe all the behaviors of the dealer agents; using this super-agent, we discuss the adequate granularity for a market state description.
In this paper, we present a Web-based video annotation system named iVAS (intelligent Video Annotation Server). Audience members can associate annotations with any video content on the Internet. The system analyzes video content to acquire cut/shot information and color histograms, and automatically generates a Web page for editing annotations. Users can then create annotation data by two methods. The first helps users interactively create text data such as person/object names, scene descriptions, and comments. The second lets users associate any video fragment with their subjective impression simply by clicking a mouse button. The generated annotation data are accumulated and managed in an XML database connected to iVAS. We also developed several application systems based on these annotations, such as video retrieval, video simplification, and video-content-based community support. One of the major advantages of our approach is the easy integration of hand-coded annotations with automatically generated ones (such as color histograms and cut/shot information). Additionally, since our annotation system is open to the public, we must consider the reliability and correctness of the annotation data. We therefore also developed a method that automatically evaluates annotation reliability using the users' feedback. In the future, these fundamental technologies will contribute to the formation of new communities centered around video content.
This paper proposes an aggregation pheromone system (APS) for solving real-parameter optimization problems using the collective behavior of individuals that communicate through aggregation pheromones. APS was tested on several test functions used in evolutionary computation. The results showed that APS could solve real-parameter optimization problems fairly well. A sensitivity analysis of the control parameters of APS is also presented.