Reinforcement learning (RL) methods are promising because they can learn useful behavior by trial and error from rewards given by the environment. This paper tackles a harder problem than those addressed by most conventional RL methods: RL in partially observable Markov decision process (POMDP) environments with multiple rewards.
Evolutionary algorithms (EAs) are optimization methods based on the concept of natural evolution. Recently, there has been growing interest in applying estimation of distribution techniques to EAs, yielding estimation of distribution algorithms (EDAs). Although the probabilistic context-free grammar (PCFG) is a widely used model in EDAs for program evolution, it cannot estimate building blocks from promising solutions because it relies on the context-freedom assumption. We propose a new program evolution algorithm based on PCFG with latent annotations, which weaken the context-freedom assumption. Computational experiments on two problems (the royal tree problem and the DMAX problem) demonstrate that our approach is highly effective compared with prior approaches, including conventional genetic programming (GP).
In particle swarm optimization (PSO) algorithms, there is a delicate balance to maintain between exploitation (local search) and exploration (global search). When facing multimodal functions, the standard PSO algorithm often converges quickly to a local minimum, missing better solutions elsewhere. Methods such as non-global-best neighborhoods increase exploration, but at the expense of slowing the convergence of the whole PSO algorithm. In this paper, we propose velocity-based reinitialization (VBR), a new method for extending PSO. VBR is both simple to implement and effective at enhancing many different PSO algorithms from the literature. In VBR-PSO, the velocities of the particles are monitored throughout the evolution, and when the median velocity of the swarm particles drops below a threshold, the whole swarm is reinitialized. Through VBR, the problem of premature convergence is alleviated; VBR-PSO focuses on one minimum at a time. In our experiments, we apply VBR to the global-best, local-best, and von Neumann neighborhood PSO algorithms. Results are presented using the standard benchmark functions from the PSO literature. VBR-enhanced PSO yields improved results on the multimodal benchmark functions for all PSO algorithms investigated in this study.
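The reinitialization rule described in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the inertia and acceleration coefficients, the threshold value, the use of the Euclidean norm as a particle's "velocity", the sphere test function, and the decision to remember the best solution across restarts are all choices made here for illustration, and every function name is hypothetical.

```python
import math
import random
import statistics


def sphere(x):
    """Toy objective (an assumption; not a benchmark named in the text)."""
    return sum(v * v for v in x)


def init_swarm(rng, n_particles, dim, lo, hi):
    """Draw fresh positions and velocities uniformly at random."""
    span = hi - lo
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[rng.uniform(-span, span) for _ in range(dim)] for _ in range(n_particles)]
    return pos, vel


def vbr_pso(f, dim=2, n_particles=20, iters=500, lo=-5.0, hi=5.0,
            threshold=1e-3, w=0.72, c1=1.49, c2=1.49, seed=0):
    """Global-best PSO with velocity-based reinitialization (sketch).

    All default parameter values are assumptions, not values from the paper.
    Returns (best position, best value, number of reinitializations).
    """
    rng = random.Random(seed)

    def fresh_state():
        pos, vel = init_swarm(rng, n_particles, dim, lo, hi)
        pbest = [p[:] for p in pos]
        pbest_val = [f(p) for p in pos]
        g = min(range(n_particles), key=pbest_val.__getitem__)
        return pos, vel, pbest, pbest_val, pbest[g][:], pbest_val[g]

    pos, vel, pbest, pbest_val, gbest, gbest_val = fresh_state()
    best, best_val = gbest[:], gbest_val  # best over all restarts
    restarts = 0

    for _ in range(iters):
        # Standard global-best PSO update.
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
        if gbest_val < best_val:
            best, best_val = gbest[:], gbest_val

        # VBR step: reinitialize the whole swarm once the median speed
        # (Euclidean norm of each velocity vector) falls below the threshold.
        speeds = [math.sqrt(sum(c * c for c in v)) for v in vel]
        if statistics.median(speeds) < threshold:
            pos, vel, pbest, pbest_val, gbest, gbest_val = fresh_state()
            restarts += 1

    if gbest_val < best_val:  # account for a restart on the final iteration
        best, best_val = gbest[:], gbest_val
    return best, best_val, restarts


if __name__ == "__main__":
    print(vbr_pso(sphere))
```

Keeping `best` outside the swarm state is one way to let each restart concentrate on a single minimum without losing earlier results; other neighborhood topologies would only change how `gbest` is chosen for each particle.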
In recent materials research, much work aims to realize ``functional materials'' by changing structure and/or manufacturing process through nanotechnology. However, knowledge about the relationships among function, structure, and manufacturing process is not well organized, so material designers have to consider many factors at the same time. A computer system that supports their design process would therefore be very helpful. In this article, we discuss a conceptual design support system for nano-materials. First, we consider a framework for representing the functional structures and manufacturing processes of nano-materials together with the relationships among them. We extend our earlier framework for representing functional knowledge, based on an investigation carried out through discussions with experts in nano-materials. The extended framework has two features: 1) it represents functional structures and manufacturing processes comprehensively, and 2) it expresses the parameters of functions and ways, together with their dependencies, because these are important for material design. Next, we describe the conceptual design support system that we developed on the basis of this framework, along with its functionalities. Lastly, we evaluate the utility of our system in terms of its functionality for design support. For this purpose, we attempted to represent two real examples of material design, and then conducted an evaluation experiment on the conceptual design of materials using our system in collaboration with domain experts.
This paper proposes an online Sparse Bayesian Learning (SBL) algorithm for modeling nonstationary data sources. Although most learning algorithms implicitly assume that a data source does not change over time (i.e., is stationary), real-world data sources usually do change (i.e., are nonstationary) because of factors such as dynamically changing environments, device degradation, and sudden failures. The proposed algorithm can be used for stationary online SBL by setting its time decay parameters to zero, and as such it can be interpreted as a single unified framework for online SBL with both stationary and nonstationary data sources. Tests on four types of benchmark problems and on actual stock price data show that it performs well.
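The role of a time decay parameter can be illustrated with a much simpler model than the one proposed. The sketch below is not SBL and not the paper's algorithm: it is a one-weight Bayesian linear regression with an assumed known noise precision, whose sufficient statistics are discounted by `exp(-decay)` at every step. The class name and the parameters `alpha`, `beta`, and `decay` are hypothetical. Setting `decay` to zero removes the discount, which corresponds to the stationary case.

```python
import math


class DecayedBayesLinReg:
    """One-weight Bayesian linear regression y ~ w * x with exponential forgetting.

    Illustrative stand-in only: not SBL and not the paper's algorithm.
    alpha: prior precision of w (prior mean 0); beta: noise precision.
    decay: time decay rate; 0 means no forgetting (stationary case).
    """

    def __init__(self, alpha=1.0, beta=1.0, decay=0.0):
        self.alpha = alpha
        self.beta = beta
        self.keep = math.exp(-decay)  # per-step discount; 1.0 when decay == 0
        self.sxx = 0.0                # discounted sum of x * x
        self.sxy = 0.0                # discounted sum of x * y

    def update(self, x, y):
        """Discount the old statistics, then add the new observation."""
        self.sxx = self.keep * self.sxx + x * x
        self.sxy = self.keep * self.sxy + x * y

    @property
    def mean(self):
        """Posterior mean of w given the (discounted) statistics."""
        return self.beta * self.sxy / (self.alpha + self.beta * self.sxx)

    @property
    def variance(self):
        """Posterior variance of w given the (discounted) statistics."""
        return 1.0 / (self.alpha + self.beta * self.sxx)


def track(decay, n=200):
    """Feed a noiseless stream whose true slope jumps from 1 to 3 halfway."""
    model = DecayedBayesLinReg(decay=decay)
    for t in range(n):
        slope = 1.0 if t < n // 2 else 3.0
        x = 1.0 + (t % 5)  # deterministic inputs cycling through 1..5
        model.update(x, slope * x)
    return model.mean


if __name__ == "__main__":
    print("decay=0.0:", track(0.0))
    print("decay=0.1:", track(0.1))
```

With `decay=0.0` the estimate settles between the two slopes, because old and new data are weighted equally; with a positive decay it follows the most recent slope, at the cost of a larger posterior variance.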