Microwave heating has gradually been extended from domestic microwave ovens to industrial materials processing because of its substantial advantages, such as high efficiency, freedom from pollution, and selective heating. Unfortunately, temperature non-uniformity, which may cause thermal runaway, remains an obstacle to the development of microwave energy. In addition, a common problem in microwave heating systems is that microwave power is transmitted on a faster time scale than the temperature detection period. Thus, to ensure global temperature uniformity and to improve the system's adaptivity to deviations in the temperature detection position in a microwave heating system with input constraints, a multi-rate simple adaptive multi-point temperature control strategy based on almost strictly positive real conditions is proposed. Multi-rate sampling and the lifting technique are used to handle the case in which the system has fewer inputs than outputs. Finally, simulation results demonstrate the effectiveness of the proposed control strategy.
In this paper, we propose a design method for self-interference cancelers in in-band full-duplex wireless relaying that takes baseband signal subspaces into account. We model the relaying system with self-interference as a sampled-data feedback control system. We then formulate the design problem as a sampled-data H∞ control problem with a generalized sampler and a generalized hold. By explicitly considering the subspace spanned by the baseband signals, the problem can be reduced to a discrete-time ℓ2-induced norm optimization problem. Moreover, for implementation, we also adopt ideal uniform samplers and zero-order holds with digital filters and up/down samplers. Under these implementation constraints, we reformulate the problem as a standard discrete-time H∞ control problem by using the discrete-time lifting technique. Simulation results illustrate the effectiveness of the proposed method.
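As a rough illustration of the discrete-time lifting idea mentioned above, the following sketch groups a scalar sequence into blocks of n consecutive samples, so that a fast-rate signal can be treated as a slow-rate vector-valued signal. The function names lift and unlift and the block length are hypothetical choices for this example only; the abstract does not specify the authors' formulation, and lifting of the system matrices is not shown.

```python
def lift(signal, n):
    """Group a scalar sequence into consecutive blocks of n samples.

    Block k holds (signal[k*n], ..., signal[k*n + n - 1]), i.e. one
    vector-valued sample at the slow rate. Assumes len(signal) is a
    multiple of n (an assumption made for this sketch).
    """
    if n <= 0 or len(signal) % n != 0:
        raise ValueError("signal length must be a positive multiple of n")
    return [tuple(signal[i:i + n]) for i in range(0, len(signal), n)]


def unlift(blocks):
    """Inverse of lift: flatten the blocks back into a scalar sequence."""
    return [sample for block in blocks for sample in block]


fast_rate = [1, 2, 3, 4, 5, 6]
slow_rate = lift(fast_rate, 3)
print(slow_rate)
print(unlift(slow_rate) == fast_rate)
```

Because lifting only regroups samples, it is invertible and preserves the ℓ2 norm of the signal, which is one reason it is a convenient tool for turning multi-rate problems into single-rate ones.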
Profit sharing (PS) is a well-known reinforcement learning method. In a PS method, a reward is generally distributed according to a geometrically decreasing function, and the common ratio of this function is called the discount rate. A large discount rate increases the learning speed, but a non-optimal policy may be learned. On the other hand, a small discount rate improves the performance of the policy, but learning may not proceed smoothly because of the shallow learning depth. In this paper, to cope with these problems, we propose a method that reinforces the detour path and the non-detour path with different discount rates. Finally, the method is applied to a maze problem and an altruistic multi-agent environment to confirm its effectiveness.
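The geometric reward distribution described above can be sketched as follows. This is a minimal illustration of generic PS-style credit assignment, not the proposed two-rate method: the function name distribute_reward, the weight table, and the parameter ratio (the common ratio of the geometric sequence) are assumptions made for this example, and the abstract does not state how its discount rate maps onto such a parameter.

```python
from collections import defaultdict


def distribute_reward(episode, reward, ratio, weights=None):
    """Credit each (state, action) pair of an episode with a share of the reward.

    The pair that directly led to the reward receives the full reward;
    each earlier pair receives the previous share multiplied by `ratio`,
    so the shares form a geometrically decreasing sequence.
    """
    if weights is None:
        weights = defaultdict(float)
    credit = reward
    # Walk backwards from the rewarded step towards the start of the episode.
    for state, action in reversed(episode):
        weights[(state, action)] += credit
        credit *= ratio
    return weights


# Hypothetical three-step episode in a maze, rewarded at the last step.
episode = [("s0", "right"), ("s1", "right"), ("s2", "up")]
weights = distribute_reward(episode, reward=1.0, ratio=0.5)
for pair in episode:
    print(pair, weights[pair])
```

With ratio=0.5 the shares are 1.0, 0.5, and 0.25 from the last step back to the first, which shows how the common ratio controls how far back along the path a reward is propagated.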
This paper proposes a communication-free multi-agent reinforcement learning method for dynamic environments, called profit minimizing reinforcement learning with oblivion of memory (PMRL-OM). PMRL-OM extends PMRL by defining a memory range so that only valuable information from the environment is used. Since agents do not need information observed before an environmental change, they use only the information acquired after a certain iteration, as determined by the memory range. In addition, PMRL-OM improves the update function for the goal value, which represents the priority of a purpose, and updates the goal value based on newer information. To evaluate the effectiveness of PMRL-OM, this study compares PMRL-OM with PMRL in five dynamic maze environments: state changes for two types of cooperation, position changes for two types of cooperation, and a case combining these four. The experimental results revealed that (a) PMRL-OM was effective for cooperation in all five dynamic environments examined in this study; (b) PMRL-OM was more effective than PMRL in these dynamic environments; and (c) PMRL-OM performed well with a memory range of 100 to 500.
Pole-zero cancellation is a well-known and important concept for linear time-invariant systems. In contrast, transfer functions, and hence poles and zeros, are not defined for linear time-varying systems. In this paper, we attempt to generalize the concept of pole-zero cancellation to linear time-varying systems. We first introduce a new concept, the extended transfer function, for linear time-varying systems, defined in the time domain instead of the frequency domain. We then propose a computational procedure for pole-zero cancellation in linear time-varying systems. Finally, we discuss the meaning of the proposed computational procedure despite the absence of poles and zeros in linear time-varying systems. The proposed concept and computational procedure are illustrated by a numerical example.
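For background, the familiar time-invariant case that the paper sets out to generalize can be illustrated with a small worked example. The transfer function below is chosen purely for illustration and is not taken from the paper; the extended transfer function for time-varying systems is not reproduced here.

```latex
% Illustrative LTI example (assumed, not from the paper):
% the zero at s = -1 cancels the pole at s = -1.
\[
  G(s) \;=\; \frac{s + 1}{(s + 1)(s + 2)} \;=\; \frac{1}{s + 2}.
\]
% After cancellation the input-output behavior is first order,
% with a single remaining pole at s = -2.
```

In this example the second-order description and the first-order one have the same input-output behavior, since the mode at s = -1 does not appear in the transfer function after cancellation.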