2022 Volume 47 Issue 301 Pages 27-36
Earth-to-air heat exchangers (EAHEs) utilize the heat capacity of soil to pre-cool or pre-heat outside air. However, condensation can accumulate within an EAHE and can potentially lead to air pollution, so it is necessary to optimize the control of an EAHE during its operational phase. In this study, we focus on reinforcement learning (RL) based control. In our previous study, we proposed a Deep Q-Network (DQN) based control method and showed its effectiveness as an operational control method for EAHEs. DQN is a value-based algorithm that learns the value function Q and derives a policy from it. In contrast, a policy-based algorithm learns the policy directly, without learning a value function. However, operational control of an EAHE using a policy-based algorithm has not yet been examined. The purpose of this study is to establish optimal control rules for an EAHE using proximal policy optimization (PPO), a policy-based RL algorithm. First, we define the RL control problem using an environment estimated by a CFD-based long-term performance prediction method for the EAHE. Then, we implement PPO. We verify the effectiveness of PPO by comparing it with random control and DDQN. The following results were obtained. 1) The number of learning iterations required to converge was about 200 for PPO and 150 for DDQN. At the end of training, the sum of rewards was about -2,000 for DDQN and -1,500 for PPO. 2) Compared with random control and DDQN, PPO achieved the highest control performance in terms of both energy saving and condensation control.
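For reference, the clipped surrogate objective generally maximized in PPO (following Schulman et al., 2017) is sketched below; this shows the standard form of the algorithm, and the specific network architecture, reward definition, and clipping parameter used in this study are not stated in the abstract.

\[
L^{\mathrm{CLIP}}(\theta) = \hat{\mathbb{E}}_t\!\left[\min\!\left(r_t(\theta)\,\hat{A}_t,\ \operatorname{clip}\!\left(r_t(\theta),\,1-\epsilon,\,1+\epsilon\right)\hat{A}_t\right)\right],
\qquad
r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{\mathrm{old}}}(a_t \mid s_t)},
\]

where \(\pi_\theta\) is the policy being learned, \(\hat{A}_t\) is an advantage estimate, and \(\epsilon\) is the clipping parameter; unlike DQN, the policy parameters \(\theta\) are updated directly from this objective rather than being derived from a learned value function Q.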