2018, Volume 2018, Issue AGI-009, p. 02-
In reinforcement learning, high-dimensional multi-objective decision problems are difficult to solve. In this paper, we propose a novel hierarchical architecture for deep reinforcement learning with an Accumulator Based Arbitration Mode (ABAM). To solve such problems, the architecture consists of several groups, each with its own deep Q-network (DQN) modules, and of layers containing these modules. ABAM arbitrates among the outputs of the modules in one layer by using the outputs of the modules in the layer above, which has a more abstract objective. With this architecture, an agent can solve a difficult maze task that a simple DQN cannot solve.
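The abstract does not say how ABAM computes its arbitration, so the following is only an illustrative sketch of one way an accumulator-style arbiter could sit between two layers. Everything in it is an assumption made for illustration: the class `AccumulatorArbiter`, the function `select_action`, the `threshold` and `leak` parameters, and the rule that the upper layer's outputs are integrated per lower-layer module until one crosses a threshold. None of these names or rules is taken from the paper.

```python
from typing import List, Optional, Sequence


class AccumulatorArbiter:
    """Hypothetical accumulator-style arbiter (an assumption, not the paper's ABAM).

    One accumulator is kept per lower-layer module. At each step the
    upper-layer output for that module is added to its accumulator, and
    control switches to a module once its accumulator reaches the threshold.
    """

    def __init__(self, n_modules: int, threshold: float = 1.0, leak: float = 0.1):
        self.threshold = threshold
        self.leak = leak                      # fraction of evidence forgotten per step
        self.acc: List[float] = [0.0] * n_modules
        self.active: Optional[int] = None     # module currently in control, if any

    def step(self, upper_values: Sequence[float]) -> Optional[int]:
        # Leaky integration of the upper layer's outputs.
        self.acc = [(1.0 - self.leak) * a + v
                    for a, v in zip(self.acc, upper_values)]
        best = max(range(len(self.acc)), key=lambda i: self.acc[i])
        if self.acc[best] >= self.threshold:
            # Hand control to the winning module and reset all accumulators.
            self.active = best
            self.acc = [0.0] * len(self.acc)
        return self.active


def select_action(arbiter: AccumulatorArbiter,
                  upper_values: Sequence[float],
                  lower_q_values: Sequence[Sequence[float]]) -> Optional[int]:
    """Return the greedy action of the module chosen by the arbiter.

    upper_values:   one value per lower-layer module, from the upper layer.
    lower_q_values: Q-values over actions, one row per lower-layer module.
    Returns None while no module has won arbitration yet.
    """
    module = arbiter.step(upper_values)
    if module is None:
        return None
    q = lower_q_values[module]
    return max(range(len(q)), key=lambda a: q[a])
```

In this sketch the lower-layer Q-values would come from trained DQN modules; here they are plain lists so that the arbitration logic can be read in isolation. The threshold makes switching between modules deliberate rather than instantaneous, which is one plausible reading of "accumulator based" but remains a guess.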