Abstract
This paper introduces a method for reducing the occurrence of transfer loops in a network under a routing algorithm based on reinforcement learning. In a previous study, we proposed the routing algorithm “DARLA”, which achieves traffic shaping on the network. However, that algorithm could not suppress traffic on links leading back toward the source of the packets. This limitation stems from a characteristic property of reinforcement learning. We present a method for estimating a lower bound on the probability that a route to the destination is valid; this lower bound can be used to reject routes that cause transfer loops.