2022 Volume 142 Issue 2 Pages 149-150
Strömbom et al. described an algorithm by which a sheepdog can skillfully control a flock of sheep and guide it to a destination. Known as the Herding Algorithm, it models the sheepdog's behavior as two actions: “driving”, which guides the flock toward the destination, and “collecting”, which gathers scattered sheep into a single flock. Using this model, Go et al. showed that an agent (the sheepdog) could herd a flock of sheep with an inference model generated by reinforcement learning (RL). In that study, however, RL learned only how to move to the positions from which the agent performs “driving” and “collecting”, and both the environmental state and the action space were discretized. In this study, we assume a continuous environmental state and a continuous action space. We confirm that, even when the agent's herding behavior itself is the learning target, the proposed inference model generated by deep RL can herd the sheep.
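The switch between “driving” and “collecting” can be illustrated with a minimal Python sketch. It is not taken from this paper: the function name `shepherd_target`, the parameter `r_a` (an assumed sheep interaction radius), and the specific threshold and offsets (`r_a * n ** (2/3)`, `r_a * sqrt(n)`, `r_a`) are assumptions modeled on the commonly cited form of the heuristic, and the sheep dynamics are omitted entirely.

```python
import math


def shepherd_target(sheep, goal, r_a=2.0):
    """Pick the dog's next steering point (hypothetical helper).

    sheep: list of (x, y) positions; goal: (x, y) destination.
    r_a:   assumed sheep interaction radius (illustrative value).
    Returns (mode, (x, y)) where mode is "driving" or "collecting".
    """
    n = len(sheep)
    # Centre of mass of the flock.
    cx = sum(x for x, _ in sheep) / n
    cy = sum(y for _, y in sheep) / n

    # Sheep furthest from the centre of mass.
    far = max(sheep, key=lambda p: math.hypot(p[0] - cx, p[1] - cy))
    far_dist = math.hypot(far[0] - cx, far[1] - cy)

    if far_dist <= r_a * n ** (2 / 3):
        # Flock is compact (assumed threshold): drive it by standing
        # behind the centre of mass, on the side away from the goal.
        mode = "driving"
        origin = (cx, cy)
        dx, dy = cx - goal[0], cy - goal[1]
        offset = r_a * math.sqrt(n)
    else:
        # Flock is scattered: collect by standing behind the furthest
        # sheep, on the side away from the centre of mass.
        mode = "collecting"
        origin = far
        dx, dy = far[0] - cx, far[1] - cy
        offset = r_a

    norm = math.hypot(dx, dy) or 1.0  # avoid division by zero
    return mode, (origin[0] + offset * dx / norm,
                  origin[1] + offset * dy / norm)
```

In a learned variant such as the one the abstract describes, a policy over continuous states and actions would replace this hand-written rule; the sketch only shows the two behaviors that the rule-based model distinguishes.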
The Transactions of the Institute of Electrical Engineers of Japan. C
The Journal of the Institute of Electrical Engineers of Japan