Abstract
This paper proposes a method for synthesizing a supervisor based on reinforcement learning. In discrete event systems, a supervisor disables controllable events so that the system satisfies control specifications given as formal languages. However, constructing the supervisor requires a precise description of these specifications. In the proposed method, the specifications are instead given as rewards, and the optimal supervisor is derived through learning under uncertain environments by considering rewards for the occurrence of events and for control patterns. We examine the efficiency of the proposed method by computer simulation.