Abstract
In this work, we introduce macro-actions, defined as useful sequences of an agent's actions that lead to high rewards, into a reinforcement learning algorithm. If such macro-actions are extracted and used effectively, learning is expected to speed up because the search space is restricted.