In this study, we use reinforcement learning as a method for a robot to acquire behavior autonomously, and we aim to acquire walking behavior with a gecko-type, four-legged robot that has a waist joint. Because the agent in reinforcement learning aims to maximize its cumulative reward, the design of the reward function strongly influences the learning result. We performed an experiment in which a straight line was set as the target trajectory and the error between the generated trajectory and the target trajectory was incorporated into the reward function, and we examined how the design of the reward function influenced the generated trajectory. As a result, the robot acquired an efficient gait that uses the waist joint, moving forward while minimizing the error between the generated trajectory and the target trajectory.
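
As a rough illustration of the kind of reward described above, the following minimal sketch rewards forward progress and penalizes deviation from a straight target line. The exact reward used in the study is not given here, so the function name `step_reward`, the weights `w_forward` and `w_error`, and the choice of the x-axis (y = 0) as the target line are all assumptions made for illustration.

```python
def step_reward(prev_xy, curr_xy, w_forward=1.0, w_error=1.0):
    """Hypothetical per-step reward: forward progress minus path error.

    Assumptions (not taken from the study):
      - the target line is the x-axis (y = 0),
      - forward progress is the change in x over one step,
      - the error is the absolute lateral offset |y| of the robot body.
    """
    forward = curr_xy[0] - prev_xy[0]   # distance advanced along the target line
    lateral_error = abs(curr_xy[1])     # deviation from the target line y = 0
    return w_forward * forward - w_error * lateral_error


# A step that advances 1.0 along the line but drifts 0.5 sideways
# earns less than a step that stays on the line.
on_line = step_reward((0.0, 0.0), (1.0, 0.0))
drifted = step_reward((0.0, 0.0), (1.0, 0.5))
print(on_line, drifted)
```

Setting `w_error` to zero recovers a reward that considers forward movement only, which is one way to compare how the error term shapes the learned path.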