Abstract
Humanoid robots are expected to understand human behaviors and to generate human-like motions in order to interact with human partners. Research on imitation learning of motion primitives has made it possible for robots to recognize the behaviors of a performer, and this learning framework has been extended to natural language so that robots can interpret observed behaviors as corresponding sentences. Such a framework is applicable under laboratory conditions, where only a small number of behaviors need to be recognized; it must be scaled to a large number of motion primitives and language expressions before robots can be integrated into our daily lives. This paper describes a novel approach that establishes a crowd-sourcing system through which any human annotator can attach descriptive sentences to behaviors. The resulting large dataset of behaviors and descriptive sentences forms an associative mapping between them, and this mapping enables a robot to interpret its observations in linguistic form. We tested the approach on captured human motions and demonstrated its validity.