Abstract
Our research group has been developing online behavioral operation technologies that enable humanoid robots to perform tasks in human environments by integrating speech recognition, 3D-vision-based object recognition, and online whole-body motion generation. This paper tackles this integration problem by addressing the issue of representing knowledge of actions in a way that facilitates natural language instruction for tasks in indoor human environments. As a preliminary step toward a reliable and flexible natural language instruction system, we propose a lexicon of basic actions and behaviors. We describe the implementation of the proposed online behavioral operation system on our humanoid robot HRP-2, which can detect the direction of a speaker within 2 meters and receive natural language instructions from the user through microphone arrays connected to an embedded speech recognition system on board the robot.