A small humanoid robot that understands and executes commands in a multi-modal language in real time, via microphones and sensors, was developed. In usability tests, 30 subjects unfamiliar with the language were able to operate the humanoid without frustration and complete their tasks, without a long learning stage, by speaking to it, gesturing, and touching it. Multi-modal commands succeeded more often than spoken commands unaccompanied by non-verbal cues.