IEEJ Transactions on Electronics, Information and Systems
Online ISSN : 1348-8155
Print ISSN : 0385-4221
ISSN-L : 0385-4221
<Multimedia Technology>
Finding the Correspondence of Audio-Visual Events by Object Manipulation
Kento NishiboriYoshinori TakeuchiTetsuya MatsumotoHiroaki KudoNoboru Ohnishi
Author information

2008 Volume 128 Issue 2 Pages 242-252


A human being understands the objects in the environment by integrating information obtained by the senses of sight, hearing and touch. In this integration, active manipulation of objects plays an important role. We propose a method for finding the correspondence of audio-visual events by manipulating an object. The method uses the general grouping rules in Gestalt psychology, i.e. “simultaneity” and “similarity” among motion command, sound onsets and motion of the object in images. In experiments, we used a microphone, a camera, and a robot which has a hand manipulator. The robot grasps an object like a bell and shakes it or grasps an object like a stick and beat a drum in a periodic, or non-periodic motion. Then the object emits periodical/non-periodical events. To create more realistic scenario, we put other event source (a metronome) in the environment. As a result, we had a success rate of 73.8 percent in finding the correspondence between audio-visual events (afferent signal) which are relating to robot motion (efferent signal).

Information related to the author
© 2008 by the Institute of Electrical Engineers of Japan
Previous article Next article