The Proceedings of JSME annual Conference on Robotics and Mechatronics (Robomec)
Online ISSN : 2424-3124
2020
Session ID : 1P2-L03

Vision and Auditory Fusion System Using Deep Learning
Kazuto TSUMURA*, Futoshi KOBAYASHI, Hiroyuki NAKAMOTO

Abstract

Sensing technology has advanced dramatically in recent years, and a wide variety of sensors are now used in systems such as automated driving and robotics. When humans perceive their environment, information from the five senses is transmitted to the sensory areas of the cerebrum and processed there. The processed information is then passed to the association areas, where it is fused. Robot sensor fusion is expected to work in a similar way to this human sensory fusion. In this study, we propose a word recognition system that fuses Visual Data from an RGB camera and Voice Data from a microphone using a convolutional neural network (CNN), which can extract features automatically. We train the network on Visual Data and Voice Data, then evaluate the system's accuracy by its word recognition rate and assess the feasibility of sensor fusion by deep learning.
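
The abstract does not specify the network architecture or the level at which the two modalities are fused. As a rough illustration only, the sketch below assumes feature-level fusion: each modality passes through its own small convolutional branch, the resulting feature vectors are concatenated, and a linear layer with softmax scores the candidate words. The function names, the spectrogram representation of the Voice Data, the kernel sizes, and the random untrained weights are all assumptions made for this example and are not taken from the paper.

```python
import math
import random


def conv2d_valid(image, kernel):
    """Valid 2D cross-correlation of a 2D list with a 2D kernel, then ReLU."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            s = sum(image[i + a][j + b] * kernel[a][b]
                    for a in range(kh) for b in range(kw))
            row.append(max(0.0, s))  # ReLU activation
        out.append(row)
    return out


def global_max_pool(feature_map):
    """Reduce one feature map to a single scalar."""
    return max(max(row) for row in feature_map)


def branch_features(x, kernels):
    """One modality branch: conv + ReLU + global max pool per kernel."""
    return [global_max_pool(conv2d_valid(x, k)) for k in kernels]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_and_classify(frame, spectrogram, params):
    """Assumed feature-level fusion: concatenate branch features, then classify."""
    visual = branch_features(frame, params["visual_kernels"])
    audio = branch_features(spectrogram, params["audio_kernels"])
    fused = visual + audio  # list concatenation is the fusion step
    logits = [sum(w * f for w, f in zip(row, fused)) + b
              for row, b in zip(params["weights"], params["bias"])]
    return softmax(logits)


def random_params(n_visual, n_audio, n_words, k=3, seed=0):
    """Hypothetical untrained parameters, for illustration only."""
    rng = random.Random(seed)

    def kernel():
        return [[rng.uniform(-1.0, 1.0) for _ in range(k)] for _ in range(k)]

    n_fused = n_visual + n_audio
    return {
        "visual_kernels": [kernel() for _ in range(n_visual)],
        "audio_kernels": [kernel() for _ in range(n_audio)],
        "weights": [[rng.uniform(-1.0, 1.0) for _ in range(n_fused)]
                    for _ in range(n_words)],
        "bias": [0.0] * n_words,
    }
```

Calling `fuse_and_classify(frame, spectrogram, random_params(2, 2, 3))` with a grayscale frame and a spectrogram, both given as 2D lists of floats, returns one probability per candidate word. A real system would learn the kernels and weights from paired Visual Data and Voice Data, typically with a deep learning framework rather than pure Python.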

© 2020 The Japan Society of Mechanical Engineers