Host: Japan Society for Fuzzy Theory and Intelligent Informatics (SOFT)
Name : 37th Fuzzy System Symposium
Number : 37
Location : [in Japanese]
Date : September 13, 2021 - September 15, 2021
An automatic minutes-generation system based on speech recognition technology has the potential to significantly improve the efficiency of recording speech in meetings and business settings. Because such a system works best with high-quality audio as its voice source, the usage environment must be prepared before each event according to the number of participants. Moreover, ambient sound other than human voices is the main factor that lowers speech recognition accuracy. On the other hand, the lip motion that accompanies speaking retains speech-specific features even in a noisy environment. This suggests that using lip movements to estimate speech content can contribute to improving speech recognition accuracy. Our research examines a method for correcting speech recognition results using image information, with the aim of improving the speech recognition accuracy of an automatic minutes-generation system. In this study, we discuss a method for estimating speech content from lip movements and lip color information.