Organizer: The Japan Society of Mechanical Engineers (general incorporated association)
Conference: 2024 Annual Meeting
Dates: 2024/09/08 - 2024/09/11
Human Machine Interfaces (HMIs) have advanced significantly in recent years, particularly as communication aids for patients with amyotrophic lateral sclerosis (ALS) or quadriplegia caused by nerve damage. Among these, HMIs driven by tongue movements have been proposed as a way to exploit body parts that often remain relatively intact. In particular, the weak electromyogram (EMG) signals produced during tongue motion make an effective input interface: compared with electroencephalogram (brain wave) signals, they offer a higher signal-to-noise ratio, which permits stable measurement and reduces the burden on the user. Estimating consonants from EMG signals, however, remains challenging. This study aims to develop a speech recognition system for non-vocal communication that uses EMG signals measured around the hyoid bone together with deep learning techniques. The proposed system combines a Convolutional Neural Network (CNN) for vowel estimation with a Long Short-Term Memory (LSTM) network for word prediction. The paper presents an overview of the system, describes the CNN and LSTM architectures, compares label estimation accuracy across different CNN parameter settings, and describes sequence prediction with the LSTM model. The experimental results show that combining the vowel estimation CNN with the word estimation LSTM yields a model with high generalization performance, enabling efficient and accurate non-vocal communication for patients. This HMI software offers a promising means of enhancing patients' communication capabilities.
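To make the two-stage architecture concrete, the sketch below shows one plausible way to wire a vowel estimation CNN into a word prediction LSTM. It is a minimal illustration, not the authors' implementation: the number of EMG channels, window length, vowel count, vocabulary size, and all layer dimensions are assumptions chosen for readability.

```python
# Minimal sketch (PyTorch) of the two-stage pipeline described in the abstract:
# a 1-D CNN classifies short EMG windows into vowels, and an LSTM maps the
# resulting vowel-probability sequence to a word label.
# All dimensions (8 EMG channels, 200-sample windows, 5 vowels, 100 words)
# are illustrative assumptions, not values from the paper.
import torch
import torch.nn as nn


class VowelCNN(nn.Module):
    """1-D CNN over a multichannel EMG window -> vowel logits."""
    def __init__(self, n_channels: int = 8, n_vowels: int = 5):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(n_channels, 32, kernel_size=7, padding=3),
            nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(32, 64, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),   # global average pooling over time
        )
        self.classifier = nn.Linear(64, n_vowels)

    def forward(self, x):              # x: (batch, n_channels, window_len)
        h = self.features(x).squeeze(-1)
        return self.classifier(h)      # vowel logits per window


class WordLSTM(nn.Module):
    """LSTM over a sequence of vowel probabilities -> word logits."""
    def __init__(self, n_vowels: int = 5, hidden: int = 64, n_words: int = 100):
        super().__init__()
        self.lstm = nn.LSTM(n_vowels, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_words)

    def forward(self, seq):            # seq: (batch, seq_len, n_vowels)
        _, (h_n, _) = self.lstm(seq)
        return self.out(h_n[-1])       # word logits from the last hidden state


# Example forward pass through the combined pipeline.
cnn, lstm = VowelCNN(), WordLSTM()
emg_windows = torch.randn(6, 8, 200)          # 6 EMG windows from one utterance
vowel_probs = torch.softmax(cnn(emg_windows), dim=-1)
word_logits = lstm(vowel_probs.unsqueeze(0))  # treat the 6 windows as one sequence
predicted_word = word_logits.argmax(dim=-1)
```

In this reading, the CNN's per-window vowel probabilities form the input sequence for the LSTM, so the word model never sees raw EMG; whether the actual system passes probabilities, hard labels, or intermediate CNN features to the LSTM is not specified in the abstract.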