Data augmentation method based on three-dimensional measurement for silent speech recognition

Kenko Ota

doi:10.1250/ast.e24.53

Abstract

Reducing the burden of data collection is crucial for advancing speech recognition research. This study therefore explores how to improve learning from limited data in Japanese silent speech recognition by augmenting the training data on the basis of three-dimensional measurements. To evaluate the proposed method, which uses the direct linear transformation (DLT) method, we compared the connectionist temporal classification (CTC) loss during training and the recognition performance with and without data augmentation. With the augmentation, the deep neural network was trained successfully and the phoneme error rate was reduced.
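
The abstract reports its result as a phoneme error rate without defining it. As a minimal sketch, assuming the conventional definition (Levenshtein edit distance between the reference and recognized phoneme sequences, divided by the reference length), the metric could be computed as follows. The function names and the example phoneme labels are hypothetical and are not taken from the article, whose exact scoring procedure is not given here.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences."""
    # prev[j] holds the distance between the ref prefix processed so far
    # and the first j phonemes of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference phonemes
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def phoneme_error_rate(ref: Sequence[str], hyp: Sequence[str]) -> float:
    """Edit distance normalized by reference length (assumed definition)."""
    if not ref:
        raise ValueError("reference sequence must not be empty")
    return edit_distance(ref, hyp) / len(ref)


# Hypothetical phoneme labels, for illustration only.
reference = ["k", "o", "N", "n", "i", "ch", "i", "w", "a"]
hypothesis = ["k", "o", "n", "i", "ch", "i", "w", "a"]
per = phoneme_error_rate(reference, hypothesis)
```

Because insertions are counted as errors, this ratio can exceed 1.0 when the recognized sequence is much longer than the reference.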

This article is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International license.
https://creativecommons.org/licenses/by-nd/4.0/

Predecessor journal

Journal of the Acoustical Society of Japan (E)
