Paper ID: e24.53
Reducing the burden of data collection is crucial for advancing speech recognition research. This study therefore explores how to improve machine learning from limited data in Japanese silent speech recognition by augmenting the training data on the basis of three-dimensional measurements. To evaluate the proposed method, which uses the direct linear transformation (DLT) method, we compared the connectionist temporal classification (CTC) loss during training and the recognition performance with and without the key data augmentation techniques. With the augmentation, the deep neural network was trained successfully, and the phoneme error rate was reduced.
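
For readers unfamiliar with the evaluation metric, the phoneme error rate is conventionally computed as the edit distance between the recognized and reference phoneme sequences, divided by the total number of reference phonemes. The following is a minimal sketch under that conventional definition; the function names and the example phoneme sequences are hypothetical illustrations and are not taken from the study's implementation or data.

```python
from typing import List, Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance (substitutions, insertions, deletions)."""
    # prev[j] holds the distance between the processed prefix of ref
    # and the first j symbols of hyp.
    prev: List[int] = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def phoneme_error_rate(refs: Sequence[Sequence[str]],
                       hyps: Sequence[Sequence[str]]) -> float:
    """Total edit distance divided by total reference length."""
    if len(refs) != len(hyps):
        raise ValueError("refs and hyps must have the same number of utterances")
    total_ref = sum(len(r) for r in refs)
    if total_ref == 0:
        raise ValueError("reference sequences are empty")
    total_errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return total_errors / total_ref


# Hypothetical example: one deleted phoneme in a 9-phoneme reference.
example_ref = ["k", "o", "N", "n", "i", "ch", "i", "w", "a"]
example_hyp = ["k", "o", "n", "i", "ch", "i", "w", "a"]
example_per = phoneme_error_rate([example_ref], [example_hyp])
```

Accumulating errors and reference lengths over the whole test set before dividing, as done here, weights long utterances more heavily than averaging per-utterance rates; which convention the study used is not stated in the abstract.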