Organizer: The Japan Society of Mechanical Engineers (JSME)
Conference: Proceedings of the IIP2025 Information, Intelligence and Precision Equipment Division Conference
Dates: 2025/03/03 - 2025/03/04
The purpose of this study is to establish a decoding technique that estimates the sounds a person hears from fMRI images using deep learning. The sounds we hear in daily life usually have a distinctive timbre. Timbre is determined by the combination of sound pressure levels of the overtones in a compound tone, i.e., by its frequency spectrum. Previous studies have suggested that tonotopy, a specific spatial pattern of activation in the auditory cortex, is influenced by the frequency spectrum. In this report, we focus on tonotopy and estimate timbres from fMRI images using deep learning. Four types of timbres were prepared at four pitches, and each pair of timbres was learned by binary classification, creating six classifiers. When these classifiers were used for estimation, the maximum estimation rate for untrained data was 67.22%, and the average estimation rate over the four timbres was 45.31%, far exceeding the 25.00% chance level of four-class classification. This suggests that the proposed estimation method is useful. However, challenges remain for untrained data, including untrained pitches.
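The count of six classifiers for four timbres matches a one-vs-one scheme (C(4,2) = 6 pairs). The sketch below illustrates how such pairwise binary classifiers could be combined by majority vote into a four-class timbre estimate; it is an assumption-laden illustration, not the authors' code. The timbre labels and the classifier callables with a 0/1 output are hypothetical placeholders.

```python
# Minimal sketch (hypothetical, not the paper's implementation) of
# combining six one-vs-one binary classifiers over four timbre classes
# by majority vote, consistent with C(4,2) = 6 from the abstract.
from itertools import combinations
import numpy as np

TIMBRES = ["timbre_A", "timbre_B", "timbre_C", "timbre_D"]  # placeholder names

def ovo_vote(fmri_sample, pairwise_classifiers):
    """Estimate a timbre from one fMRI sample by one-vs-one voting.

    pairwise_classifiers maps a timbre-index pair (i, j) to a callable
    that returns 0 to vote for timbre i, or 1 to vote for timbre j.
    """
    votes = np.zeros(len(TIMBRES), dtype=int)
    for (i, j) in combinations(range(len(TIMBRES)), 2):  # 6 pairs total
        winner = pairwise_classifiers[(i, j)](fmri_sample)  # 0 or 1
        votes[i if winner == 0 else j] += 1
    # The timbre that wins the most pairwise contests is the estimate.
    return TIMBRES[int(np.argmax(votes))]
```

Under this reading, the reported per-timbre estimation rates would be the fraction of held-out fMRI samples for which the voted label matches the true timbre, against a 25% four-class chance level.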