IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences
Online ISSN : 1745-1337
Print ISSN : 0916-8508

この記事には本公開記事があります。本公開記事を参照してください。
引用する場合も本公開記事を引用してください。

An integrated convolutional neural network with a fusion attention mechanism for acoustic scene classification
Pengxu JIANGYue XIECairong ZOULi ZHAOQingyun WANG
著者情報
ジャーナル フリー 早期公開

論文ID: 2022EAL2091

この記事には本公開記事があります。
詳細
抄録

In human-computer interaction, acoustic scene classification (ASC) is one of the relevant research domains. In real life, the recorded audio may include a lot of noise and quiet clips, making it hard for earlier ASC-based research to isolate the crucial scene information in sound. Furthermore, scene information may be scattered across numerous audio frames; hence, selecting scene-related frames is crucial for ASC. In this context, an integrated convolutional neural network with a fusion attention mechanism (ICNN-FA) is proposed for ASC. Firstly, segmented mel-spectrograms as the input of ICNN can assist the model in learning the short-term time-frequency correlation information. Then, the designed ICNN model is employed to learn these segment-level features. In addition, the proposed global attention layer may gather global information by integrating these segment features. Finally, the developed fusion attention layer is utilized to fuse all segment-level features while the classifier classifies various situations. Experimental findings using ASC datasets from DCASE 2018 and 2019 indicate the efficacy of the suggested method.

著者関連情報
© 2023 The Institute of Electronics, Information and Communication Engineers
feedback
Top