IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences
Online ISSN : 1745-1337
Print ISSN : 0916-8508
Regular Section
An Integrated Convolutional Neural Network with a Fusion Attention Mechanism for Acoustic Scene Classification
Pengxu JIANGYue XIECairong ZOULi ZHAOQingyun WANG
著者情報
ジャーナル フリー

2023 年 E106.A 巻 8 号 p. 1057-1061

詳細
抄録

In human-computer interaction, acoustic scene classification (ASC) is one of the relevant research domains. In real life, the recorded audio may include a lot of noise and quiet clips, making it hard for earlier ASC-based research to isolate the crucial scene information in sound. Furthermore, scene information may be scattered across numerous audio frames; hence, selecting scene-related frames is crucial for ASC. In this context, an integrated convolutional neural network with a fusion attention mechanism (ICNN-FA) is proposed for ASC. Firstly, segmented mel-spectrograms as the input of ICNN can assist the model in learning the short-term time-frequency correlation information. Then, the designed ICNN model is employed to learn these segment-level features. In addition, the proposed global attention layer may gather global information by integrating these segment features. Finally, the developed fusion attention layer is utilized to fuse all segment-level features while the classifier classifies various situations. Experimental findings using ASC datasets from DCASE 2018 and 2019 indicate the efficacy of the suggested method.

著者関連情報
© 2023 The Institute of Electronics, Information and Communication Engineers
前の記事 次の記事
feedback
Top