IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences
Online ISSN : 1745-1337
Print ISSN : 0916-8508
Regular Section
ACSTNet: An Attention Cross Stage Transformers Network for Small Object Detection in Remote Sensing Images
Yang LIU, Jialong WEI, Shujian ZHAO, Wenhua XIE, Niankuan CHEN, Jie LI, Xin CHEN, Kaixuan YANG, Yongwei LI, Zhen ZHAO
JOURNAL FREE ACCESS

2025 Volume E108.A Issue 4 Pages 582-596

Abstract

Deep learning-based object detection methods have recently achieved promising performance. However, these methods handle satellite images poorly because small objects in remote sensing images are difficult to detect. To address this issue, we propose a novel small object detection method based on YOLO X, named Attention Cross Stage Transformers Network (ACSTNet). Specifically, a novel backbone network, the Multi-scale Cross Fusion Network (MCFNet), is constructed to capture long-range semantic dependencies between pixels and to increase the depth-interaction information at different levels. Meanwhile, a new feature fusion layer is added to the upper feature output layer of dark3, allowing the model to retain as many low-level features of small objects as possible and to locate them more accurately. Furthermore, to address the inaccurate feature extraction caused by overlapping and occlusion of dense objects, we propose an efficient channel and space normalized fusion attention mechanism (ECSNFAM), which is composed of channel attention, space attention, and batch normalization attention branches and uses a residual structure to enhance the sensitivity of the attention mechanism to small targets. Experiments are conducted on general remote sensing datasets to evaluate performance, and the results show that our proposed method improves the mean Average Precision (mAP) by 1.2% and 1.4% on the DIOR and RSOD-DATA datasets, respectively, compared with YOLO X. The source code is available at https://github.com/Wei-JL/ACSTNet.git.
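
The abstract describes ECSNFAM only at a high level: three attention branches (channel, space, batch normalization) combined through a residual structure. As a rough illustration of that general idea, and not the authors' implementation, the following NumPy sketch gates a single feature map with three simple branches and adds the result back to the input. Every function name, the averaging of the three gates, the absence of learned layers, and the use of per-sample spatial statistics in place of true batch statistics are assumptions made here for brevity; the actual design is in the paper and the linked repository.

```python
import numpy as np


def sigmoid(z):
    """Elementwise logistic function, mapping values into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def channel_gate(x):
    """Hypothetical channel branch: one gate per channel.

    x has shape (C, H, W). Global average pooling gives a (C, 1, 1)
    descriptor; a real module would pass it through learned layers.
    """
    pooled = x.mean(axis=(1, 2), keepdims=True)
    return sigmoid(pooled)


def spatial_gate(x):
    """Hypothetical spatial branch: one gate per pixel, shape (1, H, W).

    Channel-wise mean and max are averaged; a real module would
    typically apply a learned convolution to them instead.
    """
    avg_map = x.mean(axis=0, keepdims=True)
    max_map = x.max(axis=0, keepdims=True)
    return sigmoid(0.5 * (avg_map + max_map))


def norm_gate(x, gamma, eps=1e-5):
    """Hypothetical normalization branch, shape (C, H, W).

    Each channel is normalized and scaled by gamma (standing in for
    batch-norm scale factors); channels with larger |gamma| receive
    proportionally more weight. Statistics come from this one sample,
    which simplifies real batch normalization.
    """
    mean = x.mean(axis=(1, 2), keepdims=True)
    var = x.var(axis=(1, 2), keepdims=True)
    x_hat = (x - mean) / np.sqrt(var + eps)
    g = gamma.reshape(-1, 1, 1)
    weights = np.abs(g) / np.abs(g).sum()
    return sigmoid(weights * g * x_hat)


def fused_attention(x, gamma):
    """Average the three gates (an assumed fusion rule), then add residually."""
    gate = (channel_gate(x) + spatial_gate(x) + norm_gate(x, gamma)) / 3.0
    return x + x * gate  # residual path keeps the original features


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    features = rng.normal(size=(4, 8, 8))   # toy feature map: C=4, H=W=8
    out = fused_attention(features, gamma=np.ones(4))
    print(out.shape)  # (4, 8, 8)
```

Because every gate lies between 0 and 1, the residual form can only preserve or amplify a feature, never suppress it below its input value. That is one plausible reading of how a residual structure could keep weak small-object responses from being gated away.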

© 2025 The Institute of Electronics, Information and Communication Engineers