2025 Volume 29 Issue 6 Pages 1484-1499
Fine-grained image classification plays a crucial role in various applications, such as agricultural disease detection, medical diagnosis, and industrial inspection. However, achieving high classification accuracy while maintaining computational efficiency remains a significant challenge. To address this issue, this study developed Enhanced DetailNet (EDNET), a convolutional neural network (CNN) model designed to balance fine-detail preservation and global context understanding. EDNET integrates multiscale attention mechanisms and self-attention modules, enabling it to capture both local and global information simultaneously. Extensive ablation studies were conducted to evaluate the contribution of each module, and EDNET was compared with the mainstream benchmark models ResNet50, EfficientNet, and vision transformers. The results demonstrate that EDNET achieves highly competitive performance in terms of accuracy, F1-score, and area under the receiver operating characteristic curve, while maintaining an optimal balance between parameter count and inference efficiency. In addition, EDNET was tested in both a high-performance graphics processing unit (GPU) environment (NVIDIA RTX 3090) and a resource-constrained environment (Jetson Nano simulation). The results confirm that EDNET is deployable on edge devices, achieving an inference efficiency comparable to that of EfficientNet while outperforming traditional CNN models in fine-grained classification tasks.