2023 Volume 4 Issue 2 Pages 66-74
Automatic structural evaluation is crucial, particularly in post-disaster scenarios. While significant progress has been made in image captioning, its potential as a tool for structural damage assessment has not been thoroughly explored. Image captioning can generate descriptive captions that support further analysis and decision-making. This study develops an image captioning model designed for structural damage images and compares four popular convolutional neural networks (CNNs), namely VGG16, ResNet50, InceptionV3, and EfficientNet. All evaluated models performed well in generating captions for structural damage images, with InceptionV3 showing a slight edge over the others; this highlights its strong caption generation ability for structural damage evaluation. Furthermore, although training times varied among the CNN models, the differences in caption generation times during practical application were found to be negligible. These findings underscore the effectiveness of different CNN models for image captioning in the context of structural damage evaluation and emphasize the potential of image captioning as a valuable tool for automated structural evaluation. The study also calls for further research to enhance the accuracy, efficiency, and interpretability of automatic structural evaluation using image captioning approaches.
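The encoder-decoder pipeline compared in this study can be sketched as follows. This is a hypothetical, standard-library-only illustration of the data flow, not the authors' implementation: in the real models, the encoder is a pretrained CNN (VGG16, ResNet50, InceptionV3, or EfficientNet) producing a feature vector, and the decoder is a trained recurrent network; both are replaced here by toy stubs, and the vocabulary and image are invented for demonstration.

```python
def cnn_encoder(pixels):
    """Stand-in for a pretrained CNN: image -> fixed-length feature vector.

    A real backbone (e.g. InceptionV3 with its classifier head removed)
    would return a high-dimensional vector; this stub just splits the
    flat pixel list into four bins and averages each.
    """
    bins = [pixels[i::4] for i in range(4)]
    return [sum(b) / len(b) for b in bins]

def decoder_step(features, t, vocab):
    """Stand-in for one decoder (e.g. LSTM) step: pick the next word.

    A trained decoder would condition on the image features and the words
    emitted so far; this stub derives a deterministic index from the
    features and the time step t.
    """
    idx = (int(sum(features)) + t) % len(vocab)
    return vocab[idx]

def generate_caption(pixels, vocab, max_len=6):
    """Greedy decoding: emit one word per step up to max_len."""
    features = cnn_encoder(pixels)
    words = [decoder_step(features, t, vocab) for t in range(max_len)]
    return " ".join(words)

# Tiny demonstration on a fake 4x4 grayscale "image" (hypothetical data).
vocab = ["severe", "cracking", "on", "concrete", "column", "spalling"]
image = [0.1 * i for i in range(16)]
print(generate_caption(image, vocab))
```

Swapping the CNN backbone changes only the encoder stub; the decoder and greedy decoding loop stay the same, which is what makes the four backbones directly comparable in such a study.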