Artificial Intelligence and Data Science
Online ISSN : 2435-9262
Damage type and level estimation of road attachment facilities using vision transformers and large vision-language models
Koshi WATANABE, Keisuke MAEDA, Ren TOGO, Takahiro OGAWA, Miki HASEYAMA
JOURNAL OPEN ACCESS

2025 Volume 6 Issue 3 Pages 966-975

Abstract

Road attachment facilities, including road signs and lighting, are ubiquitous across vast road networks, making efficient inspection crucial. Previously, AI models were proposed to classify the damage type of road attachment facilities. However, practical implementation requires an interpretable framework and the ability to estimate damage level. This paper proposes a comprehensive framework that combines damage type classification with a Vision Transformer (ViT) and damage level estimation with the in-context learning of large vision-language models (VLMs). The ViT-based damage type classification provides an interpretable framework, while the VLM's in-context learning enables damage level estimation, a challenging task for ViTs alone. In the last part of this paper, we evaluate our method with real-world images of road attachment facilities.

© 2025 Japan Society of Civil Engineers