Artificial Intelligence and Data Science
Online ISSN : 2435-9262
Text region detection and LMM-based digit recognition in scanned drawings
Chinami FUKUI, Chang WANG, Ziwen LAN, Guang LI, Akihiro TAMURA, Tsuyoshi HANADA, Keisuke MAEDA, Sho TAKAHASHI, Takahiro OGAWA, Miki HASEYAMA
JOURNAL OPEN ACCESS

2025 Volume 6 Issue 2 Pages 173-178

Abstract

In recent years, efficient infrastructure management has become increasingly important as the number of engineers declines. However, the structural drawings required for inspections have not been integrated into a centralized database, which highlights the need for an efficient method of converting scanned drawings into CAD data. To facilitate this conversion, this study proposes a text recognition method for drawings that combines a text detection model (FCENet) with a Large Multimodal Model (LMM; GPT-4o). The method first detects digit locations with the text detection model and then passes each detected region to the LMM individually, which minimizes the influence of background noise and unnecessary lines. Experimental results demonstrate that this relieves the LMM of the burden of inferring digit positions and enables more stable and accurate text recognition. Furthermore, updates to the LMM play a crucial role in improving text recognition accuracy in drawings, and the future adoption of more advanced models is expected to enhance accuracy further.
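
The two-stage flow described in the abstract (detect text regions, then recognize each region separately) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that the detector's output is available as polygons of (x, y) vertices, that each region is cropped with a simple axis-aligned bounding box plus a small margin, and that the LMM is wrapped in a caller-supplied function. The names `crop_region`, `recognize_regions`, `recognizer`, and `margin` are hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Image = List[List[int]]               # grayscale image as rows of pixel values
Polygon = Sequence[Tuple[int, int]]   # (x, y) vertices of one detected text region


def crop_region(image: Image, polygon: Polygon, margin: int = 2) -> Image:
    """Crop the axis-aligned bounding box of a polygon, clamped to the image.

    Assumption: a bounding-box crop with a small margin is enough to isolate
    the text from surrounding lines; the source does not specify the cropping.
    """
    height, width = len(image), len(image[0])
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    left = max(min(xs) - margin, 0)
    top = max(min(ys) - margin, 0)
    right = min(max(xs) + margin + 1, width)    # exclusive bound
    bottom = min(max(ys) + margin + 1, height)  # exclusive bound
    return [row[left:right] for row in image[top:bottom]]


def recognize_regions(
    image: Image,
    polygons: Sequence[Polygon],
    recognizer: Callable[[Image], str],
    margin: int = 2,
) -> List[Tuple[Polygon, str]]:
    """Send each detected region to the recognizer on its own.

    `recognizer` is a hypothetical stand-in for a call to an LMM; it receives
    one cropped region and returns the recognized text.
    """
    results = []
    for polygon in polygons:
        crop = crop_region(image, polygon, margin)
        results.append((polygon, recognizer(crop)))
    return results
```

In use, `polygons` would come from the text detection model and `recognizer` would wrap the LMM request. Because each call sees only one small crop, the recognizer does not need to locate the digits itself, which is the division of labor the abstract describes.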

© 2025 Japan Society of Civil Engineers