Host: The Japanese Society for Artificial Intelligence
Name: The 38th Annual Conference of the Japanese Society for Artificial Intelligence
Number: 38
Location: [in Japanese]
Date: May 28, 2024 - May 31, 2024
Information such as specifications and inspection records, previously managed on paper, is now being digitized to improve the work efficiency of civil engineers and reduce their labor. However, many documents in the civil engineering field are in PDF format and come in a wide variety of layouts. In some cases, scanned copies of old documents are used as references, and these cannot be handled by text extraction tools or optical character recognition (OCR) technology. In recent years, multimodal models have been applied to OCR and document understanding, and they are expected to be adopted in the civil engineering field as well. In this study, we measure how well multimodal models can recognize and understand civil engineering documents that are written in Japanese and contain many technical terms. We also conduct a qualitative analysis and discuss the potential for using multimodal models in the civil engineering field.