An Investigation Toward Applying Multimodal Models to Civil Engineering Documents

緒方 陸; 大久保 順一; 藤井 純一郎

doi:10.11517/pjsai.JSAI2024.0_3F1GS1001

Abstract

Information such as specifications and inspection records, previously managed on paper, is being digitized to improve the work efficiency of civil engineers and reduce labor. However, many documents in the civil engineering field are PDF files with widely varying layouts. In some cases, scanned copies of old documents are used as references, and these cannot be handled by text extraction tools or optical character recognition (OCR) technology. In recent years, multimodal models have been applied to OCR and document understanding, and they are expected to be used in the civil engineering field as well. In this study, we measure how well multimodal models can recognize and understand civil engineering documents that are written in Japanese and contain many technical terms. We also conduct a qualitative analysis and discuss the feasibility of using multimodal models in civil engineering.
