Proceedings of the Annual Conference of JSAI
Online ISSN : 2758-7347
38th (2024)
Session ID : 3F1-GS-10-01

Exploration for the adaptation of multimodal models to civil engineering documents.
*Riku OGATA, Junichi OKUBO, Junichiro FUJII
CONFERENCE PROCEEDINGS FREE ACCESS

Abstract

Specifications, inspection information, and other records that were previously managed on paper are now being digitized to improve the work efficiency of civil engineers and save labor. However, many documents in the civil engineering field are PDF files, and they come in a variety of formats. In some cases, scanned data of old documents are used as references, and these cannot be handled by text extraction tools or optical character recognition (OCR) technology. In recent years, multimodal models have been used for OCR and document understanding, and they are expected to be applied in the civil engineering field as well. In this study, we measure how well multimodal models can recognize and understand civil engineering documents that are written in Japanese and contain many technical terms. We also conduct a qualitative analysis and discuss the feasibility of using multimodal models in the civil engineering field.

© 2024 The Japanese Society for Artificial Intelligence