Artificial Intelligence and Data Science
Online ISSN : 2435-9262
Automatic Generation of Structured Data from Borehole Logs Using Vision-Language Models
Masataka SHIGA
JOURNAL OPEN ACCESS

2026 Volume 7 Issue 1 Pages 133-142

Abstract

This study proposes a method that uses Vision-Language Models (VLMs) to automatically generate structured data from borehole log images that contain no embedded text information. Conventional OCR technology can recognize characters in images, but it is limited in its ability to interpret the complex tabular structure specific to borehole logs and to associate geological layer information with test values. Our method applies a two-phase VLM process (schema element selection followed by YAML extraction) through the Google Gemini API to generate structured data directly from images, expressed in YAML and conforming to an XML DTD (Document Type Definition). We evaluated the effects of model selection and image resolution on extraction accuracy using 10 borehole datasets (12 pages) obtained from the Hokuriku Ground Information System. With the Gemini 3 Pro model, the F1 score for geological layer extraction was 95.0%, the F1 score for SPT (Standard Penetration Test) depth matching was 79.3%, the N-value exact match rate was 80.8%, and the coordinate match rate was 90.0%. These results demonstrate that automated structuring of borehole log images is achievable with practical accuracy.
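
The depth-matching F1 score reported above can be illustrated with a minimal sketch. The abstract does not state the matching criteria, so everything here is an assumption made for illustration: the function names `match_depths` and `f1_score` are hypothetical, the one-to-one nearest-depth matching rule is a stand-in, and the tolerance of 0.05 m is an arbitrary placeholder rather than the value used in the study.

```python
def match_depths(predicted, reference, tol=0.05):
    """Count one-to-one matches between predicted and reference depths (metres).

    Assumed rule (not taken from the paper): each predicted depth is paired
    with the closest still-unmatched reference depth within `tol`.
    """
    unmatched = sorted(reference)
    true_pos = 0
    for depth in sorted(predicted):
        best = None
        for ref in unmatched:
            gap = abs(ref - depth)
            if gap <= tol and (best is None or gap < abs(best - depth)):
                best = ref
        if best is not None:
            unmatched.remove(best)  # each reference depth is used at most once
            true_pos += 1
    return true_pos


def f1_score(true_pos, n_pred, n_ref):
    """Harmonic mean of precision (TP / predicted) and recall (TP / reference)."""
    precision = true_pos / n_pred if n_pred else 0.0
    recall = true_pos / n_ref if n_ref else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

For example, with made-up predicted depths `[1.15, 2.15, 3.15, 4.65]` and reference depths `[1.15, 2.15, 3.15, 4.15, 5.15]`, three depths match, giving a precision of 0.75, a recall of 0.60, and an F1 score of about 0.667.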

© 2026 Japan Society of Civil Engineers