-
Takayuki OKATANI
2025 Volume 6 Issue 2 Pages
1-24
Published: 2025
Released on J-STAGE: July 25, 2025
JOURNAL
OPEN ACCESS
Since 2016, the Infrastructure Management Robotics Technology Team at RIKEN has been engaged in research on the application of artificial intelligence (AI) to infrastructure maintenance and management. Over this period, AI has undergone remarkable advancements, particularly with the emergence of deep learning and foundation models. Foundation models are large-scale neural networks pretrained on vast amounts of web data, characterized by their high generalization capability and ability to be adapted to specific tasks post-training. At the core of these models are large language models (LLMs), which predict subsequent words based on the contextual understanding of given text. Recently, a paradigm shift has been observed toward reasoning models that generate responses through step-by-step logical inference. As AI systems continue to improve, they have demonstrated the ability to acquire a significant portion of the explicit knowledge articulated by humans and made publicly available online. Nevertheless, even state-of-the-art multimodal AI systems remain limited in their ability to handle tacit knowledge, embodied cognition, and intuition—forms of knowledge that are inherently difficult to verbalize—highlighting key challenges that must be addressed in future research.
-
Naoya CHIBA
2025 Volume 6 Issue 2 Pages
25-41
Published: 2025
Released on J-STAGE: July 25, 2025
JOURNAL
OPEN ACCESS
Neural Fields use neural networks to model the "function values corresponding to each point in space" and describe shapes as functions that represent surfaces. DeepSDF, one of the surface models built on Neural Fields, represents smooth surfaces and models complex shapes with a simple network, and it revolutionized 3D representation. However, it had limitations. The volumetric model NeRF, introduced later, made it possible to generate images from arbitrary viewpoints using only multi-view images. It models optical phenomena with density and radiance, allowing radiance to vary with direction and enabling the handling of scenes that involve scattering, such as smoke. The network takes coordinates and viewing direction as input and outputs density and color. However, because the computational cost grows with the number of spatial queries, techniques such as spatial decomposition and splitting NeRF were developed to achieve fast learning and inference. Finally, an applied research example on which the author collaborated is introduced.
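The density-and-radiance model mentioned above can be illustrated with the discrete volume-rendering sum commonly used with NeRF. This is a minimal sketch, not the author's implementation: the function name `render_ray` and the sample values are hypothetical, and a real system would obtain densities and colors from the trained network.

```python
import math

def render_ray(densities, colors, deltas):
    """Composite samples along one ray (front to back).

    densities: volume density sigma at each sample
    colors:    (r, g, b) radiance at each sample
    deltas:    distance between adjacent samples
    """
    transmittance = 1.0          # fraction of light not yet absorbed
    pixel = [0.0, 0.0, 0.0]
    for sigma, rgb, delta in zip(densities, colors, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)   # opacity of this segment
        weight = transmittance * alpha           # contribution to the pixel
        pixel = [p + weight * c for p, c in zip(pixel, rgb)]
        transmittance *= 1.0 - alpha
    return pixel, transmittance

# Hypothetical ray: empty space followed by an opaque red surface.
pixel, remaining = render_ray(
    densities=[0.0, 1e6],
    colors=[(0.0, 1.0, 0.0), (1.0, 0.0, 0.0)],
    deltas=[0.1, 0.1],
)
```

Because every sample along every ray requires a network query, this loop also shows why the cost grows with the number of spatial queries.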
-
Hiroharu KATO
2025 Volume 6 Issue 2 Pages
42-50
Published: 2025
Released on J-STAGE: July 25, 2025
JOURNAL
OPEN ACCESS
3D data is useful not only for design but also for digital twins, entertainment, and other applications. However, the barrier to creating 3D data using 3D modeling tools is high, and the resources required are significant. To address this, 3D capture and 3D model generation technologies are being explored. Among these, 3D capture, which mechanically captures real-world spatial information and converts it into 3D data, has seen significant improvements in quality through methods inspired by deep learning. Both deep learning-based image recognition and photogrammetry owe their success not to "going through several independently designed modules to reach the output" but to "directly focusing on the desired output and optimizing the recognition model/3D model." The technology for generating new 3D models from linguistic instructions is an extension of image generation but is not yet fully mature, and performance improvements are expected in the future.
-
Ryota OKAUCHI, Pang-jo CHUN
2025 Volume 6 Issue 2 Pages
51-61
Published: 2025
Released on J-STAGE: July 25, 2025
JOURNAL
OPEN ACCESS
High-accuracy point cloud segmentation is essential for understanding the 3D situation of construction sites. However, acquiring large-scale and diverse annotated point cloud datasets required for training deep learning models from real-world construction sites is extremely difficult due to cost and labor constraints. To address this challenge, this study proposes a method leveraging a Unity-based virtual environment simulator (OperaSim-PhysX) to efficiently and automatically generate diverse scenes including various terrains and construction machinery. A large-scale supervised synthetic point cloud dataset was constructed by capturing RGB and depth images within the virtual environment and automatically creating annotation maps by synthesizing mask images for each object. A PointNet++ model was trained using only this synthetic dataset. To evaluate its applicability to real environments, the model was applied to real-world point cloud data of a construction site obtained via drone-based Structure from Motion (SfM). The experimental results indicate that the model trained solely on virtual data shows potential for recognizing the types and locations of construction machinery in real-world point clouds to some extent, especially exhibiting promisingly high recall for heavy machinery (low miss rate). However, challenges arising from the domain gap between virtual and real environments (Sim-to-Real gap), such as color similarity with the background and detection accuracy for small objects poorly reconstructed by SfM, were also clarified. Based on these findings, directions for future improvements to enhance accuracy are discussed.
-
Kenta ITAKURA, Takuya HAYASHI, Yoshito SAITO, Pang-jo CHUN
2025 Volume 6 Issue 2 Pages
62-72
Published: 2025
Released on J-STAGE: July 25, 2025
JOURNAL
OPEN ACCESS
In this study, we developed a method leveraging sensor fusion technology that combines camera images and LiDAR point clouds obtained using Matterport for efficient inspection of rust in conduit tunnels. Conventional methods relying solely on point cloud data showed difficulty in detecting rust due to its minimal surface irregularities and shape changes. In contrast, high-precision rust detection was achieved by utilizing differences in color and texture from camera images, enabling the estimation of rust locations within tunnels. By integrating information from images and LiDAR, it became possible to calculate values such as rust location and area, which are difficult to estimate from images alone. The sensor fusion approach accurately estimated the distance per image pixel, achieving a mean absolute error of 1.3×10⁻³ m and a mean absolute percentage error (MAPE) of 6.7%.
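For readers unfamiliar with the two error measures reported above, the following sketch shows how a mean absolute error and a MAPE are computed. The sample values are hypothetical and are not taken from the study.

```python
def mean_absolute_error(true_values, estimates):
    """Average of |true - estimate|, in the same unit as the inputs."""
    return sum(abs(t - e) for t, e in zip(true_values, estimates)) / len(true_values)

def mean_absolute_percentage_error(true_values, estimates):
    """Average of |true - estimate| / |true|, expressed in percent."""
    ratios = [abs((t - e) / t) for t, e in zip(true_values, estimates)]
    return 100.0 * sum(ratios) / len(ratios)

# Hypothetical distance-per-pixel values in metres (illustration only).
true_m = [0.020, 0.010]
est_m = [0.021, 0.009]
mae = mean_absolute_error(true_m, est_m)
mape = mean_absolute_percentage_error(true_m, est_m)
```

The absolute error carries the unit of the measurement, whereas MAPE is relative, so small true values inflate it.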
-
Takumi MURAI, Kenta ITAKURA, Riku MIYAKAWA, Sota KUDO, Kosuke MURAISHI ...
2025 Volume 6 Issue 2 Pages
73-84
Published: 2025
Released on J-STAGE: July 25, 2025
JOURNAL
OPEN ACCESS
Since tofu quality and waste rates depend on coagulation conditions, technologies for predicting viscosity during the coagulation process are needed. This study focused on changes in light scattering during the coagulation process of tofu, which is a colloidal food. Laser scattering images were sequentially collected under various coagulation conditions to analyze these changes. To estimate the viscosity from the obtained images, regression models using pre-trained convolutional neural networks were constructed. Additionally, Long Short-Term Memory (LSTM) models were implemented to learn the temporal dependencies in the coagulation process, as tofu coagulation is a time-dependent phenomenon. Our model achieved an RMSE of 3.38 mPa·s, demonstrating the effectiveness of viscosity estimation using laser scattering images over the first 5 minutes after coagulation began. These results suggest potential contributions to quality improvement and waste reduction through real-time feedback of tofu coagulation status.
-
Tetsu KATO, Aiko FURUKAWA, Tomohiro TAKEICHI
2025 Volume 6 Issue 2 Pages
85-94
Published: 2025
Released on J-STAGE: July 25, 2025
JOURNAL
OPEN ACCESS
In bridge cable maintenance, a method for indirectly estimating tension from natural vibration characteristics obtained using Fast Fourier Transform (FFT) has been widely adopted. However, to achieve reliable cable maintenance, improving the estimation accuracy of natural vibration characteristics is essential, which necessitates the selection of an appropriate estimation method. This study focuses on the Frequency Domain Decomposition (FDD) method, which estimates natural vibration characteristics using only structural response data, and evaluates its estimation accuracy in comparison with the FFT method. Through validation based on cable experiments, it was confirmed that the FDD method provides higher accuracy in estimating mode shapes, particularly for higher-order modes. Furthermore, using the estimated natural vibration characteristics, cable tension and bending stiffness were determined and compared with true and design values. The results showed that the FDD method achieved superior accuracy in estimating both cable tension and bending stiffness. From these findings, it was confirmed that replacing the FFT method with the FDD method improves the estimation accuracy of cable natural vibration characteristics, leading to enhanced accuracy in estimating cable parameters.
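The link between natural frequency and tension can be illustrated with the classical taut-string relation f_n = (n / 2L)·sqrt(T / m). This is a simplified sketch that ignores bending stiffness and sag, both of which the study accounts for, so it is not the authors' estimation procedure. The function name and the numerical values are hypothetical.

```python
def tension_from_frequency(freq_hz, mode, length_m, mass_per_length):
    """Taut-string estimate: T = 4 * m * L^2 * (f_n / n)^2, in newtons.

    freq_hz:          natural frequency of the given mode [Hz]
    mode:             mode order n (1 = fundamental)
    length_m:         cable length L [m]
    mass_per_length:  mass per unit length m [kg/m]
    """
    return 4.0 * mass_per_length * length_m ** 2 * (freq_hz / mode) ** 2

# Hypothetical cable: 100 m long, 50 kg/m, fundamental frequency 1.0 Hz.
tension_n = tension_from_frequency(freq_hz=1.0, mode=1, length_m=100.0,
                                   mass_per_length=50.0)
```

Because tension scales with the square of the frequency, a small error in the identified frequency is roughly doubled in relative terms in the tension estimate, which is why the accuracy of the identification method matters.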
-
Takafumi KITAOKA
2025 Volume 6 Issue 2 Pages
95-101
Published: 2025
Released on J-STAGE: July 25, 2025
JOURNAL
OPEN ACCESS
The business applications of large language models (LLMs) have been expanding rapidly, with new technologies being introduced continuously. Recently, Retrieval-Augmented Generation (RAG) has gained attention, and using it effectively requires an understanding of vector databases. However, for individuals without specialized technical expertise, the cognitive burden can be high, making practical implementation challenging. In this study, we focus on a subset of soil classification names used in geotechnical engineering and applied geology. We employ BERT to visualize the relationships among these terms by mapping them in three dimensions. Additionally, ChatGPT-4o and ChatGPT o3-mini-high were used to infer the meanings of the three axes, prompting the AI to reconsider the relationships in light of geotechnical knowledge. As a result, we propose a method that enables dynamic updates to soil classification correlations without incurring high development costs.
-
Tahiro SEKI, Takafumi KITAOKA, Kazuo SAKAI, Shuntarou MIYANAGA
2025 Volume 6 Issue 2 Pages
102-107
Published: 2025
Released on J-STAGE: July 25, 2025
JOURNAL
OPEN ACCESS
In recent years, the application of AI to mountain tunnel face evaluation has been expanding, with particular attention given to classification using face images. However, class imbalance in training data poses a challenge, resulting in reduced classification accuracy. In this study, to address class imbalance, we utilized generative AI to introduce weighted cross-entropy and trained an Artificial Neural Network (ANN). As a result, improvements in classification accuracy were observed for certain evaluation criteria compared to conventional methods, suggesting the effectiveness of data correction using generative AI. On the other hand, accuracy degradation was noted for specific evaluation items, indicating room for improvement. This study demonstrates the potential of generative AI as a method for addressing class imbalance and presents a new approach for in-house AI development.
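A weighted cross-entropy loss of the kind mentioned above can be sketched as follows. The abstract does not state how the class weights were chosen, so the inverse-frequency weighting here is an assumption, and all names and values are hypothetical.

```python
import math

def inverse_frequency_weights(labels, num_classes):
    """Assumed scheme: weight_c = N / (num_classes * count_c)."""
    total = len(labels)
    counts = [labels.count(c) for c in range(num_classes)]
    return [total / (num_classes * n) if n else 0.0 for n in counts]

def weighted_cross_entropy(probabilities, labels, weights):
    """Weighted mean of -log p(true class) over the samples."""
    loss_sum = 0.0
    weight_sum = 0.0
    for probs, label in zip(probabilities, labels):
        loss_sum += -weights[label] * math.log(probs[label])
        weight_sum += weights[label]
    return loss_sum / weight_sum

# Hypothetical imbalanced labels: class 1 is rare.
labels = [0, 0, 0, 1]
weights = inverse_frequency_weights(labels, num_classes=2)
probabilities = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3], [0.6, 0.4]]
loss = weighted_cross_entropy(probabilities, labels, weights)
```

With these weights, a misclassified sample of the rare class contributes three times as much to the loss as one from the frequent class, which counteracts the imbalance during training.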
-
Risa IIDA, Kohei NAGAI
2025 Volume 6 Issue 2 Pages
108-119
Published: 2025
Released on J-STAGE: July 25, 2025
JOURNAL
OPEN ACCESS
Bridge deterioration in Japan is advancing, and in Hokkaido, about 20,000 bridges are municipally managed under limited personnel and budgets. This study uses GIS to analyze spatial deterioration trends based on inspection data and examines the impact of geographical, environmental, and social factors on bridge health. The results show that "year of construction, road management municipality, and superstructure (material used)" are key deterioration factors, with trends varying by bridge type. Notably, RC bridges are more affected by municipal financial capacity and staff numbers. As they are often small and infrequently used, municipalities with tight budgets may delay repairs and maintenance.
-
Kenta HAKOISHI, Daisuke SUGETA, Masayuki HITOKOTO, Hiroyuki FURUKI, At ...
2025 Volume 6 Issue 2 Pages
120-127
Published: 2025
Released on J-STAGE: July 25, 2025
JOURNAL
OPEN ACCESS
In recent years, the technology of large language models (LLMs) has been advancing, but there are challenges such as the lack of knowledge in specific fields. To address this issue, retrieval-augmented generation (RAG) has been gaining attention. Many RAG technologies are realized through embedding representation models, but there are challenges in customization and cost-effectiveness for specific fields such as civil engineering. In this study, we constructed customizable models and small-scale models for adaptation to the civil engineering field through contrastive learning on OpenAI’s text-embedding-3-large. The accuracy verification results confirmed the effectiveness of the models in adapting to the civil engineering field and enabling small-scale, fast computation.
-
Sanae KAN, Satoshi YAMANAKA, Keita MATSUMOTO
2025 Volume 6 Issue 2 Pages
128-137
Published: 2025
Released on J-STAGE: July 25, 2025
JOURNAL
OPEN ACCESS
Mountain tunnel excavation work basically involves repeating the same tasks. For this reason, managing cycle time, which is the working time per cycle, is extremely important for improving productivity. However, measuring cycle time is a time-consuming process. Therefore, the authors developed a method to calculate it automatically using AI. To improve the AI’s judgment accuracy, it must learn a certain amount of information from multiple work sites, but the amount of learning required is unknown. In this paper, the authors examined the relationship between the amount of learning and judgment accuracy in blast excavation and mechanical excavation, which are typical excavation patterns for mountain tunnels. The results showed that increasing the amount of learning improved the overall judgment accuracy in blast excavation but did not improve it much in mechanical excavation, indicating that there is room for improvement in the judgment method.
-
Rina SAKAYA, Kotaro SASAI, Kiyoyuki KAITO
2025 Volume 6 Issue 2 Pages
138-149
Published: 2025
Released on J-STAGE: July 25, 2025
JOURNAL
OPEN ACCESS
In Japan, bridge deterioration is progressing, and local governments are facing challenges such as a shortage of engineers and financial constraints. Therefore, it is necessary to make effective use of limited resources and determine repair priorities based on statistical deterioration prediction. In this study, we adopt the concept of regional infrastructure group management, specifically focusing on wide-area collaboration, which has gained attention in recent years. By integrating bridge inspection data from multiple management entities and applying a mixed Markov deterioration hazard model, we estimate the expected service life for each management entity. Furthermore, when the condition assessment criteria differ among management entities, data integration becomes challenging. To address this issue, we propose a hidden Markov deterioration hazard model that formulates the relationship between condition ratings before and after integration, thereby enabling data integration and achieving more accurate deterioration prediction.
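How an expected service life follows from hazard rates can be sketched under a common simplifying assumption: each condition rating has a constant (exponential) hazard rate θ_i, so the expected time spent in that rating is 1/θ_i, and an entity-specific heterogeneity parameter ε scales all rates. The function name and the numbers are hypothetical, and the estimation of θ_i and ε from inspection data is not shown.

```python
def expected_service_life(hazard_rates, heterogeneity=1.0):
    """Sum of expected sojourn times 1 / (epsilon * theta_i).

    hazard_rates:   theta_i for each non-terminal rating [1/year]
    heterogeneity:  epsilon for one management entity
                    (> 1 means faster deterioration than the benchmark)
    """
    return sum(1.0 / (heterogeneity * theta) for theta in hazard_rates)

# Hypothetical hazard rates for three non-terminal ratings.
rates = [0.1, 0.2, 0.25]
benchmark_life = expected_service_life(rates)                   # epsilon = 1
fast_entity_life = expected_service_life(rates, heterogeneity=2.0)
```

Under this assumption, comparing ε across management entities directly ranks their expected service lives, since the life is inversely proportional to ε.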
-
Satoshi KUBOTA, Riichi SASAI, Yoshihiro YASUMURO
2025 Volume 6 Issue 2 Pages
150-158
Published: 2025
Released on J-STAGE: July 25, 2025
JOURNAL
OPEN ACCESS
In urban spaces, projects are underway to link real and cyber spaces with the aim of creating digital twins. However, there have been few efforts to accumulate data for digital twins in the construction field. In this study, with the aim of building a digital twin of a construction space, we sought to obtain workers’ position coordinates in three-dimensional space and to visualize in three dimensions the position information that has so far been visualized only in two. We used three beacons to estimate each worker’s position coordinates and built a system that displays them within three-dimensional point cloud data. Experiments conducted in a simulated indoor construction environment verified the functionality of the proposed method and showed that positioning and visualization in three-dimensional space can be performed with the three beacons.
-
Natsumi SHINCHI, Kento FUKUZAWA, Kohei NAGAI
2025 Volume 6 Issue 2 Pages
159-172
Published: 2025
Released on J-STAGE: July 25, 2025
JOURNAL
OPEN ACCESS
In Hokkaido, in addition to social challenges such as population decline, an aging society, and financial difficulties, the age-related deterioration of infrastructure has become severe. Efficient management is urgently needed because capital for maintenance is insufficient, but situations vary among municipalities and many factors must be considered, making it difficult to develop uniform countermeasures. Therefore, this research aimed to establish a methodology for proposing effective maintenance and management guidelines adapted to different municipal situations by classifying and analyzing municipalities quantitatively and by comprehensively incorporating diverse elements. We categorized the 179 municipalities in Hokkaido using a wide range of open data describing current social and infrastructure conditions. By visualizing the results and analyzing the data, we clarified the current situation and challenges of each municipal group, including degrees of social difficulty, characteristics of maintenance and management systems, and impacts on surrounding areas. These results are expected to advance the sharing of challenges among municipalities in similar situations and to serve as quantitative indicators for policy formulation by national and prefectural governments.
-
Chinami FUKUI, Chang WANG, Ziwen LAN, Guang LI, Akihiro TAMURA, Tsuyos ...
2025 Volume 6 Issue 2 Pages
173-178
Published: 2025
Released on J-STAGE: July 25, 2025
JOURNAL
OPEN ACCESS
In recent years, efficient infrastructure management has become increasingly important due to a decline in the number of engineers. However, structural drawings required for inspections have not been integrated into a centralized database, highlighting the need for an efficient method to convert scanned drawings into CAD data. This study proposes a method for text recognition in drawings by integrating a text detection model (FCENet) with a Large Multimodal Model (LMM, GPT-4o) to facilitate CAD conversion. Experimental results demonstrate that the proposed method, which first detects digit locations using the text detection model and then inputs individual text detection results into the LMM while minimizing the influence of background noise and unnecessary lines, reduces the burden on the LMM to infer digit positions. This approach enables more stable and accurate text recognition. Furthermore, updates to the LMM model play a crucial role in improving text recognition accuracy in drawings, and future adoption of more advanced models is expected to further enhance accuracy.
-
Masami ABE, Takumi TAKAHASHI, Akihiro KOBAYASHI, Katsumi TAKAYAMA, Shi ...
2025 Volume 6 Issue 2 Pages
179-187
Published: 2025
Released on J-STAGE: July 25, 2025
JOURNAL
OPEN ACCESS
The Physics-Informed Diffusion Model (PIDM) is a surrogate model that utilizes diffusion models to generate numerical simulation results with physical law constraints. In scenarios with abundant numerical simulation data, PIDM shows promise for accurately representing complex phenomena such as eddies. Video generation using diffusion models faces several constraints, including: (a) the substantial memory requirements necessitating high-performance computing resources, and (b) the necessity for specialized training mechanisms, such as guidance, to produce results that align with observations.
In this study, we propose "Physics-Informed Repaint," a novel method that enables the learning and generation of videos aligned with observed data without relying on any pre-trained foundational models. This is achieved by: (i) integrating "Repaint," a technique that facilitates video generation consistent with observations without the need for pre-training guidance, into a diffusion model; and (ii) developing a method to impose Physics-Informed loss constraints solely during the generation phase. Notably, our proposed method requires only the training of an unconditional diffusion model, given that the computational results are available.
We validated the effectiveness of our proposed method for interpolating an 8-day period of missing observations within the highly nonlinear region of the Kuroshio-Oyashio Current Extension, characterized by numerous eddies, using computational results from a hydrodynamic model (DREAMS). Demonstrating the feasibility of our approach, we achieved current prediction on a consumer-grade, single-board GPU machine for a sequence of computational results with approximately 100,000 mesh elements. Furthermore, we analyzed the role of the Physics-Informed loss in our model.
-
Riku OGATA, Junichi OKUBO, Masahiro OKANO, Junichiro FUJII
2025 Volume 6 Issue 2 Pages
188-198
Published: 2025
Released on J-STAGE: July 25, 2025
JOURNAL
OPEN ACCESS
Large Language Models (LLMs) are known to perform poorly in highly specialized domains such as civil engineering. Methods such as Retrieval-Augmented Generation (RAG) have been proposed to address this problem and are increasingly being used in civil engineering. However, some problems may not be solvable with current RAG, for example when the number of target documents is very large. Recently, LLM-based agents have been applied in a growing number of fields, and their use is also expected in civil engineering, e.g., for exhaustive knowledge search. In this paper, a comparative analysis of RAG and an LLM-based agent is performed to evaluate their applicability. The results show that although the RAG method achieved better performance, the LLM-based agent method was able to refer to more appropriate parts of documents than the RAG method, indicating its potential for use in civil engineering. Finally, we summarise the issues that need to be addressed when using LLM-based agents.
-
Daisuke TAJIMA, Ryota ISHIZU, Izuru KURONUMA, Kenniti HAMAMOTO, Yutaka ...
2025 Volume 6 Issue 2 Pages
199-211
Published: 2025
Released on J-STAGE: July 25, 2025
JOURNAL
OPEN ACCESS
Research and development of next-generation automated construction systems centered on automation technology for construction machinery is underway. Among these, automated bulldozers perform the task of spreading materials that have been unloaded by large dump trucks. By understanding the shape and volume of the unloaded materials, further improvements in the accuracy of the spreading operation can be expected. When the unloaded materials were measured with sensors installed on an automated bulldozer, it was difficult to determine their shape and volume accurately. This is because the materials are large in quantity and piled in a mound, making it impossible for the automated bulldozer to measure the rear part of the pile. Therefore, we developed a method to estimate the shape and volume of unloaded materials using deep learning. Precision verification and demonstration experiments at a construction site showed an average height error of 0.049 m for the shape and a volume error of 5%, confirming the usefulness of the method.
-
Kotaro ASANO, Yu OTAKE
2025 Volume 6 Issue 2 Pages
212-220
Published: 2025
Released on J-STAGE: July 25, 2025
JOURNAL
OPEN ACCESS
Machine learning technology has developed rapidly in recent years, and has made great achievements in the field of civil engineering. Among them, Recurrent Neural Networks (RNNs) are highly effective in learning time-series data by storing the latent dynamic behavior of the system internally, and are expected to be applied to soil seismic response. On the other hand, such deep learning models lack interpretability, and there are issues in evaluating their reliability. In this study, we apply operator interpretation based on Dynamic Mode Decomposition with Control (DMDc) to RNNs, decompose and visualize the operators obtained by expanding the RNN model into modes, and use the operators generated by DMDc for learning, aiming for RNNs to acquire modes with better properties. The results of the study show that the goal of improving interpretability can be achieved while maintaining the same level of prediction performance compared to the case where DMDc is not considered.
-
Ryuto YOSHIDA, Junichi OKUBO, Junichiro FUJII
2025 Volume 6 Issue 2 Pages
221-229
Published: 2025
Released on J-STAGE: July 25, 2025
JOURNAL
OPEN ACCESS
In the field of computer vision, Multiple Object Tracking (MOT) methods have been extensively studied. These studies typically evaluate tracker performance using benchmark datasets such as MOT17. In field studies, the use of MOT for automating traffic surveys is also becoming more common. While benchmark evaluations provide a reference point for selecting a tracker in field applications, the characteristics of videos captured for such surveys often differ from those in benchmark datasets. As a result, the performance reported in MOT studies may not always generalize well to field applications. To address this issue, we construct a dataset with ground truth annotations following the MOT Challenge format using video footage exclusively captured in sidewalk environments, specifically for automated traffic surveys. Using this dataset, we evaluate the performance of trackers in sidewalk environments. Furthermore, based on the evaluation results, we analyze the sources of errors and explore potential improvements to enhance tracking accuracy.
-
Yuta TAKAHASHI, Airu TAKASE, Keigo MATSUSHIMA
2025 Volume 6 Issue 2 Pages
230-236
Published: 2025
Released on J-STAGE: July 25, 2025
JOURNAL
OPEN ACCESS
In civil engineering, constructing a digital twin requires a large amount of data. Outputs derived from data science, including AI, are reproduced by associating them with coordinate information in physical space. Since the various data sources tied to coordinate information must be accurate, modifying and supplementing the digital twin by cross-verifying data using multiple methods should be considered. This study improves on a previous verification method. The variety of road signs detected in images is increased, and the relation between the position of a traffic sign and the road edge is defined by azimuth. These changes improve the estimation of the restricted areas addressed in the previous study.
-
Yushi TOMOEDA, Gakuho WATANABE, Elfrido Elias Tita, Shogo NISHINO
2025 Volume 6 Issue 2 Pages
237-246
Published: 2025
Released on J-STAGE: July 25, 2025
JOURNAL
OPEN ACCESS
In recent years, the use of satellite technologies such as GNSS and SAR has attracted increasing attention as a method for displacement measurement in infrastructure structural health monitoring (SHM) utilizing ICT and IoT. Although the removal of noise from observational data remains a critical challenge, these technologies enable continuous, long-term, and three-dimensional deformation monitoring, offering significant promise for bridge maintenance and early anomaly detection.
This study focuses on a five-span continuous curved steel box girder bridge and investigates three-dimensional displacements measured using RTK-GNSS. A noise reduction technique based on Kalman filtering was developed to improve measurement accuracy, and the resulting data were used to analyze structural deformation behavior. Notably, the GNSS data revealed abnormal deformation trends during the summer, prompting the development of a detection mechanism to identify such anomalies in real time. This paper presents the methodology, implementation, and evaluation of the proposed system, demonstrating the potential of GNSS-based monitoring for advanced bridge lifecycle management.
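The idea of Kalman-filter noise reduction can be sketched for a single displacement component. This is a minimal scalar example that assumes a random-walk state model. The abstract does not describe the state model or tuning used in the study, so the function name, variances, and data are hypothetical.

```python
def kalman_filter_1d(measurements, process_var, measurement_var):
    """Scalar Kalman filter with a random-walk state model.

    process_var:      assumed variance of the true displacement change per step
    measurement_var:  assumed variance of the GNSS measurement noise
    """
    estimate = measurements[0]       # initialise with the first observation
    error_var = measurement_var
    filtered = [estimate]
    for z in measurements[1:]:
        error_var += process_var                        # predict
        gain = error_var / (error_var + measurement_var)
        estimate += gain * (z - estimate)               # update with residual
        error_var *= 1.0 - gain
        filtered.append(estimate)
    return filtered

# Hypothetical data: constant 10 mm displacement with alternating +/-1 mm noise.
noisy = [10.0 + (1.0 if i % 2 else -1.0) for i in range(50)]
smoothed = kalman_filter_1d(noisy, process_var=1e-4, measurement_var=1.0)
```

The ratio of the two variances controls the trade-off: a small `process_var` gives heavy smoothing but a slow response to genuine deformation, which matters when the goal is real-time anomaly detection.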