2023 Volume 4 Issue 3 Pages 12-20
Recently, we have witnessed the rapid development of artificial intelligence technology, of which deep learning is a representative example. It is now used in a wide range of fields, and its performance is widely recognized. In particular, the development of supervised learning, such as deep learning, has progressed at a breathtaking pace, and it is also expected to be utilized in the civil engineering field. However, its progress there has lacked the steady development seen in other fields. In this study, we aim to indicate areas that should be addressed in the field of civil engineering in the future. Moreover, using the field of infrastructure maintenance and management as an example, and with a particular focus on data, this study presents the outlook for data platforms in terms of data consolidation and storage, measurement automation for data collection, and ways of linking data and knowledge. This paper is an English translation of the authors’ previous work [Chun, P. J. (2020). "A.I. in civil engineering: a roadmap for research and development." Artificial Intelligence and Data Science, 1(J1), 9-15. (in Japanese)].
The decrease in the working population has been an issue in the construction industry in recent years, and there are high hopes for artificial intelligence (AI) technology as a means of addressing the lack of human resources and improving production1). In particular, the development of supervised learning such as deep learning has progressed at a breathtaking pace, and in the field of civil engineering, there have been many studies related to structures, especially bridges, that investigate various aspects, such as the detection of cracks in concrete2)-9), understanding of deterioration factors10), detection of internal damage11)-13), detection of damage in asphalt pavement14)-16), assessment of steel corrosion17)-20), reflection of damage in 3D models21)-24), and vibration measurement25)-31). Various other studies have also been conducted, and some of them have reached the stage of practical implementation. For example, there are studies utilizing documents accumulated in civil engineering32),33), water level prediction for rivers and dams34)-38), evaluation of landslides from satellite images and aerial photographs39),40), exploration of buried objects41)-45), and construction site investigation45)-49).
Deep learning generally refers to the development of an inductive model based on a learning process that uses huge volumes of data. Particularly, when the data are comprehensive, considering all possible boundary conditions, and the input and output formats are well-defined, the effectiveness of deep learning lies in its ability to achieve highly accurate results through precise interpolation.
However, in the field of civil engineering, the amount of usable data is much smaller (probably several orders of magnitude smaller) than the quantity required as input for deep learning. The reasons for this include the following:
· Civil engineering structures are unique products due to the differing environments in which they are constructed, and there are more significant individual differences and less generality than for factory-made products;
· The cost of measurement using sensors (particularly installation costs) is generally high, and as the measuring apparatus, measurement methods, and measurement distance are determined at the discretion of the engineer, it is not possible to obtain comparable data;
· There is abundant data that cannot be disclosed at the discretion of the bridge administrator;
· Data can be manipulated for various reasons, such as budgetary constraints for countermeasures;
· Experiments to recreate the actual phenomenon are inevitably large in scale and demand huge costs.
For this reason, it is impossible to avoid the disadvantages of deep learning, including "vulnerability to situations outside the scope of learning and an inability to respond flexibly to real-world situations." The studies mentioned at the beginning are rare cases in which a large amount of data was available; few cases reach the practical level that those studies have reached. Additionally, machine learning methods are black boxes, and as it is not always clear why a certain result has been produced, they are difficult to use in scenarios where a high degree of reliability is sought. Based on the above, it is necessary to create a new type of mechanism or method to utilize deep learning and other AI technologies in the field of civil engineering. Hence, in this study, we propose the following methodology to resolve the above shortcomings: a methodology that makes effective use of accumulated knowledge and facilitates understanding of the basis for judgment by
· Sharing, management, and effective utilization of data via data platforms;
· Increasing data volume and reducing costs with automated measurement through the implementation of AI;
· Linking interpretation and inference based on knowledge and rules with machine learning.
This section uses infrastructure maintenance management as an example and describes what needs to be addressed in the field of civil engineering.
As described in the previous chapter, the difficulty of obtaining existing data in the civil engineering field is an extremely significant problem from the perspective of data utilization.
This difficulty in acquiring data may also hold true for data held within the same institution. Particularly, data may be in a paper medium or a medium such as DVD-R and stored in geographically separated business offices. For example, the following exchange may occur:
[1] The need to analyze data may arise;
[2] There may be an inquiry made through telephone or email to each location regarding where the data is;
[3] Once the location in which the data exists is confirmed, a request may be made to send it;
[4] The party receiving the request may search for the data;
[5] Once it is discovered, it may need to be sent to the requestor;
[6] On receiving the data, it may still be insufficient;
[7] They once again may inquire where the missing data is (hereafter, steps from [3] will be repeated).
In this process, it takes at least an entire day for the necessary data to reach the analyzing side, and it can even end up taking as long as one month.
If we want to achieve meaningful results with a data-driven approach, we need to conduct trial and error based on a wide variety of data types. Business intelligence (BI) tools, which have become popular in recent years, were designed from that very perspective; by analyzing and visualizing data in real time, they can improve the accuracy and speed of decision making and bring about improvements and changes in business operations. Fig. 1 shows an example using Microsoft Power BI. Considered from the perspective of machine learning, what is sought is a trial-and-error process of finding out what type of data can be used and what type of result can be produced. However, if the above exchange is required every time we collect data, it will be very difficult to take a trial-and-error approach, and enthusiasm for utilizing the data will inevitably decline. As data is required to serve as the basis for decision making, it becomes easy to choose the approach of "first making a decision and then looking for data to justify the decision in an alibi-like manner." This approach (1) justifies an invalid form of decision making and (2) demands a great deal of work to search for data. When considered in this light, it could be argued that this is even worse than groundless decision making.
Therefore, it is important to construct a data platform that can serve as the foundation for utilizing data through its collection, integration, and visualization, and to construct a system that can acquire the necessary data at the appropriate time. In particular, when considering the development of the data platform and system from the perspective of AI-enhanced data utilization, the functions to be realized are as follows:
· The types of data that can be acquired are organized in a structure, such as in a data catalog, and metadata is attached to each individual data item;
· Considering that there will be mistakes made when entering the data within the data platform, AI should propose a correction or automatically correct the data;
· AI should automatically structure unstructured data based on the user’s objectives and then suggest what data types may be required;
· The API should be maintained so that the data can be simply used on the application side.
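As a minimal illustration of the first function above, a data-catalog entry with "Who?", "When?", and "How?" metadata attached to each dataset might be sketched as follows (all field names, identifiers, and the storage URI are hypothetical, not an existing platform's schema):

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class CatalogEntry:
    """One entry in a hypothetical data catalog for an inspection data platform."""
    dataset_id: str
    description: str
    measured_by: str   # "Who?"
    measured_on: date  # "When?"
    method: str        # "How?" (instrument, procedure)
    unit: str
    storage_uri: str   # location of the raw data in the data lake

catalog = [
    CatalogEntry("br0101-crack-2022", "Crack widths, steel girder 0101",
                 "Inspector A", date(2022, 10, 3), "crack gauge", "mm",
                 "s3://lake/raw/br0101/cracks-2022.csv"),
]

def find_by_method(entries, method):
    """Simple catalog lookup: all datasets measured with a given method."""
    return [e.dataset_id for e in entries if e.method == method]

print(find_by_method(catalog, "crack gauge"))  # → ['br0101-crack-2022']
```

With such entries in place, an application could locate and retrieve data through the API without a priori knowledge of where each business office stores its files.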
As a structure for implementing the above functionality, we can consider one like that shown in Fig. 2. A data lake is a structure responsible for storing both structured and unstructured data, and a data warehouse is responsible for retrieving data from the data lake, transforming it, and storing it as structured data, as well as providing computing resources and interfaces. There are many operations where only a data warehouse (or something similar) is set up and a data lake is not. It is often the case, however, that information is irreversibly lost during the process of creating organized structured data, making it impossible to respond to changes in the data structure that are needed later and causing regret. With a data lake, all data is first placed in an unprocessed state so that the system can respond to later changes in the data structure and prevent a state in which necessary data is lost. However, if the data is simply stored, unused data will build up; therefore, it is necessary to store the data with a sense of governance. For this reason, it is necessary to apply metadata, which is information about the data. It is also important to maintain a data catalog to organize what metadata exists and how the data is structured in the data warehouse.
If the data within the data platform is structured in this way and appropriate metadata can be obtained from the data catalog, a data-driven analysis design, including AI, becomes extremely easy. Further, if the information of "Who?", "When?", and "How?" is recorded in the metadata, it becomes possible to evaluate the applicable range and reliability of the constructed AI. If data with appropriate metadata can be acquired via API based on the use case, it is expected that users will be able to perform analysis without much preprocessing, which is said to account for 80%-90% of the overall process.
Conversely, the use of AI within the data platform is also necessary for it to demonstrate high levels of performance. To construct a data warehouse from the data in a data lake, it is necessary to extract the appropriate information, structure it, and store it; however, this is by no means a simple process. Generally, this is called the extract, transform, and load (ETL) process, and, for example, it is very difficult to extract data from documents that have been converted to PDF without uniform rules. This is especially true in the case of drawings that are represented as raster data. For that reason, AI is inevitably called upon for the processing of this data. Further, mistakes sometimes occur when creating the original data (e.g., recording 20 km for a bridge that is actually 20-m wide). If the metadata is properly organized, we should be able to denote a range within which the data is likely to fall. Therefore, it is important for AI to play a role in the data platform by issuing warnings about data that falls significantly outside this range. Further, if the data in the data warehouse is structured, it is possible for AI to suggest data considered necessary for achieving the user's objectives. Thus, data platform construction and AI development are in a mutually reinforcing relationship, and the link between them is a very strong one. Although this topic has been discussed in other fields, it has not yet been discussed in the field of civil engineering, and as a unique structure that reflects the data characteristics peculiar to civil engineering is necessary, the appropriate collaboration of AI and data platforms is a topic that deserves further research in future works.
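The range-checking role described above (e.g., catching a bridge width entered as 20 km instead of 20 m) can be sketched in a few lines. This is a hypothetical illustration: the plausible range would in practice come from the platform's metadata, and the field names here are invented.

```python
# Assumed metadata: a plausible range for each field, here hard-coded
# for illustration (in practice derived from the data catalog).
PLAUSIBLE_RANGE = {"width_m": (1.0, 100.0)}

def check_record(record):
    """Return warnings for values far outside their plausible range."""
    warnings = []
    for key, (lo, hi) in PLAUSIBLE_RANGE.items():
        value = record.get(key)
        if value is not None and not (lo <= value <= hi):
            warnings.append(f"{key}={value} outside plausible range [{lo}, {hi}]")
    return warnings

print(check_record({"width_m": 20000.0}))  # 20 km mistakenly entered as metres
print(check_record({"width_m": 20.0}))     # plausible value: no warning
```

A production system would go further (statistical outlier detection, proposed corrections), but even this simple metadata-driven check would catch the example error at entry time.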
The previous chapter discussed data platforms. However, a major issue at present is that the effort and cost of acquiring data are too great. Consequently, there is often a lack of data, and data-driven analysis does not achieve the desired results. Lack of data is an issue that constantly accompanies data-driven approaches in the civil engineering field.
One way of compensating for the lack of data is data sharing among multiple researchers and institutions through the use of data platforms; however, in many cases, the data will still be insufficient. We believe that resolving this problem requires a reduction in the effort and cost of acquiring and collecting a large amount of data. One way of doing this is measurement automation through the use of robot technology.
For example, inspection automation using an unmanned aerial vehicle (UAV) is a much sought-after technology for inspecting bridges. However, it is currently not a simple matter for a UAV to fly and capture a comprehensive range of images without colliding with the bridge, because global navigation satellite system signals cannot be received under bridges. To resolve this problem, it is considered effective to utilize the data platform described in the previous chapter and use the 3D model of the bridge obtained in this way as a map for the UAV.
For example, if a 3D BIM model is created during the design stage and saved on a data platform, a 3D model of the bridge can be obtained via the API. However, most bridges only have 2D CAD drawings, and in such cases, it is necessary to convert the data from 2D drawings into 3D models. The authors are currently promoting research and development into performing this type of data conversion with AI, and once this is achievable, it will be possible to generate 3D models even when only 2D CAD drawings exist and use these as maps for the UAV.
The framework for UAV bridge self-inspection conceived by the authors, using a 3D model as a map, is shown in Fig. 3. First, the map for autonomous flight (the 3D model) is acquired from the data platform using the API ([1] in Fig. 3). Based on this map, a route is automatically generated, along which the UAV flies and photographs the bridge; the data is then transferred to the data platform ([2]-[4] in Fig. 3). A simple diagnosis of the damage is then performed using AI, such as a deep learning model, and areas requiring more detailed inspection are screened ([5] in Fig. 3). The UAV then acquires this information via the API, takes close-up images of the areas requiring detailed inspection, and transfers these images to the data platform ([6] and [7] in Fig. 3). Finally, a detailed diagnosis of the damage is performed once more using deep learning, the state of the bridge is automatically diagnosed, and the data is stored ([8] in Fig. 3). Further learning is then performed with AI based on the stored data, and performance is improved.
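The flow of steps [1]-[8] can be sketched as a simple pipeline. Every function here is a hypothetical placeholder (stubbed with dummy data), not an actual UAV or platform API; the point is the shape of the loop, in which a coarse screening decides which areas receive detailed re-photography.

```python
# Hypothetical sketch of the self-inspection loop in Fig. 3.

def fetch_3d_model(bridge_id):   # [1] acquire the map (3D model) via the API
    return {"bridge": bridge_id, "mesh": "..."}

def plan_route(model):           # [2] automatic route generation from the map
    return ["span1", "span2"]

def fly_and_photograph(route):   # [3]-[4] flight, photography, data transfer
    return [f"img_{p}.jpg" for p in route]

def screen_damage(images):       # [5] coarse deep-learning screening (stub)
    return [img for img in images if "span1" in img]  # pretend span1 is flagged

def close_up_inspection(targets):  # [6]-[7] close-up images of flagged areas
    return [f"closeup_{t}" for t in targets]

def detailed_diagnosis(images):  # [8] detailed diagnosis, result stored
    return {img: "corrosion?" for img in images}

model = fetch_3d_model("br0101")
images = fly_and_photograph(plan_route(model))
targets = screen_damage(images)
report = detailed_diagnosis(close_up_inspection(targets))
print(report)
```

In a real system the screening and diagnosis stubs would be trained models served from the platform, and the stored results would feed the further learning mentioned above.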
An example of bridge inspection using a UAV is outlined as follows. First, the robot is controlled based on the model stored on the data platform. Then, the robot acquires data and a judgment is made using AI. Subsequently, the AI issues additional instructions to the robot based on the judgment results, and the robot collects additional data and makes additional judgments using AI. The acquired data is then stored in the data platform for further learning by the AI. Hence, through AI, automated data acquisition by the robot achieves high performance and the cost of data acquisition decreases. Further, as a large amount of data is stored on the data platform, the performance of the AI increases further. This type of positive spiral is not limited to the combination of bridges and UAVs, and several patterns can be considered; thus, we are currently searching for effective patterns of utilization.
In any case, the issue of how to accumulate data on a different scale through measurement automation is extremely important from the viewpoint of being able to utilize AI in the civil engineering field.
Chapter 2 discussed the data platform required to fully utilize the obtained data, and Chapter 3 discussed measurement automation for greatly strengthening the quality and quantity of data. It is considered that this will accelerate the learning and utilization of supervised machine learning. However, it is not easy to describe physical phenomena using only inductive methods such as supervised learning. Therefore, in addition to an inductive method that builds up rules and models from the data in a bottom-up fashion, such as deep learning, it is necessary to combine top-down knowledge and rules with deductive mechanisms that interpret and reason based on the situation and context. As the field of civil engineering has a long history and an abundance of deductive tools, this approach seems particularly suited here.
For example, at Japan’s Public Works Research Institute (PWRI), a project started in 2018 involving "joint research related to improving the efficiency of road and bridge maintenance using AI." One of the topics covered by this research is how to develop diagnostic AI, which incorporates the logical thinking of experienced inspection engineers as well as foundational information into a rule-based expert system. Expert systems represented the second AI boom in the 1980s; however, they foundered due to the difficulty of dealing with the frequent exceptions that occur in daily life. However, if we consider the scope of application of diagnostic AI to be "providing support to engineers," it is not necessary for diagnostic AI to cover all exceptional events. In particular, with regard to low-risk exceptional events, it is considered that, compared with the advantages of utilizing diagnostic AI, the damage caused by overlooking such cases is not so severe. Conversely, high-risk exceptional events need to be controlled by the system and prevented from being treated as exceptions. In the case of diagnosing bridges, it is not necessarily difficult to cover the risk of bridge collapse in the system to the extent that such damage is not overlooked. Additionally, as diagnostic AI is rule-based, there is the advantage that the user can easily understand the grounds the AI uses to make a judgment, which is useful in the process of nurturing engineers.
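The transparency advantage of a rule-based approach can be made concrete with a toy sketch: each rule carries the ground for its recommendation, so the engineer always sees why a judgment was made. The rules below are invented for illustration and are not PWRI's actual rule set.

```python
# Illustrative rule base: each rule pairs a condition with a
# recommendation AND the engineering ground behind it.
RULES = [
    {"if": {"member": "steel girder", "finding": "corrosion"},
     "then": "check section loss; consider repainting",
     "ground": "corrosion of steel members reduces load-carrying capacity"},
    {"if": {"member": "floor slab", "finding": "crack"},
     "then": "check for water leakage",
     "ground": "slab cracks often allow water penetration"},
]

def diagnose(observation):
    """Return (recommendation, ground) pairs for every matching rule,
    so the basis for each judgment is visible to the engineer."""
    return [(r["then"], r["ground"]) for r in RULES
            if all(observation.get(k) == v for k, v in r["if"].items())]

for rec, ground in diagnose({"member": "steel girder", "finding": "corrosion"}):
    print(f"{rec}  (ground: {ground})")
```

Unmatched observations simply yield no recommendation, which is where the inductive methods discussed below would need to take over.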
Conversely, creating new rules for a system such as diagnostic AI requires meticulous testing, analysis, and data acquisition, an approach that can take years to complete. As a result, its development will always tend to lag behind. Further, if a rule happens to be wrong, then even if the data suggest that it is wrong, the AI may miss the opportunity to correct the rule if the given rule is treated as perfect. Additionally, although large-scale approaches are being considered for diagnostic AI for bridges, it is not realistic to take this approach for everything, including, for example, nondiagnostic work such as inspections, or other types of infrastructure. A system that combines an inductive approach, such as supervised learning, with a deductive approach, such as a rule-based system, is expected to resolve these types of issues, covering the exceptions described above and deriving accurate results.
There is not yet any consensus on a specific methodology; however, research is being vigorously pursued in the field of information engineering. The assumption is that, in addition to the rules given by humans in expert systems, there will be a need to establish a methodology for understanding causal relationships and making appropriate inferences from the data. The structure of the brain, at both the anatomical and functional levels, has been rigorously investigated in the field of neuroscience, and it is known that the human brain is a combination of widely distributed specialized modular structures and the edges connecting them. The graph is a similar mathematical structure, and the use of a graph database based on it marks an important point in the construction of a cause-and-effect system.
Many researchers and institutions think this way; for example, research on semantic networks, such as Semantic Nets in 1956, has been pursued for many years50), and similar research is still being conducted today. For example, Google proposed and constructed the Knowledge Graph51), which, as of 2012, stored more than 18 billion relationships covering over 570 million objects and has been used in its searches. To this day, active research, development, and practical application are being promoted. Fig. 4 shows an example of a knowledge graph applied to maintenance and management.
The figure only uses provisional data; to explain it, consider a circular relationship in which corrosion damage is observed in steel girder 0101 (0101 refers to the element number), and this damage causes a reduction in the bridge’s load-carrying capacity. As a result, cracks occur in floor slab 0102, which cause a water leak, which in turn leads to further corrosion in steel girder 0101. If this type of circular relationship exists between objects, it can cause a huge increase in the computational time required by relational databases. Additionally, expressing such a relationship would require incorporating a highly complex expression within the program. In contrast, with a graph database, it can be expressed precisely, and the time required for computation does not increase excessively. If this type of data can be collected on a large scale, it promises to allow the most appropriate repair method to be proposed, considering the likelihood that these arrows will occur, their degree of influence, the possibility that degradation will occur again after repair, and so on. It also promises the ability to predict degradation with high accuracy by providing edge properties such as the degree of influence and the time function until that influence is realized. Computation over graph databases is still in an evolutionary stage; however, the development of graph convolutional neural networks52),53) is taking place in various locations, and there are high hopes for their utilization.
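The circular deterioration relationship above maps directly onto a directed graph. The following sketch (node names are our own labels for the provisional data, not an actual database schema) represents the cycle with a plain adjacency structure and finds it by depth-first search, the kind of traversal a graph database performs natively:

```python
# Directed edges for the Fig. 4 example: corrosion in girder 0101
# -> capacity reduction -> slab 0102 cracks -> water leak -> corrosion again.
edges = {
    "corrosion@girder0101": ["capacity_reduction@girder0101"],
    "capacity_reduction@girder0101": ["crack@slab0102"],
    "crack@slab0102": ["water_leak@slab0102"],
    "water_leak@slab0102": ["corrosion@girder0101"],  # closes the cycle
}

def find_cycle(graph, start):
    """Depth-first search returning one cycle through `start`, if any."""
    stack = [(start, [start])]
    while stack:
        node, path = stack.pop()
        for nxt in graph.get(node, []):
            if nxt == start:
                return path + [start]
            if nxt not in path:
                stack.append((nxt, path + [nxt]))
    return None

print(find_cycle(edges, "corrosion@girder0101"))
```

In a real graph database, each edge would additionally carry properties such as degree of influence and time-to-effect, which is what enables the degradation prediction discussed above.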
Although this type of research and development has barely started in the field of civil engineering, we feel that civil engineering is a domain in its own right, and it is therefore important for us to be proactive in our efforts. For example, deep learning, currently represented by CNNs, is being developed in a domain-free manner, and while this is natural from the standpoint of information engineering, it may be one reason why the data conditions required by deep learning are not a perfect match for the field of civil engineering, and why we have not been able to utilize the results as is. We believe it is necessary to learn from such reflections: by developing and communicating systems grounded in our own domain, we can proactively broaden the scope of application in the direction of the civil engineering field.
By using these databases, we believe it will be possible to obtain information based on past cases by capturing photographs and surrounding materials. For example, Fig. 5 shows a case researched by the authors in which a judgment of the damage was made based on an image and explanatory text was generated automatically using image captioning technology40). Through the development of knowledge-linked AI, this can be developed further so that the damage status can be evaluated, and it is expected that such a system will develop to a level where it can search for and derive recommended methods based on the possible causes of damage, past examples of countermeasures, and their evaluations.
There are also other ways of linking knowledge and existing models. Established methods include data assimilation, used in the meteorological field, which fuses a model-driven approach and a data-driven approach. Further, work has also started on machine learning algorithms that incorporate mathematical models based on causal relationships, the conservation of energy, and other physical laws, with data being learned through these algorithms. We believe it will be useful to explore these and other applications in the civil engineering field in the future.
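The core idea of data assimilation, blending a model forecast with an observation in proportion to their uncertainties, can be shown with a minimal scalar Kalman-style update. The water-level numbers below are illustrative only, not taken from any of the cited studies:

```python
# Minimal data-assimilation sketch: a model forecast is corrected by
# an observation, weighted by the uncertainty (variance) of each.

def assimilate(forecast, var_f, observation, var_o):
    """Blend forecast and observation; return the analysis and its variance."""
    gain = var_f / (var_f + var_o)        # Kalman gain
    analysis = forecast + gain * (observation - forecast)
    var_a = (1.0 - gain) * var_f          # uncertainty shrinks after the update
    return analysis, var_a

# Model predicts a water level of 2.0 m (variance 0.5); a gauge reads
# 2.4 m (variance 0.1). The analysis is pulled toward the more certain
# observation, and its variance is smaller than either input's.
analysis, var_a = assimilate(2.0, 0.5, 2.4, 0.1)
print(analysis, var_a)
```

The same blending principle, applied to high-dimensional states and physical models, is what the meteorological field uses operationally, and it suggests how model knowledge and measured data could be fused for infrastructure as well.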
In this study, we discussed the outlook for bridge maintenance management as a subject, looking at management using data platforms, mass data acquisition through measurement automation, and knowledge linking as items that will be required in the future in the field of civil engineering.
It is difficult to say that data in the civil engineering field are being fully utilized; in many cases, they are only used as evidence that work is being performed. We feel that the management format makes the data difficult to handle, and this discourages people from using it. Conversely, the fact that the data is not used means that data useful for analysis are simply not collected, which leads to a negative cycle. However, as the utilization of data is a prerequisite for machine learning, it is necessary to break this negative cycle. Therefore, we first discussed the method of managing data using data platforms and looked at examples of measurement automation. Additionally, rather than focusing only on inductive methods such as machine learning, we also examined the utilization of deductive elements such as conventional knowledge, experience, and models, and presented graph databases as an example.
The methods discussed in this study are just examples, and this is a field still under development. While things may not be actually functioning yet, there is a high likelihood that the methods discussed in this study can be developed and implemented moving forward. However, unless civil engineering researchers participate in the discussion during the implementation stage, research and development may move in a direction that does not match the domain of civil engineering, and it may become difficult to utilize the latest research results in information technology. Moreover, the core of the system may be developed with reference to other fields. To avoid this, we must share a common vision for the development of AI technology and develop human resources accordingly.