2025, Vol. 6, No. 1, pp. 1-7
This study addresses the labor shortage and aging workforce in Japan’s construction industry by exploring the effectiveness of training with digital tools that visualize the “knack” of skills through superimposed eye-gaze points during a machine excavation task. A gaze-point video of an excavator operator at work was recorded prior to the test. A web-based questionnaire test was then conducted, with the Test Group watching the video with superimposed eye-gaze points and the Control Group watching the same video without gaze points. While overall responses were similar between the groups, question-level analysis revealed differences, suggesting that the eye-gaze visualization method is potentially effective, although further work is needed to improve how these insights are expressed and conveyed.
After peaking in 1992, investment in the construction industry declined sharply until 2011 and has since recovered, reaching approximately 67 trillion yen in 2022. However, the number of workers in the industry has decreased steadily, particularly among skilled technicians. The number of construction technicians declined from 4.55 million in 1997 to 3.02 million in 2022 1) (Fig. 1).
Furthermore, 35.9% of construction workers are aged 55 or older, while only 11.7% are under 29, indicating a more pronounced aging of the workforce than in other industries 1) (Fig. 2).
Indeed, it is now common to see construction technicians in their 70s still working on-site. This demographic shift presents a major challenge in transferring skills and knowledge to the next generation.
Moreover, this imbalance between labor demand and supply has made Japanese infrastructure increasingly vulnerable. For instance, a large number of infrastructure assets, such as bridges and sewers, require maintenance or full-scale renovation due to aging-related deterioration 2)3). However, workforce shortages have resulted in a significant backlog of projects, escalating the risk of serious infrastructure failures. As a result, human resource management has become one of the most pressing concerns in the construction industry.
Many critical construction skills rely on tacit knowledge, acquired through extended observation and hands-on experience. However, the slow pace of skill development, combined with demanding working conditions, has made younger generations reluctant to enter the construction industry. If tacit knowledge can be converted into explicit, structured knowledge, it would be possible to accelerate skill acquisition, enhance construction expertise, and attract a new generation of workers. This study is part of an industry-academia collaborative research project, ongoing since 2022, that aims to visualize and quantify the “knack” of skills by integrating ergonomics, design engineering, data science, and civil engineering.
The study focuses on excavation work, a critical aspect of construction in which underground utility lines such as electricity, water, and gas pose significant risks of damage. Although management records exist, precise pre-construction data is often unreliable due to ongoing repairs and replacements by different utility companies. While automation and unmanned operation are advancing in construction to sustain infrastructure amid an aging and declining population, human involvement remains essential, particularly in urban areas where excavation requires careful, experience-based judgment.
Skill transfer research has been conducted in various fields, including nursing, traditional crafts, and manufacturing. However, few studies have addressed the entire process, from visualizing tacit knowledge to developing and validating digital tools for training 4). As the first step of our study, this test explores the effectiveness of digital tool-based training by superimposing the gaze points of skilled workers during excavation, since clarifying focal points is thought to be key to efficient work 5) and skill proficiency 6)7)8).
The conclusion contributes to the development of AR-based training methods aimed at enhancing safety, efficiency, and knowledge retention in excavation work.
This study was conducted among 85 randomly selected students from different universities whose majors were not directly related to construction, in order to minimize variance in prior knowledge of excavation. The participants, who were assumed to have neither interest in nor prior experience of the construction industry, were divided into a Control Group and a Test Group.
By selecting individuals with no background in construction, the impact of AR-based training methods can be assessed under impartial conditions. If significant positive results are observed even among participants with minimal interest or experience, this suggests that the tool would also be useful for young technicians who already have some familiarity with the field.
Prior to the test, a gaze-point video of a skilled worker during excavation was recorded at our training facility using the EMR-9 (Fig. 3). The EMR-9 is an eye-tracking device that can record the movement and dilation of a subject’s pupils.
Our training facility is a replica of an actual excavation site, with different types of underground pipes installed, created for young technicians to practice excavation work (Fig. 4).
Afterward, the video was edited using EMR dFactory, companion software for the EMR-9 that enables time-series quantitative analysis of gaze points. For this test, we used a tool that highlights items in the subject’s field of vision in different colors (Fig. 5).
The test used a web questionnaire (Fig. 6) that included a video showing the field of view, during machine excavation, of a skilled worker with 26 years of experience as a construction technician. The questions were provided in both English and Japanese, and students chose their preferred language.
Students were allowed to watch the video several times, but once they had finished watching it, they proceeded to answer the questions. There were two types of questions: (1) multiple-choice questions and (2) a free short-answer question. For the first nine questions, students could answer “Yes”, “No”, or “Not sure”; the last question was an open-ended short-answer question asking about the knacks of construction workers, based on the video.
The 10 questions were as follows:
Q1. During excavation, the operator works by focusing on the caterpillar tracks to avoid falling into the hole.
Q2. Operators check the angles of the arm and boom to ensure excavation at the appropriate depth during excavation.
Q3. When excavating near the area of underground pipes, the operator focuses only on avoiding contact between the bucket and the underground pipes.
Q4. The operator performs excavation based on instructions (visual information) from an assistant.
Q5. When both big and small-diameter underground pipes are within the field of view, the operator always focuses on the small-diameter underground pipe, which is easier to break while excavating.
Q6. Operators do not need to focus on the walls because excavation is carried out only within the designated area enclosed by sheet piles.
Q7. Before turning the heavy machinery, operators check the position of the dump truck and then load the soil.
Q8. When turning the heavy machinery, operators always keep an eye on the bucket.
Q9. During excavation, operators carefully monitor the area around the boom because it becomes the blind spot.
Q10. Based on the video, what do you think are the “hidden knacks” of heavy machinery excavation?
The Control Group was given only the video of the worker’s field of view without gaze points, while the Test Group was given the same video with eye-gaze points superimposed (Fig. 7).
Accuracy rates were compared between the Control and Test Groups using Welch’s t-test. Group comparisons of accuracy rates for each question were analyzed using the χ² test.
Differences between the Control and Test Groups in the distribution of the responses “Yes”, “No”, and “Not sure”, as well as of the aggregated categories “Yes/No” and “Not sure”, were also analyzed using the χ² test. In all cases, the significance level was set at 5%. Additionally, responses to the open-ended question were analyzed using text mining with KH Coder, including the extraction of frequently occurring terms and co-occurrence network analysis.
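As an illustration of the group comparison described above, the following is a minimal sketch of the Welch statistic and its Welch-Satterthwaite degrees of freedom. The function name `welch_t` and the sample values are hypothetical and are not the study data; the sketch is not the analysis code used in this study.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t-test."""
    na, nb = len(a), len(b)
    # Squared standard errors of each group mean (sample variance / n)
    va, vb = variance(a) / na, variance(b) / nb
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Hypothetical per-participant accuracy rates (%), NOT the study data
control = [11.1, 22.2, 33.3, 44.4, 55.6]
test = [22.2, 33.3, 33.3, 44.4, 66.7, 11.1]
t, df = welch_t(control, test)
print(f"t = {t:.3f}, df = {df:.1f}")
```

Converting t and df into a p-value requires the Student t distribution, which is available in statistical libraries; for example, `scipy.stats.ttest_ind(a, b, equal_var=False)` performs the same test.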
(1) Multiple choice questions
The mean accuracy rate was (31.4±28.5)% for the Control Group and (31.0±24.4)% for the Test Group. The distribution of answers is shown in Fig. 8 and Fig. 9.
In general, the accuracy rates and score distributions of the two groups demonstrated comparable performance levels. Although the Test Group exhibited a higher frequency of participants with 5-6 correct answers, the Control Group had more participants with 3-4 correct answers.
The percentage of participants with correct answers per question and the score distribution are shown in Fig. 10 and Fig. 11, respectively. The p-value from Welch’s t-test was 0.49, while that from the χ² test was 0.23.
As shown in Fig. 10, both groups exhibited a high correct answer rate for Question 3 and Question 6, whereas the opposite trend was observed for Question 2, Question 4, Question 7, and Question 8.
Further analysis of each question, however, showed a significant difference between groups for Question 1 and Question 9. None of the participants in the Control Group answered “Not sure” for Question 1, whereas 7.9% of participants in the Test Group did (Fig. 12). The Control Group had a correct answer rate of only 12.8% for Question 9, while the Test Group had a rate of 31.6% (Fig. 13). The analysis using the Cramér’s V statistic showed that the p-values for participants answering “Not sure” to Question 1 and “No” to Question 9 were both less than 0.05.
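To make the per-question comparison concrete, the following sketch computes the Pearson χ² statistic, its p-value, and Cramér’s V for a 2×2 table. The counts are an assumption: the group sizes (47 and 38) are not stated in the text and are inferred here from the reported rates for Question 9 (12.8% and 31.6%) and the total of 85 participants. No continuity correction is applied, which is also an assumption, and the function name `chi2_2x2` is hypothetical.

```python
import math


def chi2_2x2(table):
    """Pearson chi-square (no continuity correction), p-value, and Cramér's V for a 2x2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    rows, cols = (a + b, c + d), (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = rows[i] * cols[j] / n  # expected count under independence
            chi2 += (obs - exp) ** 2 / exp
    p = math.erfc(math.sqrt(chi2 / 2))  # exact upper tail for 1 degree of freedom
    v = math.sqrt(chi2 / n)  # Cramér's V; min(r - 1, c - 1) = 1 for a 2x2 table
    return chi2, p, v


# ASSUMED counts for Question 9: [correct, not correct] per group,
# inferred from the reported percentages, not taken from the raw data
q9 = [[6, 41],   # Control Group
      [12, 26]]  # Test Group
chi2, p, v = chi2_2x2(q9)
print(f"chi2 = {chi2:.2f}, p = {p:.3f}, V = {v:.2f}")
```

Under these assumed counts, the uncorrected p-value falls below the 5% level. Cramér’s V is an effect size derived from χ² rather than a test in itself, so the p-value comes from the χ² distribution.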
(2) Free short answer question
The KH Coder text mining of the most frequently occurring words, particularly nouns and verbs, revealed an interesting difference. While the Control Group tended to use similar vocabulary, the Test Group answered with a larger variety of keywords related to eye-gaze points. Bar graphs of the most frequently occurring words are shown in Fig. 14 to Fig. 17.
Furthermore, Fig. 18 and Fig. 19 illustrate the co-occurrence network analysis of the answers to the final free-response question regarding the “knacks” of a professional technician. Here too, the Control Group mostly selected the same words, whereas the Test Group network is more widely spread, indicating richer word selection and usage.
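The idea behind term frequency and co-occurrence analysis can be sketched as follows. This is a simplified illustration, not KH Coder itself: the answers are hypothetical and already tokenized, the function name `cooccurrence` is an assumption, and edge strength is measured with the Jaccard coefficient, a common choice for co-occurrence networks that is assumed here.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(docs):
    """Return per-term document frequencies and Jaccard coefficients for term pairs."""
    term_sets = [set(d.lower().split()) for d in docs]
    # Number of answers in which each term appears
    doc_freq = Counter(t for s in term_sets for t in s)
    # Number of answers in which both terms of a pair appear
    pair_freq = Counter(p for s in term_sets for p in combinations(sorted(s), 2))
    jaccard = {
        (a, b): n / (doc_freq[a] + doc_freq[b] - n)
        for (a, b), n in pair_freq.items()
    }
    return doc_freq, jaccard


# Hypothetical free-text answers, NOT the study data
answers = ["watch pipe bucket", "watch bucket wall", "check pipe"]
doc_freq, jaccard = cooccurrence(answers)
for pair, score in sorted(jaccard.items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 2))
```

Term pairs with a high coefficient would be drawn as connected nodes in the network, so a group that uses a wider vocabulary produces a more widely spread graph.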
This section discusses the similarities and significant differences observed in the per-question analysis, exploring possible reasons and limitations and considering potential implications for future research and practice.
The similarly high correct answer rates (Fig. 10) observed for Question 3 and Question 6 are believed to be attributable to the relatively low difficulty of these questions. The term “only” in both questions may also explain this unexpectedly similar behavior of participants in both groups.
For instance, in Question 3, participants could infer that the operator also needs to avoid touching the retaining wall, even without seeing the focal points. The term “only” led participants to think that it must be impossible for operators to look solely at the bucket and underground pipes during excavation. In Question 6, operators do, in fact, need to focus on the wall, as it marks the “limit” of the excavation area. As a result, most participants selected “No” as the answer, leading to a higher correct answer rate.
In contrast, the low correct answer rate for Question 4 and Question 7 (Fig. 10) can be explained by the nature of the questions, which invite answers based on common sense or general knowledge.
In Question 4, participants generally assumed that the role of an assistant on a construction site is to provide visual cues. However, according to interviews with skilled technicians, operators in reality focus on the excavation points while listening to the assistant’s oral guidance for areas that are difficult to see. Similarly, in Question 7, it seemed to be commonly understood that before loading the soil, operators need to check the position of the dump truck to ensure proper alignment and efficient operation. These two assumptions may have led participants to select “Yes” as the answer, lowering the correct answer rate.
The test result implies that common sense still plays a dominant role in how participants answered the questions. This suggests that overreliance on common sense may be one of the reasons for the technicians’ current slow-paced skill development. Therefore, technicians need to challenge traditional assumptions and move beyond what is commonly regarded as common sense in developing technical skills. Instead, incorporating the practical knowledge and “knacks” of skilled workers into training could be a more effective approach to skill acquisition.
For future research, it is important to reconsider the questions posed, ensuring they are neither too simplistic nor based on common knowledge. This will help to obtain more meaningful and insightful responses.
In addition, comparisons between groups in the distribution of responses to Questions 1 and 9 yielded an interesting result.
For Question 1, the difference between the two groups suggests that, by focusing on the operator’s gaze points, participants tended to answer “Not sure” because the caterpillar track is barely inside the field of vision (Fig. 12). In contrast, for Question 9 the difference seems to arise from the boom not being marked as one of the operator’s gaze points (Fig. 13).
Moreover, in the co-occurrence network analysis, which shows how word combinations are distributed, the Control Group used more general key terms, while the Test Group used words related to the operator’s gaze in the final free short-answer question. Without focal point guidance, the Control Group seems to have responded based on common sense, leading to more generic answers (Fig. 14 and Fig. 16).
Interestingly, the term ‘excavate’ did not appear at all in the Test Group’s responses. Since ‘excavate’ is not a term likely to be used in describing the ‘knacks’ of the excavation work itself, the video seems to have unintentionally directed participants’ focus to the items seen during the work, that is, to the operator’s gaze points. As a result, they were guided toward a more detailed analysis of the excavator operator’s knacks, answering the question without using the easily visualized yet superficial term ‘excavate’. This resulted in a wider variety of responses (Fig. 15 and Fig. 17).
The detection of specific objects or items indicated by the operator’s gaze points in the Test Group was in line with the findings of previous research 9), which highlighted the effectiveness of visualizing gaze behavior in conveying focal points. It can be inferred that superimposing gaze points onto real-world visual field footage holds potential as a method for communicating the cognitive processes of skilled operators during their tasks. However, to enhance this effect, further investigation into methods of information presentation and related factors is needed.
The test results indicate no substantial overall difference in the effectiveness of the training tool between the two groups. However, question-level analyses revealed significant differences, both qualitative and quantitative.
These findings highlight the need to refine the skill visualization methods to better engage users and capture their attention. Rather than merely displaying a video with gaze points in different colors, incorporating arrows that highlight specific gaze points, along with supplementary information such as detailed illustrations of correct bucket positions or tips from skilled workers, can better guide construction technicians during their training. This is particularly important in addressing labor shortages, as such a tool would allow skilled workers to provide little to no guidance, saving them time by reducing the need for direct teaching. In turn, young technicians train more effectively, helping to alleviate both the challenges of an aging workforce and the ongoing labor shortage.
Engagement can be encouraged by simulating a game-like experience in excavation training, where technicians are rewarded upon completing each stage and gradually progress to the next phase. With clear goals established, technicians are motivated to complete the training with enthusiasm while internalizing the skilled workers’ knacks. This approach not only boosts their focus but also creates a sense of achievement, making training more effective and enjoyable.
The current method of visualizing eye-gaze points certainly still requires improvement. Combined with interactive digital tools that stimulate engagement, it has the potential to accelerate technicians’ learning, creating an innovative and unconventional training experience. This method can help technicians develop their skills in a more engaging way, offering a way to address labor shortages and the aging workforce within the construction industry.