Abstract
The manufacturing industry faces a shortage of human resources caused by a declining birthrate, an aging population, and growing workforce mobility. Video is an effective way to reduce the cost of training and to communicate complex work procedures and skills that are difficult to verbalize. However, although video conveys such procedures intuitively, it is difficult to search for specific scenes or to skip unnecessary ones. Because manual editing of work videos is time-consuming and labor-intensive, there is a growing need to automate their analysis. The first step in this analysis is to segment a video into action segments or to detect action boundaries based on the similarity of image features. In this paper, we propose an action boundary detection method that uses hand detection for unedited work videos captured by a fixed camera, such as cooking and parts assembly videos. Because the method exploits the fact that hand movements, such as picking up or moving objects, often occur at action boundaries, it does not require a large amount of training data with action labels.
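To make the underlying intuition concrete, the following is a minimal sketch and not the method proposed in this paper. It assumes that some hand detector has already produced one hand center per frame (or `None` when no hand is visible) and marks a candidate boundary wherever hand motion starts. The names `hand_speeds` and `candidate_boundaries`, the threshold of 20 pixels per frame, and the minimum gap of 15 frames are all hypothetical choices made for illustration only.

```python
from math import hypot
from typing import List, Optional, Tuple

# Hypothetical input type: (x, y) hand center in pixels, or None if no hand was detected.
Point = Optional[Tuple[float, float]]


def hand_speeds(centers: List[Point]) -> List[float]:
    """Return per-frame hand displacement in pixels; 0.0 where it cannot be computed."""
    if not centers:
        return []
    speeds = [0.0]  # the first frame has no predecessor
    for prev, cur in zip(centers, centers[1:]):
        if prev is None or cur is None:
            speeds.append(0.0)  # missing detection: treat as no measurable motion
        else:
            speeds.append(hypot(cur[0] - prev[0], cur[1] - prev[1]))
    return speeds


def candidate_boundaries(
    centers: List[Point],
    speed_threshold: float = 20.0,  # assumed value, pixels per frame
    min_gap: int = 15,              # assumed value, frames between boundaries
) -> List[int]:
    """Return frame indices where hand motion rises above the threshold."""
    speeds = hand_speeds(centers)
    boundaries: List[int] = []
    for i in range(1, len(speeds)):
        is_onset = speeds[i] >= speed_threshold and speeds[i - 1] < speed_threshold
        far_enough = not boundaries or i - boundaries[-1] >= min_gap
        if is_onset and far_enough:
            boundaries.append(i)
    return boundaries


if __name__ == "__main__":
    # Synthetic track: rest, a quick move to the right, rest, a quick move down, rest.
    demo: List[Point] = (
        [(100.0, 100.0)] * 30
        + [(150.0, 100.0), (200.0, 100.0)]
        + [(200.0, 100.0)] * 30
        + [(200.0, 160.0)] * 6
    )
    print(candidate_boundaries(demo))  # [30, 62]
```

A rule of this kind needs no action labels, only hand positions, which reflects the motivation stated above. A practical system would additionally have to handle detector noise, two hands, and slow movements that never cross a fixed threshold.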