International Journal of Activity and Behavior Computing
Online ISSN: 2759-2871
Volume 2024, Issue 2
Displaying 1-15 of 15 articles from this issue
  • Ryuichiro Okuda, Qingxin Xia, Takuya Maekawa, Takahiro Hara, Sozo ...
    2024 Volume 2024 Issue 2 Pages 1-17
    Published: 2024
    Released on J-STAGE: June 13, 2024
    JOURNAL OPEN ACCESS
    With the proliferation of Information and Communication Technology (ICT), recording systems for care activities using smartphones and tablets are becoming widespread in the field of nursing care. As a result, it is becoming possible to predict future caregiving activities from the care record history stored in these systems. Predicting future caregiving activities enables various applications, such as support for preparing upcoming caregiving activities, detection of missing entries in nursing care records, and prior information for real-time, sensor-based caregiving activity recognition. However, while recording has become easier with the introduction of these systems, many nursing care entries are still missing. When data with many missing entries is used as training data, the performance of activity prediction methods deteriorates. In this paper, we propose a caregiving activity prediction method that is robust against missing entries. The proposed model has a module for correcting missing entries, and its intermediate output is used to estimate whether a given caregiving activity will occur within the next hour from the caregiving activity records of the past T hours. We evaluated the effectiveness of the proposed method using data obtained from actual nursing care facilities.
    Download PDF (702K)
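    As a rough illustration of the prediction task this abstract describes, the sketch below builds one training example from a care record history: per-activity counts over the past T hours as features, and a binary label for whether the target activity occurs within the next hour. The record format, activity names, and count-based features are assumptions for illustration, not the authors' actual model.

      # Minimal sketch of example construction for next-hour activity prediction.
      from datetime import datetime, timedelta

      ACTIVITIES = ["meal", "bathing", "excretion"]  # hypothetical activity set
      T_HOURS = 6  # look-back window T (assumed value)

      def make_example(records, now, target="excretion"):
          """records: list of (datetime, activity) pairs, assumed complete here.

          Returns (features, label): per-activity counts over the past T hours,
          and whether `target` occurs within the next hour."""
          past = [a for ts, a in records if now - timedelta(hours=T_HOURS) <= ts < now]
          feats = [past.count(a) for a in ACTIVITIES]
          label = any(a == target and now <= ts < now + timedelta(hours=1)
                      for ts, a in records)
          return feats, int(label)

      records = [
          (datetime(2024, 6, 1, 7, 30), "meal"),
          (datetime(2024, 6, 1, 9, 0), "excretion"),
          (datetime(2024, 6, 1, 12, 10), "meal"),
      ]
      print(make_example(records, datetime(2024, 6, 1, 12, 0)))  # ([1, 0, 1], 0)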
  • Takuya Maekawa, Naoya Yoshimura, Jaime Morales, Takahiro Hara
    2024 Volume 2024 Issue 2 Pages 1-21
    Published: 2024
    Released on J-STAGE: June 13, 2024
    JOURNAL OPEN ACCESS
    This study introduces OpenPack, a comprehensive dataset developed for the recognition of packaging work activities. The availability of sensor datasets for recognizing work activities in industrial settings has been constrained by the challenges of obtaining realistic data, which often requires close cooperation with industrial sites. This limitation has hindered research and development of industrial applications based on activity recognition. OpenPack comprises 53.8 hours of diverse sensor data, encompassing acceleration data, keypoints, depth images, and readings from IoT devices such as handheld barcode scanners, gathered from 16 participants with varying degrees of experience in packaging work. We apply various human activity recognition models to the dataset and, based on the results, suggest future directions for complex work activity recognition research in the activity recognition community. In addition, we organized an activity recognition competition, the OpenPack Challenge 2022, based on the OpenPack dataset. This paper also introduces lessons learned from organizing the competition. The OpenPack dataset is available at https://open-pack.github.io/.
    Download PDF (4766K)
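    The abstract does not fix a preprocessing pipeline, but HAR models of the kind it mentions are typically fed fixed-length windows of sensor data. The sketch below shows generic sliding-window segmentation with majority-vote labels over dummy acceleration data; it is not the OpenPack toolkit API, and the window length and stride are illustrative.

      # Generic sliding-window segmentation for IMU-based HAR.
      import numpy as np

      def sliding_windows(acc, labels, win=60, stride=30):
          """acc: (N, 3) accelerometer samples; labels: (N,) frame-level labels.
          Returns fixed-length windows with a majority-vote window label."""
          segments, seg_labels = [], []
          for start in range(0, len(acc) - win + 1, stride):
              seg = acc[start:start + win]
              lab = np.bincount(labels[start:start + win]).argmax()
              segments.append(seg)
              seg_labels.append(lab)
          return np.stack(segments), np.array(seg_labels)

      acc = np.random.randn(300, 3)          # dummy 3-axis acceleration
      labels = np.random.randint(0, 4, 300)  # dummy operation labels
      X, y = sliding_windows(acc, labels)
      print(X.shape, y.shape)                # (9, 60, 3) (9,)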
  • Elsen Ronando, Sozo Inoue
    2024 Volume 2024 Issue 2 Pages 1-22
    Published: 2024
    Released on J-STAGE: June 13, 2024
    JOURNAL OPEN ACCESS
    In this paper, we improve the classification performance of fatigue detection in physical activity using Large Language Models (LLMs). Fatigue is a critical indicator of a person’s health condition. Current studies on fatigue detection focus solely on sensor data to measure body condition; however, methods for transforming sensor data into more meaningful representations that improve fatigue detection performance remain underdeveloped. In this study, we apply LLMs to fatigue detection in physical activity, using them in the preprocessing steps to generate meaningful features that can improve classification performance. For evaluation, we study the prompt design of LLMs to investigate its effect on machine learning performance, and compare evaluation metrics between traditional machine learning and LLM-based machine learning. Using LLMs, our proposed model achieves better performance for fatigue detection in physical activity, with improvements ranging from 2% to 4.5%.
    Download PDF (1095K)
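    A minimal sketch of LLMs in the preprocessing step, as the abstract describes: summary statistics of a sensor window are embedded in a prompt, and the model's categorical answer is appended as an extra feature. The prompt wording and the `call_llm` stub are assumptions, not the paper's design.

      import numpy as np

      def call_llm(prompt: str) -> str:
          """Placeholder for any chat-completion API; returns a canned answer here."""
          return "moderate"

      def llm_fatigue_feature(window: np.ndarray) -> int:
          mean, std, peak = window.mean(), window.std(), np.abs(window).max()
          prompt = (
              "Accelerometer magnitude over a 10 s window of physical activity: "
              f"mean={mean:.2f}, std={std:.2f}, peak={peak:.2f}. "
              "Answer with one word - low, moderate, or high fatigue risk."
          )
          answer = call_llm(prompt).strip().lower()
          return {"low": 0, "moderate": 1, "high": 2}.get(answer, 1)

      window = np.abs(np.random.randn(500))  # dummy magnitude signal
      extra = llm_fatigue_feature(window)
      print(extra)  # appended to the classic feature vector before training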
  • Defry Hamdhana, Kazumasa Harada, Hitomi Oshita, Satomi Sakashita, ...
    2024 Volume 2024 Issue 2 Pages 1-17
    Published: 2024
    Released on J-STAGE: June 13, 2024
    JOURNAL OPEN ACCESS
    In this paper, we provide prompts to extract values or short texts for body temperature, appetite, daily activities, and mental state from automated transcripts of nurses’ conversations. Because the conversations are intended to be as relaxed as possible, summarizing these dialogues into specific information that matches the care records is challenging; for example, some portions of the transcript do not clearly convey the required information. In response to these challenges, we propose using Large Language Models (LLMs) to automatically extract care-record-related information from the transcripts. This paper investigates the effectiveness of LLMs in automatically identifying and extracting relevant care record information from communication sessions via automated transcription, with the ultimate goal of simplifying the documentation process and improving the quality of services provided to the elderly population. The prompts we generate for compact output types, such as body temperature and appetite, produce outputs that closely match the ground truth: of thirteen documents tested with these prompts, eight yielded both body temperature and appetite information consistent with the ground truth.
    Download PDF (1070K)
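    A sketch of the kind of extraction prompt the abstract describes for the "compact output types" (body temperature, appetite). The exact wording and JSON schema are assumptions; `call_llm` stands in for any LLM API.

      import json

      def call_llm(prompt: str) -> str:
          return '{"body_temperature": "36.5", "appetite": "good"}'  # canned reply

      PROMPT_TEMPLATE = """You are given an automated transcript of a nurse's
      conversation with an elderly resident. Extract the following fields for the
      care record. If a field is not mentioned, output null.

      Transcript:
      {transcript}

      Return JSON with keys: body_temperature (numeric string, Celsius),
      appetite (one of: good, fair, poor)."""

      transcript = "Good morning! Let's see... 36.5 today. And you finished breakfast, great."
      reply = call_llm(PROMPT_TEMPLATE.format(transcript=transcript))
      record = json.loads(reply)
      print(record["body_temperature"], record["appetite"])  # 36.5 good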
  • Hoang Anh Vy Ngo, Noriyo Colley, Shinji Ninomiya, Satoshi Kanai, ...
    2024 Volume 2024 Issue 2 Pages 1-24
    Published: 2024
    Released on J-STAGE: June 13, 2024
    JOURNAL OPEN ACCESS
    In this paper, we present an approach to assessing nurses’ skills through activity recognition in the context of Endotracheal Suctioning (ES), an important nursing activity. Our proposed skill assessment framework hinges on three aspects: activity order, suction time, and the smoothness exhibited during the suctioning process. Our order-score algorithm works correctly on ground-truth data and, in the activity recognition results, correctly identifies mistakes such as forgetting to remove PPE before auscultation, in agreement with a professional nurse's evaluation. The recognized suction time differs from the ground truth by only 1 to 2 seconds. The analysis of suctioning smoothness shows a trend consistent with the force data: nurses performed ES more smoothly, applying less pressure to the catheter than students. To recognize ES activities, we extract pose skeletons from multi-view (front and back) videos, using a dataset of nurses and nursing students performing ES, and enhance model performance with skip frames, post-processing, and training on micro labels followed by evaluation on macro labels. Using multi-view data and training with micro labels, our proposed method improves accuracy by 4% and the F1-score by 9%. By combining multi-view pose extraction, advanced post-processing, and a nuanced skill assessment framework, our work advances activity recognition in endotracheal suctioning, fostering a deeper understanding of nurses' proficiency in this critical medical procedure.
    Download PDF (4677K)
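    One plausible way to compute an order score like the one the abstract mentions is to compare the recognized activity sequence against the reference ES procedure order, e.g., via the longest common subsequence. The step names and LCS-based scoring below are assumptions standing in for the paper's order-score algorithm.

      def lcs_len(a, b):
          """Classic dynamic-programming longest common subsequence length."""
          dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
          for i, x in enumerate(a):
              for j, y in enumerate(b):
                  dp[i + 1][j + 1] = dp[i][j] + 1 if x == y else max(dp[i][j + 1], dp[i + 1][j])
          return dp[-1][-1]

      REFERENCE = ["hand_hygiene", "suction", "remove_ppe", "auscultation"]
      recognized = ["hand_hygiene", "suction", "auscultation", "remove_ppe"]  # PPE removed late

      order_score = lcs_len(recognized, REFERENCE) / len(REFERENCE)
      print(f"order score: {order_score:.2f}")  # 0.75 - flags the out-of-order step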
  • Milyun Ni’ma Shoumi, Sozo Inoue
    2024 Volume 2024 Issue 2 Pages 1-27
    Published: 2024
    Released on J-STAGE: June 13, 2024
    JOURNAL OPEN ACCESS
    In this paper, we comprehensively review the ways in which Large Language Models (LLMs) advance activity recognition systems, discuss the challenges of implementing LLMs, and compare results between LLM-based methods and traditional approaches. We cover the basic concepts of LLMs and then systematically analyze research that has used LLMs for activity recognition, along with related areas such as object detection and speech recognition, since activity recognition can incorporate these techniques to improve accuracy and provide a more comprehensive contextual understanding of human activities. We analyze insights from 26 related research works using the Systematic Literature Review (SLR) approach. By synthesizing recent research, this review shows that LLMs can be applied at various stages of the activity recognition process: 10% of the surveyed papers apply LLMs at the data collection stage, 10% at the data preprocessing stage, 50% at the feature extraction stage, and 30% at the model training stage. The data collection and preprocessing stages therefore leave room for more in-depth exploration of opportunities to integrate LLMs. Moreover, LLMs offer several advantages over traditional methods, including efficient feature extraction, superior performance compared to widely used techniques, robustness across a wide range of datasets, and enhancements that lead to state-of-the-art performance.
    Download PDF (962K)
  • Tensei Muragi, Airi Tsuji, Kaori Fujinami
    2024 Volume 2024 Issue 2 Pages 1-25
    Published: 2024
    Released on J-STAGE: June 13, 2024
    JOURNAL OPEN ACCESS
    In assembly work, identifying the individual skill levels of workers and the situations in which they need assistance is critical if a system is to provide help without interfering with other workers. We hypothesized that determining the type and level of confusion that occurs during assembly work can facilitate the identification of worker states. In this paper, we present a method for confusion detection and classification. Positional information from the hand and gaze was used to detect the presence of confusion, its type (i.e., searching for assembly parts or mounting parts in a specific position), and its strength (i.e., weak or strong confusion). We used two types of classification features, gaze-transition and histogram-based, and classified confusion hierarchically to improve classification performance. The results show F1-scores of 0.529 and 0.511 in the five-class classification with and without hierarchical classifier formation, respectively. We also integrated the classification pipeline into a working-system prototype using reject-option processing, and conducted a user study to validate classification performance.
    Download PDF (4562K)
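    A sketch of the hierarchical formation described in the abstract: first detect whether confusion is present, then classify its type and strength. The random features here are placeholders for the gaze-transition/histogram features, and the classifiers and their stacking are illustrative assumptions.

      import numpy as np
      from sklearn.ensemble import RandomForestClassifier

      rng = np.random.default_rng(0)
      X = rng.normal(size=(200, 10))             # placeholder gaze/hand features
      present = rng.integers(0, 2, 200)          # 0: no confusion, 1: confused
      ctype = rng.integers(0, 2, 200)            # 0: searching, 1: mounting
      strength = rng.integers(0, 2, 200)         # 0: weak, 1: strong

      clf_present = RandomForestClassifier().fit(X, present)
      mask = present == 1                        # type/strength trained on confused samples only
      clf_type = RandomForestClassifier().fit(X[mask], ctype[mask])
      clf_strength = RandomForestClassifier().fit(X[mask], strength[mask])

      def classify(x):
          x = x.reshape(1, -1)
          if clf_present.predict(x)[0] == 0:
              return "no confusion"
          t = ["searching", "mounting"][clf_type.predict(x)[0]]
          s = ["weak", "strong"][clf_strength.predict(x)[0]]
          return f"{s} confusion while {t}"

      print(classify(rng.normal(size=10)))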
  • Iqbal Hassan, Nazmun Nahid, Md Atiqur Rahman Ahad, Sozo Inoue
    2024 Volume 2024 Issue 2 Pages 1-23
    Published: 2024
    Released on J-STAGE: June 13, 2024
    JOURNAL OPEN ACCESS
    This paper presents a systematic review of motion analysis-based emotion estimation in the elderly. Addressing a critical concern, it highlights the challenge of effectively monitoring emotions in older adults and emphasizes the risk that emotional neglect leads to serious disorders. The study underscores the importance of emotional well-being in care facilities, where the willingness of elderly individuals to receive care is closely tied to their emotional state. Health practitioners often encounter difficulties when elderly individuals resist care due to emotional dissatisfaction, making it essential to monitor changes in emotional states and to keep comprehensive care records. Through an exhaustive examination of the existing literature, the paper suggests that motion-based emotion recognition shows promise in addressing this challenge. Using the PRISMA protocol, the study conducts a qualitative analysis of the impact of motion analysis on emotion estimation, outlines the methodologies currently employed in research, and reveals a significant correlation between body motion cues and emotional states in the elderly. Furthermore, it positions motion-based emotion estimation as a viable solution for addressing emotional well-being in older adults and offers guidelines for researchers interested in this area. To the best of our knowledge, this is the first review of its kind on motion-based emotion estimation for the elderly, providing insights into potential advancements for this demographic.
    Download PDF (323K)
  • Shunsuke Miyazawa, Guillaume Lopez
    2024 Volume 2024 Issue 2 Pages 1-15
    Published: 2024
    Released on J-STAGE: June 13, 2024
    JOURNAL OPEN ACCESS
    The quantification of sports movements enables performance and skill evaluation to help players improve. In the basketball shooting motion, joint motion is essential, especially at the elbow, shoulder, and wrist. In previous studies, most systems used cameras, whose use may be limited by financial and environmental conditions; systems using wearable devices, on the other hand, evaluate few indicators and provide insufficient information for technical assistance. This study uses a smartwatch to determine whether the user snaps the wrist when shooting free throws. Preliminary experiments showed that the short-term energy (STE) of the x-axis acceleration is more prominent when the user snaps the wrist than when they do not. We experimented to verify the accuracy of the wrist snap detection threshold: twenty-one players (6 experienced and 15 inexperienced) shot ten free throws each, and we evaluated detection accuracy by the rate of agreement with wrist snap judgments based on video images. Detection accuracy reached 78.5% with the optimal threshold setting and exceeded 75% within a specific threshold range. However, we observed a significant difference between the accuracy for experienced players (62.7%) and inexperienced players (84.7%). One reason for the lower accuracy among experienced players is that some of them shot without using their knees, which may have caused the STE value to rise above the threshold. A possible cause of misjudged shots by inexperienced players is that the magnitude of the STE value may have changed depending on whether the ball reached the ring. Since overall detection accuracy varies little around the optimal threshold, the threshold can be adapted to user characteristics. As future work, we need to find an index unaffected by knee use and to add the ball's arrival position as a judgment index to improve the accuracy of snap detection. We would also like to add feedback items related to wrist motion and posture at the time of shooting.
    Download PDF (2744K)
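    A minimal sketch of the short-term energy (STE) cue from the abstract: frame-wise energy of the x-axis acceleration, compared against a threshold to judge whether a wrist snap occurred. The frame length and threshold value are assumptions, not the paper's tuned parameters.

      import numpy as np

      def short_term_energy(x, frame=25):
          """Max frame-wise energy of signal x (e.g., 0.25 s frames at 100 Hz)."""
          n = len(x) // frame
          frames = x[:n * frame].reshape(n, frame)
          return (frames ** 2).sum(axis=1).max()

      THRESHOLD = 50.0  # illustrative; the paper tunes this on labeled shots

      acc_x = np.concatenate([np.random.randn(200) * 0.3,   # pre-shot noise
                              np.random.randn(50) * 3.0])   # snap burst
      snap_detected = short_term_energy(acc_x) > THRESHOLD
      print("wrist snap" if snap_detected else "no snap")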
  • Takahiro Ueno, Masayoshi Ohashi
    2024 Volume 2024 Issue 2 Pages 1-16
    Published: 2024
    Released on J-STAGE: June 13, 2024
    JOURNAL OPEN ACCESS
    Detecting and intervening early in situations of harassment is crucial. Recent research has suggested techniques for detecting harassment using sensor data. Recognizing the victim’s subjective stress in the face of various aggressor behaviors is essential for detecting harassment. This paper presents a method for detecting harassment based on heart rate variability (HRV) and the victim’s subjective stress as the aggressor’s behavior changes. The article focuses on harassment by verbal insults and proposes a comparative analysis of HRV in response to two types of behavior. HRV data were collected from participants watching videos simulating harassment and relaxation situations. Two types of harassment videos were used, in VR and 2D versions, each showing different verbal insult behaviors. Comparative analysis revealed that the subset of features effective for binary harassment/relaxation classification was the same for both types of harassment. The datasets derived from each type of harassment were analyzed with respect to the accuracy of affective state classification. The results show that increases in the subjects’ subjective stress and immersion levels significantly influenced these measures for both sets.
    Download PDF (1737K)
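    A sketch of common time-domain HRV features of the kind such studies feed into a harassment/relaxation classifier. The specific feature subset the paper found effective is not stated here; these are standard examples computed on dummy RR-interval data.

      import numpy as np

      def hrv_features(rr_ms):
          """rr_ms: array of RR intervals in milliseconds."""
          diffs = np.diff(rr_ms)
          return {
              "mean_rr": rr_ms.mean(),
              "sdnn": rr_ms.std(ddof=1),              # overall variability
              "rmssd": np.sqrt((diffs ** 2).mean()),  # beat-to-beat variability
              "pnn50": (np.abs(diffs) > 50).mean(),   # fraction of large changes
          }

      relaxed = np.random.normal(850, 60, 120)   # dummy RR series, high variability
      stressed = np.random.normal(700, 25, 120)  # dummy RR series, low variability
      print(hrv_features(relaxed))
      print(hrv_features(stressed))  # lower SDNN/RMSSD under simulated harassment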
  • Akihisa Tsukamoto, Kenji Mase, Yu Enokibori
    2024 Volume 2024 Issue 2 Pages 1-21
    Published: 2024
    Released on J-STAGE: June 13, 2024
    JOURNAL OPEN ACCESS
    Deep-learning-based Human Activity Recognition (HAR) using Inertial Measurement Units (IMUs) suffers from a lack of large datasets. This problem could be mitigated if the integrated utilization of small labeled datasets were established. However, this is not easy, because there are feature-space differences among datasets that depend on IMU installation locations, recording environments, characteristics of the IMUs used, etc. To solve this problem, we adjust the differences between datasets to enable their integrated utilization, using the MIG HAR Dataset, which consists of synchronized data from 396 IMUs deployed across the whole body. Our approach adjusts sensor-characteristic differences among datasets and among sensor positions with a transform trained to adjust the characteristic differences among MIG HAR's sensors, selected by the similarity between MIG HAR's sensors and each sensor in each dataset. An evaluation showed a macro-F1 score of 0.7915 when the MIG HAR, PAMAP2, SHO, USC-HAD, and MotionSense datasets were used as training data and the RealWorld dataset was used as test data. This score is 0.0321 points higher than the baseline macro-F1 of 0.7594 obtained by leave-one-subject-out cross-validation on RealWorld.
    Download PDF (7206K)
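    A simplified sketch of the adjustment idea in the abstract: for a sensor in an external dataset, pick the most similar of the (here, 396) MIG HAR sensors by a distribution distance, then apply a transform trained between them. A per-axis linear map and a crude mean/std distance stand in for the paper's learned transform and similarity measure.

      import numpy as np

      rng = np.random.default_rng(1)
      mig_sensors = [rng.normal(loc=i * 0.1, scale=1 + 0.05 * i, size=(1000, 3))
                     for i in range(5)]          # stand-in for 396 MIG HAR streams
      external = rng.normal(loc=0.25, scale=1.1, size=(800, 3))  # external dataset sensor

      def dist(a, b):  # crude similarity: compare per-axis means and stds
          return np.abs(a.mean(0) - b.mean(0)).sum() + np.abs(a.std(0) - b.std(0)).sum()

      best = min(mig_sensors, key=lambda s: dist(s, external))

      # Fit y = a*x + b per axis so the external stream matches the chosen sensor.
      a = best.std(0) / external.std(0)
      b = best.mean(0) - a * external.mean(0)
      adjusted = external * a + b
      print(adjusted.mean(0), adjusted.std(0))   # now close to the selected sensor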
  • Nazmun Nahid, Iqbal Hassan, Xinyi Min, Naoya Ryoke, Md Atiqur Ra ...
    2024 Volume 2024 Issue 2 Pages 1-26
    Published: 2024
    Released on J-STAGE: June 14, 2024
    JOURNAL OPEN ACCESS
    In this paper, we present a method for integrating a human behavior model into robot motion control to enable safer intimate-distance Human Robot Collaboration (HRC). This approach establishes safety parameters based on personality and experience, optimizes the system by observing human reactions, and integrates a behavior-pattern-based emergency shutdown. In our experiment, we validated the claim that incorporating a human behavior model into robot control increases system safety in intimate-distance conditions. Validation through a mixed-reality approach demonstrates the feasibility of the framework in a simulated environment while ensuring ethical considerations and safety. Notably, it outperforms traditional benchmarks and other forecasting-based approaches, achieving zero collisions in 100 trials and a forecasting error below 10 mm. Despite notable improvements, challenges persist, including residual time delays in safety compensations and potential slowdowns for introverted, inexperienced workers. While these limitations need further refinement, the proposed approach represents a substantial stride toward safer HRC, successfully preventing collisions in intimate-distance conditions.
    Download PDF (5527K)
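    A sketch of the forecast-based safety check implied by the abstract: stop the robot if the predicted human hand position comes within a safety margin of the end effector. The linear-extrapolation forecast and the margin value are illustrative assumptions, not the paper's behavior model.

      import numpy as np

      SAFETY_MARGIN_MM = 120.0  # assumed margin, tightened/loosened per worker profile

      def forecast_positions(history, horizon=10):
          """Linear extrapolation of (t, xyz) samples over `horizon` future steps."""
          velocity = history[-1] - history[-2]
          return np.array([history[-1] + (k + 1) * velocity for k in range(horizon)])

      def emergency_stop_needed(hand_history_mm, robot_pos_mm):
          future = forecast_positions(hand_history_mm)
          min_dist = np.linalg.norm(future - robot_pos_mm, axis=1).min()
          return min_dist < SAFETY_MARGIN_MM

      hand = np.array([[400.0, 0, 0], [380.0, 0, 0], [360.0, 0, 0]])  # approaching
      robot = np.array([150.0, 0, 0])
      print(emergency_stop_needed(hand, robot))  # True: forecast crosses the margin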
  • Keisuke Sato, Guillaume Lopez
    2024 Volume 2024 Issue 2 Pages 1-21
    Published: 2024
    Released on J-STAGE: June 16, 2024
    JOURNAL OPEN ACCESS
    The proposed system, CoreMoni-α, aims to support core training and promote health maintenance, especially during the COVID-19 pandemic when access to gyms and outdoor activities is limited. The system evaluates a user’s posture from an inertial measurement unit and provides both real-time and long-term feedback through a wearable sensor and a smartphone. It consists of two functions: one that supports training in real time, and one that allows users to review their training scores and compete with rivals. Real-time posture determination uses a three-stage method: calibration, judgment, and real-time feedback. To make appropriate judgments for each user, calibration accounts for the degree of curvature of each user's waist; a threshold judgment is then performed, and appropriate feedback is provided via the smartphone according to the result. The main focus of this research is long-term feedback, which has two functions. The first lets users view their training results as a history: by visualizing the judgment results in text and pie charts, users can check what changes their training has produced and objectively review their results over the long term. The second is a ranking function that allows users to compete with other users; the ranking changes depending on the ratio of good to bad posture judgments and the counting frequency, which can stimulate competitive spirit and be expected to maintain and improve motivation. Evaluation experiments were conducted with twenty subjects in their 20s to 50s, divided into a group with the system and a group without it, over about two months. We evaluated how long users continued using the system, whether their posture improved compared with two months earlier, and what differences arose with and without the system. The results confirmed that training with the system improved trunk stability and muscle strength; furthermore, the subjects' self-evaluations indicated that using the system increased their motivation for training.
    Download PDF (2141K)
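    A sketch of the three-stage real-time judgment the abstract outlines: calibrate a per-user baseline for waist curvature, then threshold the deviation of the live trunk angle from that baseline. The angle source and threshold are illustrative assumptions, not CoreMoni-α's actual parameters.

      import numpy as np

      THRESHOLD_DEG = 10.0  # assumed allowed deviation from the calibrated posture

      def calibrate(angles_deg):
          """Stage 1: average trunk angle while the user holds their neutral plank."""
          return float(np.mean(angles_deg))

      def judge(angle_deg, baseline_deg):
          """Stage 2: threshold judgment; stage 3 would relay this to the phone."""
          return "good" if abs(angle_deg - baseline_deg) <= THRESHOLD_DEG else "bad"

      baseline = calibrate([4.8, 5.2, 5.0, 5.1])  # user-specific waist curvature
      for sample in [6.0, 17.5, 4.0]:
          print(sample, judge(sample, baseline))   # good / bad / good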
  • Kaito Takayama, Shoko Kimura, Guillaume Lopez
    2024 Volume 2024 Issue 2 Pages 1-17
    Published: 2024
    Released on J-STAGE: June 16, 2024
    JOURNAL OPEN ACCESS
    Presentations are frequent in social life, and conveying information adequately is essential. However, many people feel nervous about presenting in front of a large group, leading to poor performance and a poor impression on the audience. Nervousness often stems from inexperience, fear of failure, and awareness of how the audience perceives the speaker, and easing it takes a great deal of time and practice. According to a survey of comedians who have been performing for over a year, about 90% felt nervous during rehearsals or just before performances; thus, having many opportunities and practicing a lot does not necessarily relieve tension. On the other hand, 80% answered that the audience's reaction alleviates nervousness. We developed the Laugh Detection and Tension Reduction System (LTRS) to support comedians in managing nervousness during their performances. The LTRS detects the audience's laughter from Heart Rate Variability (HRV) and transmits it to the performer as vibrotactile feedback. We evaluated the effect of LTRS on the performer's tension quantitatively and qualitatively, using HRV and the System Usability Scale (SUS), respectively. The quantitative evaluation suggested that the LTRS slightly reduced nervousness, though the difference was not significant. In addition, the usability evaluation revealed individual differences in how the system felt to use. We expect to achieve more effective tension reduction by improving the laughter detection method and reconsidering the feedback modality.
    Download PDF (622K)
  • Arie Rachmad Syulistyo, Yuichiro Tanaka, Hakaru Tamukoh
    2024 Volume 2024 Issue 2 Pages 1-22
    Published: 2024
    Released on J-STAGE: June 27, 2024
    JOURNAL OPEN ACCESS
    Identifying nursing activity during critical procedures, such as endotracheal suction (ES), is crucial for ensuring patient safety and the quality of treatment. The expansion of home care requires more certified professionals who can perform endotracheal procedures and provide monitoring during these activities. To fulfill these needs, this study develops an algorithm that recognizes ES activities, can potentially be implemented on edge devices, and performs real-time processing of nurses' pose keypoints, extracted from video using YOLOv7 and represented as x and y coordinates. Edge-device implementation is crucial in health care for ensuring security and privacy and for reducing costs, network congestion, and latency. In this study, we introduce a combination of a reservoir computing (RC)-based recognition model and large language models (LLMs) to identify nursing activities related to endotracheal suction. RC is suitable for edge-device implementation because of its low computational cost, and it processes the temporal features necessary for recognizing nursing activity in real time. To enhance the performance of RC, we introduce a reservoir computing model with multiple readouts, called RCMRO. The proposed model, which uses LLMs to analyze keypoint data and generate synthetic training data to improve RCMRO's performance, shows promising performance in distinguishing between various nursing activities. The tool provides healthcare professionals with a prospective method to monitor and evaluate nursing activity in real time, achieving an accuracy of 70.5% and an F1 score of 68.1% on a test dataset.
    Download PDF (1414K)
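    A compact echo-state-network sketch of the multiple-readout idea: one fixed random reservoir processes the keypoint time series, and several linear readouts are trained on its states by ridge regression. Sizes, scaling, and the two-readout split are assumptions, not RCMRO's actual configuration.

      import numpy as np

      rng = np.random.default_rng(0)
      N_IN, N_RES = 34, 200                      # 17 keypoints x (x, y); reservoir size
      W_in = rng.uniform(-0.5, 0.5, (N_RES, N_IN))
      W = rng.normal(size=(N_RES, N_RES))
      W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))  # spectral radius < 1

      def reservoir_states(inputs):
          x = np.zeros(N_RES)
          states = []
          for u in inputs:                       # simple tanh state update
              x = np.tanh(W_in @ u + W @ x)
              states.append(x.copy())
          return np.array(states)

      def train_readout(states, targets, ridge=1e-2):
          """Ridge-regression readout: targets are one-hot activity labels."""
          A = states.T @ states + ridge * np.eye(N_RES)
          return np.linalg.solve(A, states.T @ targets)

      seq = rng.normal(size=(300, N_IN))         # dummy keypoint sequence
      labels = np.eye(4)[rng.integers(0, 4, 300)]
      S = reservoir_states(seq)
      W_out_a = train_readout(S, labels)         # readout 1 (e.g., fine activities)
      W_out_b = train_readout(S, labels[:, :2])  # readout 2: a second, parallel readout
      print((S @ W_out_a).argmax(1)[:10])        # per-frame predicted activity ids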