-
Masato SUGATA, Nagisa MASUDA, Ikuko Eguchi YAIRI
Session ID: 4K1-IS-2d-01
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
Electroencephalography (EEG)-based emotion recognition offers a noninvasive, cost-effective approach with applications in psychology, healthcare, and education. Accurate recognition of fear emotions is crucial for diagnosing and treating conditions such as phobias and anxiety disorders. This study classifies fear emotions into four levels using the DEAP dataset, leveraging Graph Neural Networks (GNNs) integrated with Long Short-Term Memory (LSTM) networks. Two architectures, GIN-LSTM and ECLGCNN, were evaluated with raw EEG signals and Differential Entropy (DE) features. Performance was assessed using 10-fold and Leave-One-Subject-Out (LOSO) cross-validation, achieving a peak accuracy of 99.23% in 10-fold CV and 36.57% in LOSO CV, both surpassing prior studies. However, the LOSO results reveal limited generalizability to unseen subjects, highlighting the need for further research to enhance adaptability and robustness. This study demonstrates the potential of GNN-LSTM models for fear emotion classification and underscores the importance of addressing inter-subject variability to improve real-world applicability.
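The exact GIN-LSTM and ECLGCNN architectures are not detailed in this abstract; as a rough, hypothetical illustration of the general pattern (a graph layer over EEG channels per time window, followed by an LSTM over windows), the following Python sketch uses placeholder channel counts, feature sizes, and adjacency.

```python
import torch
import torch.nn as nn

class GraphLSTMClassifier(nn.Module):
    """Toy GNN+LSTM classifier: a GIN-style graph layer per time step,
    then an LSTM over the resulting graph embeddings (illustrative only)."""
    def __init__(self, n_channels=32, n_features=5, hidden=64, n_classes=4):
        super().__init__()
        self.eps = nn.Parameter(torch.zeros(1))
        self.node_mlp = nn.Sequential(
            nn.Linear(n_features, hidden), nn.ReLU(), nn.Linear(hidden, hidden))
        self.lstm = nn.LSTM(hidden, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x, adj):
        # x: (batch, time, channels, features), adj: (channels, channels)
        b, t, c, f = x.shape
        x = x.reshape(b * t, c, f)
        agg = torch.einsum("ij,bjf->bif", adj, x)      # neighbor aggregation
        h = self.node_mlp((1 + self.eps) * x + agg)     # GIN-style node update
        g = h.mean(dim=1).reshape(b, t, -1)             # graph-level readout
        out, _ = self.lstm(g)                           # temporal modeling
        return self.head(out[:, -1])                    # 4 fear levels

# Dummy forward pass: 8 samples, 60 windows, 32 channels, 5 DE-band features.
model = GraphLSTMClassifier()
x = torch.randn(8, 60, 32, 5)
adj = (torch.rand(32, 32) > 0.7).float()
print(model(x, adj).shape)  # torch.Size([8, 4])
```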
-
Mai YOSHIKAWA, Yuma ABE, Dimitar KOLEV, Hiroyuki TSUJI, Ikuko Eguchi Y ...
Session ID: 4K1-IS-2d-02
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
In pursuing an intelligent society, the deployment of Beyond 5G/6G is anticipated. A crucial aspect of realizing this vision lies in establishing a robust non-terrestrial network encompassing satellite-based communication systems. However, space-based communication faces challenges from atmospheric disturbances. For instance, Ka-band, a crucial frequency range for satellite communication, is attenuated by rain. Similarly, optical satellite communication links are disrupted by clouds. To ensure reliable and high-quality communication, it is imperative to accurately predict the impact of weather on signal propagation, enabling the selection of ground stations and modulation methods. This research focuses on developing a predictive model for radio wave attenuation during space-to-ground communication, leveraging data from meteorological satellites. The model's core is a deep learning architecture that integrates CNNs, renowned for their proficiency in image feature extraction. The rain attenuation prediction with this model achieved a high coefficient of determination. In addition, to improve the prediction accuracy, we analyzed the complex relationship between the radio wave reception strength and Himawari standard data, a comprehensive dataset acquired from the Himawari meteorological satellite.
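The paper's architecture and the exact Himawari input format are not given in this abstract; the sketch below only illustrates the general setup of regressing a rain-attenuation value from a multi-band satellite image patch. The band count, patch size, and layer sizes are assumptions.

```python
import torch
import torch.nn as nn

class AttenuationCNN(nn.Module):
    """Toy CNN regressor: multi-band satellite image patch -> attenuation [dB]."""
    def __init__(self, n_bands=16):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(n_bands, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1))
        self.regressor = nn.Linear(64, 1)

    def forward(self, x):
        h = self.features(x).flatten(1)
        return self.regressor(h).squeeze(-1)

model = AttenuationCNN()
patch = torch.randn(4, 16, 32, 32)        # 4 samples, 16 Himawari-like bands
pred = model(patch)                        # predicted attenuation per sample
loss = nn.MSELoss()(pred, torch.rand(4))   # target: measured attenuation
print(pred.shape, loss.item())
```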
-
Ayaka ISHIHARA, Yoji YAMASHITA, Masato SUGATA, Ikuko Eguchi YAIRI
Session ID: 4K1-IS-2d-03
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
Objective sleep deprivation detection can enhance workplace safety and productivity in professions that require long working hours. To address this, we proposed deep learning models for classifying sleep-deprived individuals using EEG data. In this study, we utilized resting-state EEG data collected from both sleep-deprived and well-rested participants and generated five datasets (EyesClosed, EyesOpen-Raw, EyesOpen-AR, EyesClosed+EyesOpen-Raw, and EyesClosed+EyesOpen-AR), then applied them to 1D CNN and 1D CNN-LSTM models. Both models achieved their peak performance with EyesOpen-AR, which slightly outperformed EyesOpen-Raw, while demonstrating comparable performance across all datasets. Applying feature extraction using differential entropy within delta, theta, alpha, and beta bands to the five datasets resulted in decreased performance. The results suggest that artifact-removal from EyesOpen-Raw is not essential for sleep deprivation detection using deep learning models. Additionally, they suggest that 1D CNN may be a more suitable choice for sleep deprivation detection, and non-feature-extracted data is more suitable than feature-extracted data.
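The 1D CNN used in the study is not specified here; as a minimal, hypothetical sketch of classifying raw multi-channel EEG segments into sleep-deprived vs. well-rested, the code below assumes a channel count, segment length, and layer configuration.

```python
import torch
import torch.nn as nn

class SleepDeprivation1DCNN(nn.Module):
    """Toy 1D CNN for binary classification of raw multi-channel EEG segments."""
    def __init__(self, n_channels=19, n_classes=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(n_channels, 32, kernel_size=7, stride=2), nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(), nn.Linear(64, n_classes))

    def forward(self, x):            # x: (batch, channels, samples)
        return self.net(x)

model = SleepDeprivation1DCNN()
segment = torch.randn(8, 19, 1000)   # e.g. 8 segments of ~4 s EEG at 250 Hz
print(model(segment).shape)          # torch.Size([8, 2])
```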
-
Hiroto SASAKI, Masato SUGATA, Ikuko Eguchi YAIRI
Session ID: 4K1-IS-2d-04
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
Graph neural networks, deep learning models designed for non-Euclidean data, have garnered attention in EEG-based emotion recognition. Recent studies explore EEG-based models and investigate multimodal models that incorporate peripheral physiological signals, such as electrooculography and electrocardiography, with ongoing research focused on feature fusion methods. The graphs used in GNNs for emotion recognition are generally constructed based on the spatial distance or the functional connectivity between channels; however, most models rely on only one type. This paper validates the effectiveness of a model that utilizes features from heterogeneous graphs and investigates various fusion methods inspired by multimodal approaches. As a result, the highest accuracy achieved was 93.87%, approximately 2% higher than that obtained using a single graph and comparable to existing methods. Furthermore, when synthesizing heterogeneous graphs, a technique that uses the embedding vector of the entire graph has proven to be more effective than one that considers individual channels.
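The fusion methods compared in the paper are not reproduced here; the sketch below only illustrates the graph-level variant described as most effective, i.e. encoding the spatial-distance graph and the functional-connectivity graph separately and fusing their whole-graph embeddings before classification. All sizes and adjacencies are placeholders.

```python
import torch
import torch.nn as nn

def graph_embed(x, adj, proj):
    """One round of neighbor averaging followed by a graph-level mean readout."""
    deg = adj.sum(dim=-1, keepdim=True).clamp(min=1)
    h = torch.relu(proj((adj @ x) / deg))
    return h.mean(dim=1)                      # (batch, hidden) graph embedding

channels, feat, hidden = 32, 5, 64
proj_dist = nn.Linear(feat, hidden)           # encoder for the spatial-distance graph
proj_conn = nn.Linear(feat, hidden)           # encoder for the functional-connectivity graph
fusion = nn.Linear(2 * hidden, 3)             # classify the fused graph-level embeddings

x = torch.randn(8, channels, feat)            # per-channel EEG features
adj_dist = (torch.rand(channels, channels) > 0.7).float()
adj_conn = (torch.rand(channels, channels) > 0.7).float()

z = torch.cat([graph_embed(x, adj_dist, proj_dist),
               graph_embed(x, adj_conn, proj_conn)], dim=-1)
print(fusion(z).shape)                        # torch.Size([8, 3])
```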
-
Hiroyuki HOCHIGAI, Yutaka YAKUWA, Natsuki OKAMURA, Tianchen ZHOU, Taka ...
Session ID: 4K2-IS-2e-01
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
Designing ICT systems that quickly and flexibly integrate networks and software to provide application services is currently a key method in ICT-based DX. Nonetheless, the large amount of time required for the automatic design of ICT systems has drawn increasing attention. Given the rapidly changing demand for ICT systems across industries, the need to build and monitor systems frequently, and the difficulty of securing engineers due to the declining birthrate and aging population, this design-time problem cannot be ignored, especially when AI is expected to reduce design time. To this end, we study how to reduce the design time of ICT systems with Weaver, a deep reinforcement learning approach. First, we adopt a graph neural network as the learning model so that deep reinforcement learning can be applied to this problem. Second, we design an algorithm that combines Double DQN, a representative reinforcement learning algorithm, with Noisy Networks. We then attempt to shorten the design time by replacing the normal distribution in the noisy layers with a truncated normal distribution centered on the optimal value. Finally, sufficient trials are conducted to verify the proposed method. The results show that the proposed method reduces the learning time required to complete learning of the design.
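Weaver's implementation is not reproduced here; the sketch below only illustrates one ingredient named in the abstract, a NoisyNet-style linear layer whose exploration noise is drawn from a truncated normal distribution instead of a standard normal. Layer sizes and noise parameters are assumptions.

```python
import torch
import torch.nn as nn

class TruncatedNoisyLinear(nn.Module):
    """NoisyNet-style linear layer whose exploration noise is sampled from a
    truncated normal distribution (illustrative sketch, not Weaver itself)."""
    def __init__(self, in_features, out_features, sigma0=0.5, bound=2.0):
        super().__init__()
        self.mu_w = nn.Parameter(torch.empty(out_features, in_features).uniform_(-0.1, 0.1))
        self.sigma_w = nn.Parameter(torch.full((out_features, in_features),
                                               sigma0 / in_features ** 0.5))
        self.mu_b = nn.Parameter(torch.zeros(out_features))
        self.sigma_b = nn.Parameter(torch.full((out_features,), sigma0 / in_features ** 0.5))
        self.bound = bound

    def forward(self, x):
        # Resample truncated-normal noise on every forward pass.
        eps_w = torch.nn.init.trunc_normal_(torch.empty_like(self.mu_w),
                                            a=-self.bound, b=self.bound)
        eps_b = torch.nn.init.trunc_normal_(torch.empty_like(self.mu_b),
                                            a=-self.bound, b=self.bound)
        weight = self.mu_w + self.sigma_w * eps_w
        bias = self.mu_b + self.sigma_b * eps_b
        return torch.nn.functional.linear(x, weight, bias)

layer = TruncatedNoisyLinear(128, 64)
print(layer(torch.randn(4, 128)).shape)   # torch.Size([4, 64])
```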
-
WEN ZHOU, Shuichiro MIWA, Yang LIU, Koji OKAMOTO
Session ID: 4K2-IS-2e-02
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
A generative AI architecture called bubbly flow generative adversarial networks (BF-GAN) is developed, designed to generate realistic and high-quality bubbly flow images through physically conditioned inputs, jg and jf. Initially, 52 sets of bubbly flow experiments under varying conditions are conducted to collect 140,000 bubbly flow images with physical labels of jg and jf for training data. A multi-scale loss function is then developed, incorporating mismatch loss and pixel loss to further enhance the generative performance of BF-GAN. Regarding the evaluative metrics of generative AI, BF-GAN has surpassed conventional GANs. Physically, key parameters of bubbly flow generated by BF-GAN are extracted and compared with measurement values and empirical correlations, validating BF-GAN's generative performance. The comparative analysis demonstrates that BF-GAN can generate realistic and high-quality bubbly flow images with any given jg and jf within the research scope. BF-GAN offers a generative AI solution for two-phase flow research, substantially lowering the time and cost required to obtain high-quality data. In addition, it can function as a benchmark dataset generator for bubbly flow detection and segmentation algorithms, enhancing overall productivity in this research domain. The BF-GAN model is available online (https://github.com/zhouzhouwen/BF-GAN).
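BF-GAN's actual architecture and multi-scale loss are described in the paper and repository, not here; the following is only a generic sketch of a generator conditioned on (jg, jf), with assumed latent size, image size, and layer widths.

```python
import torch
import torch.nn as nn

class BubblyFlowGenerator(nn.Module):
    """Toy conditional generator: latent vector + (jg, jf) -> grayscale image."""
    def __init__(self, z_dim=100, cond_dim=2):
        super().__init__()
        self.fc = nn.Linear(z_dim + cond_dim, 128 * 8 * 8)
        self.deconv = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1), nn.Tanh())

    def forward(self, z, cond):
        h = self.fc(torch.cat([z, cond], dim=1)).view(-1, 128, 8, 8)
        return self.deconv(h)                     # (batch, 1, 64, 64)

gen = BubblyFlowGenerator()
z = torch.randn(4, 100)
jg_jf = torch.tensor([[0.1, 1.0]] * 4)            # superficial gas/liquid velocities
print(gen(z, jg_jf).shape)                        # torch.Size([4, 1, 64, 64])
```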
-
Norio NISHIOKA, Kenji TANAKA
Session ID: 4K2-IS-2e-03
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
With the growth of the anime movie market, predicting box office revenue has become increasingly important. Traditional movie revenue prediction research has mainly focused on factors such as news articles and reviews. However, there is a lack of research considering factors specific to anime movies. This study proposes an anime movie box office revenue prediction model that utilizes sales data related to the original manga, along with production information such as the director and original author, as features. Specifically, we construct models using linear regression, multiple regression, and random forest, and compare their prediction accuracy. Using actual anime movie box office revenue data in Japan from 2012 to 2023, our evaluation suggests that the proposed model, particularly the simple linear regression model that uses the box office revenue by director, is effective in predicting the box office revenue of anime movies based on comics.
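As a toy illustration of the best-performing setup described above (a single-feature linear regression on a director-related revenue figure), the sketch below uses made-up numbers; the actual features and data are those of the paper.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical training data: each row is one past film,
# feature = the director's mean past box office (100 million yen),
# target = the film's box office revenue (100 million yen).
director_avg_revenue = np.array([[12.0], [35.0], [8.5], [60.0], [22.0]])
box_office = np.array([15.0, 40.0, 7.0, 72.0, 25.0])

model = LinearRegression().fit(director_avg_revenue, box_office)
new_film = np.array([[30.0]])         # upcoming film by a director averaging 3.0B yen
print(model.predict(new_film))        # predicted box office revenue
```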
-
Bersilin Charles ROBERT, Shotaro NISHIMURA, Ko UCHIDA, Hiroshi HONDA
Session ID: 4K2-IS-2e-04
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
This study explores a real-time vehicle motion analysis approach by fine-tuning open-source large language models (LLMs) using numerical vehicle data. Real-time motion analysis can help provide driving feedback, develop training modules for novice drivers, and assess driving performance using signals like speed, acceleration, and steering angle. A wide range of LLMs exists, including large-scale cloud-based models (e.g., GPT-4o, Gemini-2.0) and open-source solutions (e.g., Llama-3.1). Cloud-based models provide high-quality driving feedback but are expensive, slow, and not always available for real-time use. In contrast, open-source models are more accessible and can be deployed locally, but they struggle with understanding complex numerical data. To tackle this, we fine-tuned the LLaMA 3.1 model using data gathered from the Assetto Corsa racing simulator, which captures both typical and extreme driving conditions. Our model achieved 84.07% accuracy in classifying different driving behaviors, such as smooth braking and sudden acceleration, showing that fine-tuned LLMs can effectively interpret vehicle data.
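The exact fine-tuning data format used with LLaMA 3.1 is not stated in this abstract; the snippet below shows one plausible way to serialize a telemetry sample and its behavior label into a chat-style JSONL record for supervised fine-tuning. Field names and units are assumptions.

```python
import json

# Hypothetical telemetry sample from a driving simulator log (units assumed).
sample = {"speed_kmh": 82.4, "accel_ms2": -3.1, "steering_deg": 1.5, "brake": 0.6}

# One supervised fine-tuning record: the numeric signals are serialized into
# the prompt, and the behavior label is the target response.
record = {
    "messages": [
        {"role": "system", "content": "Classify the driving behavior from the telemetry."},
        {"role": "user", "content": json.dumps(sample)},
        {"role": "assistant", "content": "sudden braking"},
    ]
}
print(json.dumps(record, ensure_ascii=False))
```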
-
Haoyu LU, Takafumi KATO, Ken-ichi FUKUI
Session ID: 4K2-IS-2e-05
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
Sleep is vital for physical recovery, brain function, and emotional health. Polysomnography (PSG) is the gold standard for assessing sleep quality, but it is intrusive and impractical for widespread application. Sound data is a non-intrusive alternative, though its complexity makes extracting meaningful information difficult. This study enhances sound-based sleep quality assessment using a Knowledge Distillation (KD) framework. The teacher model integrates PSG features, physical factors, sleep stage data, sound data, and questionnaire factors, using a Gated Variable Selection Neural Network (GVSN) to identify key information from multimodal inputs. The student model uses physical factors and sound features extracted from one night’s sleep events and learns from the teacher via a SoftMax-based KD process. Results show the student model's accuracy improves significantly, demonstrating the potential of KD to improve sound-based sleep quality assessment.
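The abstract mentions a SoftMax-based KD process; the snippet below shows the standard temperature-softened distillation loss as one plausible concrete form, with assumed temperature, mixing weight, and class count (the paper's exact formulation may differ).

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    """Softmax-based knowledge distillation loss: KL divergence between
    temperature-softened distributions plus hard-label cross-entropy."""
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction="batchmean") * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

student = torch.randn(8, 3, requires_grad=True)   # e.g. 3 sleep-quality classes
teacher = torch.randn(8, 3)
labels = torch.randint(0, 3, (8,))
print(distillation_loss(student, teacher, labels).item())
```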
-
Ziwei XU, Ryutaro ICHISE
Session ID: 4K3-IS-2f-01
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
Global events can significantly impact various sectors, but detailing the impacts of global events on a specific business remains complex. If these impacts could be represented explicitly, it would enhance event prediction across various domains and help to avoid risk. This paper addresses this challenge within the finance sector by proposing a method that uses causal graphs to link global events with company-specific business events through geographical and taxonomic mappings. We outline a framework encompassing event acquisition, geographic mapping, and event taxonomic mapping, and illustrate its application with an example, demonstrating how geographical and ontological factors reveal hidden influences on business events.
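The paper's framework is only outlined above; as a purely hypothetical illustration of how such an event-to-business causal chain could be represented and traversed, the snippet below builds a small directed graph with networkx. The node names and relations are invented.

```python
import networkx as nx

# Hypothetical example of the event -> geography -> taxonomy -> company chain.
g = nx.DiGraph()
g.add_edge("Port strike in Region X", "Region X logistics disruption",
           relation="geographic mapping")
g.add_edge("Region X logistics disruption", "Electronics supply chain delay",
           relation="taxonomic mapping")
g.add_edge("Electronics supply chain delay", "Company A: shipment delay event",
           relation="business impact")

# Trace every explicit causal path from the global event to the company event.
for path in nx.all_simple_paths(g, "Port strike in Region X",
                                "Company A: shipment delay event"):
    print(" -> ".join(path))
```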
-
Francis SANCO, Clifford BRONI-BEDIAKO, Massayasu ATSUMI
Session ID: 4K3-IS-2f-02
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
Few-shot semantic segmentation enables pre-trained networks to generalize to new data with minimal labelled samples per class, addressing challenges of data scarcity and annotation cost. While few-shot learning methods have shown success, a more practical challenge lies in segmenting both base classes (pre-trained classes) and novel classes (new classes with few examples) in a single task. Generalized Few-Shot Semantic Segmentation (GFSS) was therefore introduced, evaluating models on their ability to handle familiar and unseen classes. Existing approaches use VGG and ResNet backbones but struggle with handling multi-scale features, which is crucial for segmenting objects of varying size. Additionally, Siamese learning has proven effective for few-shot tasks but has not been widely explored in generalized few-shot learning. This paper proposes a novel solution by integrating the Pyramid Vision Transformer (PVT), which introduces multi-scale features into transformers, with a Siamese Transformer Module (STM) for enhanced adaptation of support features to query features. Our approach aims to improve the effectiveness and robustness of GFSS, addressing scale variation challenges and the need for better adaptation to novel classes. Our work aims to show the capabilities of PVT for dense prediction and to extend Siamese networks to GFSS.
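The STM itself is not specified in this abstract; the sketch below shows only the generic prototype-matching idea behind Siamese-style support-to-query adaptation (cosine similarity between query features and per-class support prototypes). Feature dimensions and class counts are assumptions.

```python
import torch
import torch.nn.functional as F

def adapt_query_to_support(query_feat, support_protos, tau=0.1):
    """Toy Siamese-style matching: cosine similarity between each query pixel
    feature and per-class support prototypes, softmax over classes."""
    # query_feat: (B, C, H, W); support_protos: (num_classes, C)
    q = F.normalize(query_feat, dim=1)
    p = F.normalize(support_protos, dim=1)
    sim = torch.einsum("bchw,kc->bkhw", q, p)       # class-similarity maps
    return (sim / tau).softmax(dim=1)               # soft segmentation scores

query = torch.randn(2, 256, 32, 32)                 # backbone features (assumed size)
protos = torch.randn(6, 256)                        # 5 base classes + 1 novel class
print(adapt_query_to_support(query, protos).shape)  # torch.Size([2, 6, 32, 32])
```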
-
Yuefeng XU, Rui ZHONG, Junqi ZHANG, Chao ZHANG, Jun YU
Session ID: 4K3-IS-2f-03
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
SFE, a prominent optimization algorithm for solving high-dimensional feature selection problems, exhibits strong exploitation capability using a single search agent. However, its ability for global exploration remains relatively weak. This paper introduces a flip-flop mutation mechanism and adaptive acceptance selection into the SFE algorithm and proposes a targeted two-stage improvement to enhance its performance in high-dimensional spaces. Specifically, the following enhancements are made to the original SFE algorithm: (1) the search process is divided into two distinct stages, each with a different focus, (2) a random flip-flop mutation mechanism is incorporated, and (3) an adaptive acceptance selection is introduced to mitigate trapping in local optima. We evaluate the proposed algorithm on 21 high-dimensional datasets and compare its performance with the original SFE algorithm and six state-of-the-art binary evolutionary algorithms. Experimental and statistical results demonstrate that these improvements significantly enhance the global exploration capability of SFE, making it more robust and effective for addressing high-dimensional feature selection challenges.
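The actual SFE operators differ from the toy code below; this is only a minimal sketch of the two generic ingredients named above, a bit-flip mutation on a binary feature mask and an adaptive (annealing-style) acceptance rule, with an invented fitness function.

```python
import numpy as np

rng = np.random.default_rng(0)

def flip_mutation(mask, rate=0.02):
    """Randomly flip a small fraction of bits in a binary feature mask."""
    flips = rng.random(mask.size) < rate
    child = mask.copy()
    child[flips] ^= 1
    return child

def accept(new_fit, cur_fit, temperature):
    """Adaptive acceptance: always keep improvements, occasionally accept
    worse solutions to escape local optima (simulated-annealing style)."""
    if new_fit >= cur_fit:
        return True
    return rng.random() < np.exp((new_fit - cur_fit) / max(temperature, 1e-8))

# Toy run on a 1000-dimensional feature mask with a dummy fitness function.
def fitness(m):
    return -abs(m.sum() - 50) / 1000.0    # prefer ~50 selected features

mask = rng.integers(0, 2, 1000)
fit = fitness(mask)
for step in range(200):
    cand = flip_mutation(mask)
    if accept(fitness(cand), fit, temperature=0.01 * (1 - step / 200)):
        mask, fit = cand, fitness(cand)
print(int(mask.sum()), round(fit, 4))
```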
-
Masaya TANIGUCHI, Naoki NEGISHI, Yusaku NISHIMIYA, Keisuke SAKAGUCHI, ...
Session ID: 4K3-IS-2f-04
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
This study explores the impact of the presentation order of positive and negative data on grammar acquisition in language models. We specifically focus on a text search problem, with the target grammar represented by a regular language. To conduct the study, we prepare two types of data: positive data, where sentences conforming to the target grammar are embedded within the text, and negative data, where such sentences are absent. Our findings demonstrate that both the sampling strategy for positive and negative data and the order in which these datasets are presented influence the language model's ability to acquire grammatical structures.
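As a minimal, hypothetical illustration of the data setup described above, the snippet below samples positive texts (which embed a sentence of a toy regular language) and negative texts (which contain none); the target language (ab)+c and the alphabet are invented for illustration.

```python
import random
import re

random.seed(0)
target = re.compile(r"(ab)+c")            # hypothetical target regular language
alphabet = "abc"

def random_text(length=20):
    return "".join(random.choice(alphabet) for _ in range(length))

def make_example(positive):
    """Positive: a string of the target language is embedded in the text.
    Negative: the sampled text contains no match of the target grammar."""
    while True:
        text = random_text()
        if positive:
            i = random.randrange(len(text))
            text = text[:i] + "ababc" + text[i:]   # embed a member string
        if bool(target.search(text)) == positive:
            return text

positives = [make_example(True) for _ in range(3)]
negatives = [make_example(False) for _ in range(3)]
print(positives, negatives)
```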
-
HIROAKI SHIMOMA, Takeshi MORITA
Session ID: 4L1-OS-36-01
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
-
Masahiro OTOMO, Akio KOBAYASHI, Junichi ISHIHARA, Tetsuo KATSURAGI, Ju ...
Session ID: 4L1-OS-36-02
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
-
Derivation of Initial Posture Candidates Based on Machine Learning Adapted to the Environment
Rei YAMAMOTO, Natsuki MIYATA, Yusuke MAEDA
Session ID: 4L1-OS-36-03
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
This study aims to develop a system that quantitatively evaluates and visualizes indoor areas that are both accessible to children and prone to accidents by analyzing the reach postures of a digital human model. The system supports caregivers in creating a safer environment. Since reach postures vary dynamically depending on location—for example, a child may support their body with one hand—these postures exhibit discontinuous mechanical modes. To optimize the final posture using an optimization method, it is crucial to provide an appropriate initial posture. Therefore, this study introduces a machine learning-based approach for estimating initial postures and quantitatively evaluates the preventive effects of existing accident prevention products, thereby verifying the effectiveness of the proposed system.
-
Mikiko OONO, Hanae YOSHINARI, Ayano NOMURA, Yoshifumi NISHIDA, Hisashi ...
Session ID: 4L1-OS-36-04
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
In recent years, the development of sensing systems for disease prevention and health management has been expected in the field of health care. Sensing systems can gather biological information such as heart rate, blood pressure, and respiration, as well as daily activities, and analyze the state of physical and mental health to evaluate one's health status. On the other hand, from the viewpoint of consumers, they need to make a comprehensive decision about what they want to know, the type of data to be used for health evaluation, and the risks associated with data collection. In this study, based on the concept of “Willingness to Expose (WTE)” proposed by the authors, we investigated the relationship between WTE and the type of service for health monitoring. The results showed that WTE was higher for those who were concerned about their fall risks, indicating that one's WTE increases when reasonable benefits from sensor technologies are perceived.
-
Yuya KAWABE, Naoki NOZAKI, Syunsuke SASAKI, Mikiko ONO, Kozi KITAMURA, ...
Session ID: 4L1-OS-36-05
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
This study proposes a novel risk assessment method that considers daily objects and behaviors in living spaces. The method enables us to assess the risk of injury occurrence in an environment by considering the arrangement of furniture, daily behavior, and a large-scale database of domestic accidents. To verify its feasibility, we developed a prototype system. This digital twin system consists of an accident database, 3D scanning from smartphones or devices, posture data measurement via RGB cameras, and real-time and post-event risk visualization. Validation experiments in a simulated environment confirmed that the system can assess changes in average injury severity due to different movement paths and object arrangements. Additionally, expert interviews were conducted using the prototype to identify potential applications and related challenges.
-
KEIGO OKADA
Session ID: 4L2-GS-10-01
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
X-ray images are sent to the image server via the imaging system, where image resolution and patient misidentification are checked. Chest X-ray images are usually taken in a standing position in the back-to-front view, but for patients who cannot stand, they are taken in the front-to-back view and the images are flipped. However, if an image is flipped incorrectly before being sent to the imaging system and the examiner is unaware of the inversion, a left-right inverted image may be delivered, causing an incident. We focused on the fact that the left and right lung fields have different shapes, and examined the feasibility of segmenting their contours and constructing a system to prevent left-right inversion. Using the miniJSRT_database, a database of labeled chest X-ray images published by the Japanese Society of Radiological Technology, we constructed a U-Net-based segmentation model to extract lung field regions from chest X-ray images. We then extracted lung field regions from 50 chest X-ray cases collected separately at Showa University Northern Yokohama Hospital using this model. The IoU of the lung field regions between pseudo-inverted images and the original images was a low 0.64, whereas the IoU between normal images was a high 0.94. This indicates that lung field segmentation can be used to accurately detect erroneous inversions in the images.
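As a minimal sketch of the inversion check implied above (not the authors' actual pipeline), the snippet below compares a segmented lung-field mask against an expected mask with IoU and flags a suspected left-right inversion when the overlap is low; the masks, image size, and the 0.8 threshold are invented.

```python
import numpy as np

def iou(mask_a, mask_b):
    """Intersection over Union of two binary lung-field masks."""
    inter = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    return inter / union if union else 0.0

def is_suspected_inversion(reference_mask, incoming_mask, threshold=0.8):
    """Flag the image if its lung-field mask disagrees with the expected
    (non-inverted) mask more than the threshold allows."""
    return iou(reference_mask, incoming_mask) < threshold

# Toy masks: flipping left-right lowers the IoU and triggers the flag.
ref = np.zeros((256, 256), dtype=bool)
ref[60:200, 30:110] = True               # "left" lung field, asymmetric by design
ref[70:190, 150:220] = True              # "right" lung field
flipped = ref[:, ::-1]
print(iou(ref, ref), iou(ref, flipped))
print(is_suspected_inversion(ref, flipped))   # True -> warn about possible inversion
```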
-
Soh NISHIMOTO
Session ID: 4L2-GS-10-02
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
In machine learning, it is desirable to work with a database organized under consistent conditions. Currently, we are carrying out a project to collect craniofacial CT images used to diagnose facial bone fractures. The data from each facility are obtained with different CT equipment, and exposure and imaging conditions are not standardized; the slice width also varies. In our previous experience with estimating feature point coordinates in public databases of craniofacial images, accuracy improved when the input region was narrowed down. In this study, the aforementioned data from real clinical scenes were processed to meet certain conditions and specific regions were cropped, using deep learning methods in combination. As a result, the actual clinical data could be formatted consistently and the target region narrowed down almost automatically.
-
Takaaki NIHO, Akishige YUGUTI, Yosio MATSUMOTO, Kiyoko OTANI, Asao OGA ...
Session ID: 4L2-GS-10-03
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
-
Yuma YOKOTA, Kei SUZUKI, Kenichi INOUE, Midori SUGAYA
Session ID: 4L2-GS-10-04
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
-
Kenichi INOUE, Kei SUZUKI, Midori SUGAYA
Session ID: 4L2-GS-10-05
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
-
Takeshi TESHIMA, Kenta SHINOTSUKA, Yuchi MATSUOKA
Session ID: 4L3-OS-38-01
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
-
Ryuichi OGAWA, Shigeyoshi SHIMA
Session ID: 4L3-OS-38-02
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
-
Kota MOCHIDA, Teppei NAKANO, Mari WAKABAYASHI, Tomomi SATO, Tetsuji OG ...
Session ID: 4L3-OS-38-03
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
-
Takeaki SAKABE, Yuko SAKURAI, Satoshi OYAMA
Session ID: 4L3-OS-38-04
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
In this study, we evaluate user perceptions of datasets used in machine learning competitions. Datasets for such competitions are often selected in an unsystematic manner, making it challenging to choose datasets that align with participants' skill levels. To address this issue, we conducted a user study as part of the evaluation of a framework for automated dataset generation. In this study, participants trained and tested models using both existing datasets and those generated by the framework, and we collected test results based on algorithm selection and performance variations. The impact of datasets on participants' performance was examined using objective metrics. The results of our evaluation indicate that the generated datasets exhibit properties more suitable for educational competitions compared to existing datasets.
-
Jiyi LI
Session ID: 4L3-OS-38-05
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
Whether Large Language Models (LLMs) can surpass crowdsourcing in data annotation tasks has gained interest recently. Some works verified this issue with the average performance of individual crowd workers and LLMs on some specific tasks. However, the eventually collected annotations are the aggregated answers, rather than the crowd answers themselves. The scenarios involving crowd answer aggregation need further study. Our studies concentrate on two types of annotations: categorical labels and text answers. On the one hand, we studied the scenario of answer aggregation on crowd categorical labels in classification tasks, with LLMs used as creators of the labels. We propose a Crowd-LLM hybrid aggregation method, finding that adding LLM labels from good LLMs to existing crowdsourcing datasets can enhance the quality of the aggregated labels of the datasets. On the other hand, we also explored text answer aggregation and assessed LLMs as aggregators in close-ended crowd text answer scenarios. We proposed a hybrid aggregation approach within a Creator-Aggregator Multi-Stage (CAMS) framework. Experiments demonstrate that our approach can further improve the answer quality based on the combinations of three resources of workers and answers. These findings have been published at ICASSP 2024 and EMNLP 2024.
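The aggregation methods in the cited papers are more elaborate than this; purely as an illustration of the hybrid idea, the snippet below does a weighted majority vote in which LLM labels are given an assumed higher weight than crowd labels.

```python
from collections import Counter

def hybrid_aggregate(crowd_labels, llm_labels, llm_weight=2):
    """Toy Crowd+LLM aggregation: weighted majority vote, where labels from
    (presumably good) LLMs count more than individual crowd labels."""
    votes = Counter()
    for label in crowd_labels:
        votes[label] += 1
    for label in llm_labels:
        votes[label] += llm_weight
    return votes.most_common(1)[0][0]

# One classification item annotated by five crowd workers and two LLMs.
crowd = ["positive", "negative", "positive", "negative", "negative"]
llms = ["positive", "positive"]
print(hybrid_aggregate(crowd, llms))   # "positive"
```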
-
Sota KANEKO, Seiji YAMADA
Session ID: 4M1-OS-14a-01
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
-
Takumi TSUJIYAMA, Seiji YAMADA, Takashi ONODA
Session ID: 4M1-OS-14a-02
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
If trust in AI breaks down due to over-trust or under-trust by humans, achieving high performance in human-AI collaborative decision-making becomes difficult. To address this issue, the human-AI trust relationship needs to be optimized by adaptively calibrating trust. Against this background, prior research proposed a trust calibration AI that automatically detects over-trust or under-trust and encourages humans to calibrate their trust in AI. This AI requires a cognitive/AI performance model to estimate the problem-solving ability of both humans and AI. However, specific methods to create this model do not exist at present. Therefore, this study proposes a method to construct a cognitive/AI performance model using a Support Vector Machine (SVM) classification model. To evaluate the effectiveness of this method, an experiment was conducted using a chest X-ray interpretation task as a case study.
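The actual features and labels used for the cognitive/AI performance model are not given in this abstract; the sketch below only shows the generic shape of such an SVM model, with invented case features and correctness labels.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical training data: each row describes one chest X-ray case by simple
# features (e.g. lesion size, contrast, location code), and the label records
# whether the decision-maker answered that case correctly (1) or not (0).
rng = np.random.default_rng(0)
case_features = rng.random((200, 3))
answered_correctly = (case_features[:, 0] + 0.5 * case_features[:, 1] > 0.8).astype(int)

# The SVM-based performance model predicts, for a new case, whether the
# human (or the AI) is likely to solve it; this estimate can drive trust calibration.
model = SVC(kernel="rbf", probability=True).fit(case_features, answered_correctly)
new_case = rng.random((1, 3))
print(model.predict_proba(new_case))   # [P(incorrect), P(correct)]
```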
-
Mibuki TAKAGI, Kazunori TERADA
Session ID: 4M1-OS-14a-03
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
-
Yosuke FUKUCHI, Seiji YAMADA
Session ID: 4M1-OS-14a-04
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
-
Keito MIYAKE, Kumi OZAKI, Seiji YAMADA
Session ID: 4M1-OS-14a-05
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
AI technology in healthcare has made remarkable progress, with continuous improvements in diagnostic support accuracy and efficiency. However, medical professionals sometimes prioritize human judgment over AI despite recognizing the high performance of the systems they use, a phenomenon known as "algorithm aversion". In healthcare settings, medical errors remain a serious concern, and algorithm aversion may lead to overlooking human errors that AI support systems could prevent. Therefore, properly addressing algorithm aversion is essential for improving safety where AI assistance is available. This study quantitatively analyzes how psychological factors influence AI output usage rates (reliance rate) in shaping medical professionals' attitudes toward AI systems, focusing on their sense of control and responsibility. The analysis employs questionnaire items to examine correlations with reliance rates through statistical analysis. The findings are expected to guide human-AI interactions in medical settings while contributing to the theoretical foundation for addressing a crucial challenge: the collaboration between humans and AI.
-
Takahiro TSUMURA, Seiji YAMADA
Session ID: 4M2-OS-14b-01
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
When a person is assisted by another person, he or she may want to give back to that person. In recent years, more and more agents have been collaborating with people, but because of differences in individual abilities, not all agents perform tasks with the same capacity. In this study, we investigated whether empathy and trust toward an agent are promoted when a person and multiple agents perform a common typing task within a time limit and one agent performs the remaining task instead. To investigate whether the principle of reciprocity toward agents is at work, we also prepared agents in four different colors. The results from 392 participants showed that people do not identify agents individually, and that being helped by an agent has a strong impact on empathy and trust. The results indicate that evaluations of agents who help people also increase evaluations of similar agents who did not help people, which may help make agents more acceptable in a society where the use of agents is increasing.
-
Examining the Role of Robot Altruism in Human Prosociality
Chenlin HANG, Masahiro SHIOMI, Seiji YAMADA
Session ID: 4M2-OS-14b-02
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
This study investigates the impact of robot self-sacrifice on human trust and prosocial behaviors. While existing research in human-robot interaction (HRI) often delves into moral dilemmas, such as the trolley problem, this work shifts focus to practical contexts where robots engage in altruistic actions, like sacrificing their own battery to assist a user. In an experiment involving 30 participants, results revealed that robots exhibiting self-sacrificial behavior significantly encouraged prosocial actions, although perceptions of the robots' social attributes remained unchanged across conditions. These findings suggest that while self-sacrifice does not necessarily enhance how robots are perceived, it can effectively promote cooperative behaviors among humans. This research contributes to the development of socially interactive robots capable of fostering prosocial dynamics in human-robot coexistence.
-
Mari SAITO, Seiji YAMADA
Session ID: 4M2-OS-14b-03
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
-
Hikaru MATSUZAKI, Tomoyuki MAEKAWA, Kentaro ISHII, Michita IMAI
Session ID: 4M2-OS-14b-04
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
We propose Interactive-SmartClerk, a robot system for recommending items to a first-time customer by mimicking a store clerk. Interactive-SmartClerk solves the cold-start problem by asking the customer's preferences for a few items. Even if the number of samples of the customer's preferences is small, Interactive-SmartClerk can make recommendations that match the customer's preferences. Even if the customer's preferences are very different from the general preferences, recommendations that differ from current trends can be made by capturing the individuality of the customer's preferences. In the current study, we designed Interactive-SmartClerk to recommend three dresses from a dataset of 100 dress images. We collected 647 people's preferences for the dress images through crowdsourcing. We evaluated the performance of Interactive-SmartClerk and found that it performed better than the baseline system that always recommends commonly preferred dresses. We also found that Interactive-SmartClerk adapted to the preferences of minority users who preferred dresses that were not widely preferred. We carried out a case study using a real robot, and Interactive-SmartClerk recommended appropriate dresses for people with various preferences.
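Interactive-SmartClerk's actual recommendation algorithm is not described in this abstract; the snippet below is only a hypothetical cold-start sketch in which a few elicited answers are matched against a crowd preference matrix and the liked items of the most similar respondents are recommended. All numbers are placeholders (only the 647-person, 100-image scale follows the abstract).

```python
import numpy as np

rng = np.random.default_rng(0)
n_people, n_items = 647, 100
# Hypothetical crowd preference matrix: 1 = liked the dress image, 0 = did not.
crowd_prefs = (rng.random((n_people, n_items)) > 0.5).astype(float)

def recommend(asked_items, answers, k=3):
    """Cold-start sketch: ask about a few items, find the crowd members whose
    answers on those items are most similar, and recommend the dresses they
    liked most among the items not asked about."""
    sims = 1.0 - np.abs(crowd_prefs[:, asked_items] - answers).mean(axis=1)
    neighbors = np.argsort(sims)[-50:]            # 50 most similar respondents
    scores = crowd_prefs[neighbors].mean(axis=0)
    scores[asked_items] = -1.0                    # do not re-recommend asked items
    return np.argsort(scores)[-k:][::-1]

print(recommend(asked_items=[3, 17, 42], answers=np.array([1.0, 0.0, 1.0])))
```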
-
Ryuki MATSUOKA, Shiro KUMANO, Michita IMAI, Hiromi NARIMATSU
Session ID: 4M2-OS-14b-05
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
-
Yuki SAKAMOTO, Takahisa UCHIDA, Hiroshi ISHIGURO
Session ID: 4M3-OS-14c-01
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
-
Motoaki SATO, Sota KITADA, Kazunori TERADA
Session ID: 4M3-OS-14c-02
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
-
Ryuhei MIYAZTO, Hsin-Tai WU, Kei HARADA, Kazushi OKAMOTO, Atsushi SHIB ...
Session ID: 4M3-OS-14c-03
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
Automatic document summarization is a technique that extracts important elements from an original document and condenses them into short sentences. In this research, we propose an aspect-based summarization method for novels that generates summaries focusing on specific aspects. Specifically, we utilize a Large Language Model (LLM) to extract the relationships between characters and events within the text and identify the aspect of each part of the novel based on these relationships. Subsequently, we collect the parts corresponding to the target aspect and generate a summary sentence for each aspect. To evaluate the summaries, we compare the answer accuracy on question-answer (QA) pairs created from the set of sentences corresponding to each aspect, using either the original document or the generated summary as the reference. The results demonstrate that the proposed method can generate summaries that comprehensively reflect the target aspect, a capability that was difficult to achieve with conventional methods.
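The prompts and LLM used by the authors are not reproduced here; the following is only a structural sketch of the two-step pipeline described above, with a hypothetical ask_llm stand-in that must be replaced by an actual LLM call.

```python
# Minimal pipeline sketch; `ask_llm` is a hypothetical stand-in for whatever
# LLM API is actually used, and the prompts are illustrative only.
def ask_llm(prompt: str) -> str:
    raise NotImplementedError("replace with a call to your LLM of choice")

def summarize_by_aspect(chapters, target_aspect):
    relevant_parts = []
    for chapter in chapters:
        # Step 1: let the LLM label which aspects each part of the novel covers,
        # based on the character relationships and events it mentions.
        aspects = ask_llm(
            f"List the aspects (character relationships, events) covered by:\n{chapter}")
        if target_aspect in aspects:
            relevant_parts.append(chapter)
    # Step 2: summarize only the collected parts, focused on the target aspect.
    joined = "\n".join(relevant_parts)
    return ask_llm(
        f"Summarize the following passages focusing only on '{target_aspect}':\n{joined}")
```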
-
MASATO TANAKA, TAKESHI HARA, TSUGUMI MATSUO, SEIJI YAMADA
Session ID: 4M3-OS-14c-04
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
[Purpose] We furthered the study "Learning and understanding lung structure described by the context of radiological, anatomical, and pathological images," presented at the 38th Annual Conference of the Japanese Society for Artificial Intelligence in 2024, and attempted to generate learning content accurate enough for actual student use by inputting into an LLM (GPT-4o) the individual image data that form the context and the linguistic information that explains them. [Method] LLM learning was performed using prompt learning and fine-tuning. For prompt learning, multimodal data combining image data and linguistic information were used as learning data. The output was educational content in a Q&A format. [Results] When generating educational resources using one-shot learning with prompts, the resulting Q&As were highly logical and promoted understanding of the input learning information. On the other hand, when fine-tuning was used, Q&As containing incorrect information that deviated from the learning data were occasionally found. Based on this experiment, we believe that generating learning resources using prompt engineering is suitable for practical use.
-
Marina MIKAMI, Takuya MATUZAKI
Session ID: 4N1-GS-7-01
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
The purpose of this study is to create an accent dictionary for low-frequency words by automatically detecting the phonetic accent position of words based on TV speech data, in which many low-frequency words such as proper nouns and new words appear. First, the F0 value of each phoneme, the mora pronunciation, the mora position within a word, and the part of speech were extracted as features from the speech data, and a classifier was created using these as input. The model was trained on speech data from the Corpus of Spontaneous Japanese (CSJ) and used to predict the accent positions of the nouns that appear in LaboroTVSpeech. The accuracy of the resulting accent dictionary was only 77-86%.
-
Komei HIRUTA, Yosuke YAMANO, Hideaki TAMORI
Session ID: 4N1-GS-7-02
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
-
Satoshi KAWAMURA, Kohei YAMAMOTO, Hideaki TAMAI
Session ID: 4N1-GS-7-03
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
The detection of anomalous sounds is crucial for the efficient operation and maintenance of factory machinery. However, obtaining real-world anomalous data is often impractical, limiting the training of anomaly detection models. To address this issue, classification-based approaches utilizing pseudo-anomalous data have garnered significant interest. Pseudo-anomalies can be generated by treating normal sounds from non-target machines as anomalies or by applying perturbations to normal sounds using neural networks. While the former approach necessitates machine-specific models, the latter may fail to adequately capture the unique characteristics of target machines. This study investigates various sound data augmentation techniques for generating pseudo-anomalies and systematically evaluates their combinations to improve anomaly detection performance. Experimental results demonstrate that certain augmentations substantially enhance detection accuracy. Furthermore, we analyze the individual impact of each method and discuss the influence of augmented sound data on anomaly detection models.
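The augmentation techniques evaluated in this work are not listed in the abstract; as purely hypothetical examples of perturbing normal sounds into pseudo-anomalies, the snippet below applies band-limited noise injection and a dropout-like time mask to a toy waveform.

```python
import numpy as np

rng = np.random.default_rng(0)

def add_band_noise(x, sr=16000, low=2000, high=4000, snr_db=10):
    """Inject band-limited noise into a normal machine sound (FFT masking)."""
    noise = rng.standard_normal(len(x))
    spec = np.fft.rfft(noise)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sr)
    spec[(freqs < low) | (freqs > high)] = 0.0
    noise = np.fft.irfft(spec, n=len(x))
    gain = np.sqrt((x ** 2).mean() / (10 ** (snr_db / 10) * (noise ** 2).mean() + 1e-12))
    return x + gain * noise

def time_mask(x, max_frac=0.1):
    """Zero out a random short segment, mimicking a dropout-like anomaly."""
    n = int(len(x) * rng.uniform(0.01, max_frac))
    start = rng.integers(0, len(x) - n)
    y = x.copy()
    y[start:start + n] = 0.0
    return y

normal = np.sin(2 * np.pi * 120 * np.arange(16000) / 16000)   # toy "normal" sound
pseudo_anomalies = [add_band_noise(normal), time_mask(normal)]
print([p.shape for p in pseudo_anomalies])
```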
-
Gen SATO, Yusuke IKEDA
Session ID: 4N1-GS-7-04
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
Sound field simulation is used in various applications, such as acoustic design of concert halls and generation of sound fields in virtual spaces. In particular, sound field simulation based on the wave equation can provide highly accurate results; however, its computation time needs to be improved. Therefore, sound field estimation using deep learning has been proposed to reduce the computation time required for the simulation. On the other hand, when deep learning is used, the estimated sound field is typically limited to a fixed grid. Therefore, we propose using Deep Operator Networks (DeepONets) to estimate the sound field at arbitrary locations. In particular, we aim to improve the generalization performance of DeepONet for different room shapes by introducing a convolutional neural network into the branch network of DeepONet. Experiments were conducted in rectangular room sound fields, and the results demonstrated that DeepONet can estimate the sound field with high accuracy, achieving a mean SNR of approximately 24 dB.
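The authors' CNN-branch DeepONet is not specified in detail here; the sketch below only shows the generic DeepONet structure (branch net over the input condition, trunk net over query coordinates, inner-product output) with an assumed room-geometry image input and (x, y, t) queries.

```python
import torch
import torch.nn as nn

class DeepONet(nn.Module):
    """Minimal DeepONet: a branch net encodes the input condition (here, a room
    geometry map), a trunk net encodes a query coordinate, and the output is
    their inner product, so the field can be evaluated at arbitrary points."""
    def __init__(self, latent=64):
        super().__init__()
        self.branch = nn.Sequential(                # CNN branch over a room-shape map
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
            nn.Flatten(), nn.Linear(32, latent))
        self.trunk = nn.Sequential(                 # MLP trunk over (x, y, t) queries
            nn.Linear(3, 64), nn.Tanh(), nn.Linear(64, latent), nn.Tanh())

    def forward(self, room_map, query):
        b = self.branch(room_map)                    # (batch, latent)
        t = self.trunk(query)                        # (batch, n_points, latent)
        return torch.einsum("bl,bpl->bp", b, t)      # field value at each query point

net = DeepONet()
room = torch.randn(2, 1, 64, 64)                     # room geometry maps (assumed input)
pts = torch.rand(2, 500, 3)                          # 500 arbitrary (x, y, t) queries
print(net(room, pts).shape)                          # torch.Size([2, 500])
```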
-
Tasuku KITADE, May Phyo KHAING, Masanori TSUJIKAWA, Koji OKABE, Ryo IS ...
Session ID: 4N1-GS-7-05
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
We are investigating a medical documentation assistant system that aims to improve the efficiency of record and report creation by physicians by automatically generating medical documents from speech recognition results. For this system, it is essential that medical terminology is recognized with high accuracy. Since medical data are difficult to obtain, we propose a method that corrects speech recognition errors using word reading information, without using such data. Specifically, we detect speech recognition errors in the recognition results using a Large Language Model (LLM) and obtain the readings of words identified as recognition errors through morphological analysis. Furthermore, we extract words with similar readings and finally select the appropriate word among them using the LLM to correct the recognition errors. Evaluation experiments using simulated medical speech recognition results confirmed that the proposed method achieved a 12.9% reduction in errors on medical terms.
-
Motoki CHIBA, Osamu ITO, Takumi IIDA
Session ID: 4N2-GS-7-01
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
With the advancement of autonomous driving technologies, accurately estimating traffic accident risks has become increasingly important. Advanced Driver-Assistance Systems (ADAS), which use sensors for obstacle detection and provide collision mitigation and evasive steering, have reduced traffic accidents. However, the diversity and complexity of accident scenarios limit the potential of traditional ADAS technologies to achieve further reductions in accidents. Recently, Vision-Language Models (VLMs) have been applied to the autonomous driving field. While VLMs possess broad knowledge and achieve reasonable accuracy in traffic scene understanding, they struggle to evaluate accident risks involving detailed and complex factors. Fine-tuning a VLM specialized for traffic accident risk estimation is necessary, but the significant cost of data collection and annotation poses practical challenges. This study proposes a traffic accident risk explanation method using multimodal Retrieval-Augmented Generation (RAG) to improve explanation performance efficiently with minimal data. By leveraging a small amount of manually annotated data for retrieval and reference, the proposed method enhances explanatory capabilities for previously unseen images. Experimental results show that the proposed method improves traffic accident scene understanding compared to the baseline model.
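The retrieval setup (encoder, index, prompt format) is not given in this abstract; the snippet below is a hypothetical sketch of the retrieval-and-prompt step of a multimodal RAG pipeline, where embed_image stands in for any off-the-shelf image encoder and all prompts are invented.

```python
import numpy as np

def embed_image(image) -> np.ndarray:
    # Hypothetical stand-in: replace with a real image encoder (e.g. a CLIP-like model).
    raise NotImplementedError("replace with a real image encoder")

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

def build_prompt(query_image, db_embeddings, db_annotations, k=3):
    """Retrieve the k most similar annotated scenes and splice their expert
    risk explanations into the prompt given to the VLM."""
    q = embed_image(query_image)
    scores = [cosine(q, e) for e in db_embeddings]
    top = np.argsort(scores)[-k:][::-1]
    references = "\n".join(db_annotations[i] for i in top)
    return (f"Reference accident-risk explanations for similar scenes:\n{references}\n\n"
            "Explain the accident risks in the attached driving scene.")
```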
-
Masahiro OKANO, Tomoyasu NANAUMI, Shuhei TSUKUI, Junuchiro FUJII
Session ID: 4N2-GS-7-02
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
In recent years, due to the decreasing number of skilled engineers in the civil engineering field, there is an increasing demand for further efficiency improvements in tasks such as geological surveys. The quality evaluation of cores in boring surveys relies on visual observation and measurement by skilled engineers based on multiple evaluation indicators, such as maximum core length and Rock Quality Designation (RQD). Moreover, conventional automation methods have been limited to recognition through image binarization processing, requiring parameter adjustments depending on the imaging environment and geological variations. This study focuses on maximum core length and RQD, which have relatively clear quantitative criteria and can be determined from images. As a generalizable automatic measurement method that does not require additional training, we propose using the features of normal maps obtained from an image-based foundation model. To verify the effectiveness of the proposed method, accuracy evaluations are conducted using practical industry standards as criteria (maximum core length: MSE of 3 or less; RQD: MSE of 10 or less). The results confirmed the feasibility of automatic measurement within practical application standards.
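RQD is conventionally defined as the percentage of the core run made up of intact pieces 10 cm or longer; measuring the piece lengths from images is the paper's contribution and is not shown here. The snippet below only illustrates how maximum core length and RQD follow from measured piece lengths, with invented numbers.

```python
def core_metrics(piece_lengths_cm, run_length_cm):
    """Compute maximum core length and RQD from measured core-piece lengths.
    RQD (%) = sum of pieces >= 10 cm / total core-run length * 100."""
    max_core_length = max(piece_lengths_cm)
    rqd = 100.0 * sum(p for p in piece_lengths_cm if p >= 10.0) / run_length_cm
    return max_core_length, rqd

# Hypothetical pieces measured from one 100 cm core run.
pieces = [23.0, 5.0, 14.0, 8.0, 31.0, 12.0, 7.0]
print(core_metrics(pieces, run_length_cm=100.0))   # (31.0, 80.0)
```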
-
Taketo SASAKI, Ryohei ORIHARA, Yasuyuki TAHARA, Akihiko OHSUGA, Yuichi ...
Session ID: 4N2-GS-7-03
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS