Proceedings of the Annual Conference of JSAI

Real-World Robot Control by Deep Active Inference with Temporally Hierarchical World Model

Kentaro FUJII, Shingo MURATA

Session ID: 1B3-OS-41a-01
Published: 2025
Released on J-STAGE: July 01, 2025

DOI: https://doi.org/10.11517/pjsai.JSAI2025.0_1B3OS41a01

CONFERENCE PROCEEDINGS FREE ACCESS

Deep learning-based robots are expected to achieve various goals in real-world environments. To realize this, it is essential to handle environmental uncertainties using both goal-directed and exploratory actions. Deep active inference (DAIf) offers a promising approach but suffers from high computational costs and requires strong representational capabilities for modeling environmental dynamics. To address these challenges, we propose a novel DAIf framework. The framework comprises a hierarchical world model, an abstract world model, and an action model. The hierarchical world model learns environmental dynamics by introducing a temporal hierarchy, enhancing its representational capability. The action model learns latent states of action sequences as abstract actions. The abstract world model learns the relationship between the hierarchical world model's representation and the abstract actions, reducing computational costs. Robotic object manipulation experiments in uncertain environments demonstrated that the framework reduced computational costs compared to conventional approaches, achieved diverse goals, and generated exploratory actions to address environmental uncertainties.

Multimodal Prompt Analysis in Robotic Manipulation Tasks

Daiki TAKAHASHI, Masahiro SUZUKI, Yutaka MATSUO

Session ID: 1B3-OS-41a-02
Published: 2025
Released on J-STAGE: July 01, 2025

DOI: https://doi.org/10.11517/pjsai.JSAI2025.0_1B3OS41a02

CONFERENCE PROCEEDINGS FREE ACCESS

In this study, we analyzed multimodal prompts in a robot manipulation task, focusing on the interaction between textual and visual inputs. Using the VIMA benchmark, we evaluated the effects of modality dependence and the input order of observation tokens on the task success rate. The results revealed an overdependence on specific modalities and input order, indicating important issues in achieving robust multimodal learning. Our findings contribute to improving the generalizability of models in robot tasks.

6D Multi-View NewtonianVAE: A World Model-Based Approach for 6D Pose Estimation and Control

Mai TERASHIMA, Katsuyoshi MAEYAMA, Pedro Miguel Uriguen ELJURI, Yuanyu ...

Session ID: 1B3-OS-41a-03
Published: 2025
Released on J-STAGE: July 01, 2025

DOI: https://doi.org/10.11517/pjsai.JSAI2025.0_1B3OS41a03

CONFERENCE PROCEEDINGS FREE ACCESS

In this study, we propose a method for learning a latent space representing 6D poses and performing 6D control using NewtonianVAE. NewtonianVAE, as a type of world model, learns the dynamics of the environment as a latent space from observational data and performs proportional control based on the estimated position. By using NewtonianVAE, position estimation can be achieved based on the internal dynamics of the environment rather than an external coordinate system. While previous studies have applied NewtonianVAE to translational control, 6D control has not been investigated. To address this, we propose 6D Multi-View NewtonianVAE (6D-MNVAE), which extends the latent space by incorporating a rotation vector. In our experiments, we evaluated whether 6D-MNVAE can estimate 6D poses in the latent space and perform 6D control towards a target pose. Experimental results showed that 6D-MNVAE achieved 6D control with an accuracy within 7 mm and 0.02 rad. Furthermore, our method does not require feature engineering or annotation and enables 6D control using only RGB image information.

Real-World Robot Control via Play Data Augmentation with a World Model

Yuta NOMURA, Shingo MURATA

Session ID: 1B3-OS-41a-04
Published: 2025
Released on J-STAGE: July 01, 2025

DOI: https://doi.org/10.11517/pjsai.JSAI2025.0_1B3OS41a04

CONFERENCE PROCEEDINGS FREE ACCESS

The development of generalist robots capable of performing diverse tasks in various environments is highly anticipated. While imitation learning and reinforcement learning are effective approaches, they present a trade-off between generalization ability and data efficiency, often requiring large amounts of data to achieve high generalization. To address this challenge, we introduce play data, collected through human teleoperation driven by curiosity. This data serves as expert demonstrations with high generalization potential but may require additional data for out-of-distribution tasks. To overcome this limitation, we propose a play-based action generation framework that augments play data within a world model. By learning from both real and synthetically generated play data, the framework enables robots to generate actions toward various goal states. Additionally, autonomous data collection within the world model reduces reliance on real-world data collection. Experiments in both simulated and real-world robotic environments demonstrate that the proposed framework improves generalization ability and data efficiency by facilitating novel data collection within the world model.

Self and Opponent Modeling for Ensuring Markovian and Reward-Predictive Representations in Partially Observable Multi-Agent Environments

Kai YAMASHITA, Masahiro SUZUKI, Yutaka MATSUO

Session ID: 1B3-OS-41a-05
Published: 2025
Released on J-STAGE: July 01, 2025

DOI: https://doi.org/10.11517/pjsai.JSAI2025.0_1B3OS41a05

CONFERENCE PROCEEDINGS FREE ACCESS

Recent advances in reinforcement learning for multi-agent environments have underscored the importance of Opponent-Modeling, where agents infer internal states or strategies of opponents. Recent studies have explored AutoEncoder-based latent representations that limit access to opponent information during execution for Opponent-Modeling in partially observable environments. In reinforcement learning, the state input to the policy and value function in a Markov decision process (MDP) must satisfy the Markov property and serve as a sufficient statistic for future reward prediction. However, under partial observability, many opponent modeling approaches focus solely on reconstructing opponent information in the latent representation, without ensuring that it retains Markovian or reward-predictive properties. To overcome this limitation, we propose a representation learning method that models not only the opponent but also the agent itself. We validated our method through experiments, demonstrating its effectiveness in improving reinforcement learning performance.

Object-Centric Transformer World Models and Causality-aware Policy

Yosuke NISHIMOTO, Takashi MATSUBARA

Session ID: 1B4-OS-41b-01
Published: 2025
Released on J-STAGE: July 01, 2025

DOI: https://doi.org/10.11517/pjsai.JSAI2025.0_1B4OS41b01

CONFERENCE PROCEEDINGS FREE ACCESS

[in Japanese]

Learning Hierarchical State Space Models via Surprise- and Uncertainty-based Chunking

Tomoshi IIYAMA, Masahiro SUZUKI, Yutaka MATSUO

Session ID: 1B4-OS-41b-02
Published: 2025
Released on J-STAGE: July 01, 2025

DOI: https://doi.org/10.11517/pjsai.JSAI2025.0_1B4OS41b02

CONFERENCE PROCEEDINGS FREE ACCESS

Complex real-world tasks are often long-horizon, making world models that can accurately predict far into the future crucial for AI agents. Hierarchical state-space models, which incorporate temporal hierarchies in latent states, have shown promise for long-term prediction by segmenting time series into subsequences and learning temporal abstraction. However, existing methods relying on rigid subsequence length assumptions or significant changes in observation often perform poorly in environments where optimal subsequence lengths vary or environmental changes occur gradually. This study proposes a method for learning hierarchical state-space models based on the discovery of frequently occurring, highly reusable patterns, drawing insights from chunking mechanisms in cognitive science. Our method extracts frequent patterns by utilizing changes in surprise and uncertainty in low-level latent states. Leveraging these patterns to learn high-level latent states reduces the complexity of transitions, enabling efficient long-term prediction. Experiments on video prediction tasks show that our method outperforms the baselines, underscoring the effectiveness of hierarchical structures derived from frequent patterns for long-term prediction.

Deep Active Inference Framework for Mobile Robot Exploration and Navigation

Riko YOKOZAWA, Kentaro FUJII, Yuta NOMURA, Shingo MURATA

Session ID: 1B4-OS-41b-03
Published: 2025
Released on J-STAGE: July 01, 2025

DOI: https://doi.org/10.11517/pjsai.JSAI2025.0_1B4OS41b03

CONFERENCE PROCEEDINGS FREE ACCESS

[in Japanese]

Cooperative Multi-Agent Reinforcement Learning Based on World Models and Signal Sharing

Taisuke TAKAYAMA, Naoto YOSHIDA, Tadahiro TANIGUCHI

Session ID: 1B4-OS-41b-04
Published: 2025
Released on J-STAGE: July 01, 2025

DOI: https://doi.org/10.11517/pjsai.JSAI2025.0_1B4OS41b04

CONFERENCE PROCEEDINGS FREE ACCESS

[in Japanese]

On the Robustness of Object-Centric Representations for Model-Based Reinforcement Learning

Akihiro NAKANO, Masahiro SUZUKI, Yutaka MATSUO

Session ID: 1B4-OS-41b-05
Published: 2025
Released on J-STAGE: July 01, 2025

DOI: https://doi.org/10.11517/pjsai.JSAI2025.0_1B4OS41b05

CONFERENCE PROCEEDINGS FREE ACCESS

Model-based reinforcement learning (RL) is a promising approach to learning to control agents in a sample-efficient manner, but often struggles with generalization beyond tasks it was trained on. While previous work has explored using pretrained visual representations (PVR) to improve generalization, these approaches have not outperformed representations learned from scratch in out-of-distribution (OOD) settings. In this work, we propose to incorporate object-centric representations, which have demonstrated strong OOD generalization capabilities by learning compositional representations, into model-based RL with PVR. We investigate whether this object-centric inductive bias improves both sample efficiency and task performance across in-distribution and OOD environments.

Web Agent with Meta-Prompt-Driven Expert Integration.

Satoi YAMAGUCHI, Yuna MASTUNAGA, Takayasu IKEDA, Masahiro SUZUKI, Yuta ...

Session ID: 1B5-OS-41c-01
Published: 2025
Released on J-STAGE: July 01, 2025

DOI: https://doi.org/10.11517/pjsai.JSAI2025.0_1B5OS41c01

CONFERENCE PROCEEDINGS FREE ACCESS

This research explores a novel approach to task completion that does not rely on extensive pre-training in the rapidly evolving field of Web Agents. A major challenge existing Agents face is the automation of tasks that involve image recognition. However, previous methods highlight limited compatibility between image recognition performance and approaches that do not require pre-training. To address this limitation, we propose an approach that strategically integrates several expert methods by employing a meta-prompt within an LLM, achieving advanced environmental analysis that enables performance improvement. We evaluate the proposed approach using MiniWob++. Additionally, we compare existing Agents to the proposed approach to assess the task success rate. This paper offers insight into the potential of meta-prompt integration using an LLM to improve the task completion rate, suggesting that the extensive data collection and training required by current agents may become less necessary.

Evaluation of Offline Pretraining Methods for World Models Using Instruction Expansion with Large Language Models and Two-Stage Pretraining

Yusei KOEN, Yuji FUJIMA, Yasuhiro TAKEDA, Makoto KAWANO, Yutaka MATSUO

Session ID: 1B5-OS-41c-02
Published: 2025
Released on J-STAGE: July 01, 2025

DOI: https://doi.org/10.11517/pjsai.JSAI2025.0_1B5OS41c02

CONFERENCE PROCEEDINGS FREE ACCESS

Recent studies have demonstrated that offline data, such as text, can significantly enhance the efficiency of task learning through the pretraining of world models. In particular, Dynalang has demonstrated its effectiveness in leveraging task instructions and environmental dynamics to enhance performance. However, its application has been primarily limited to the Messenger task, leaving its generalizability to other tasks and the impact of text type and quality in pretraining insufficiently explored. In this study, we extend Dynalang's approach to the simpler HomeGrid task to evaluate its generalizability. We also explore the use of large language models (LLMs) to generate and expand domain-specific text, aiming to further improve initial task performance and sample efficiency. Additionally, we propose and assess a two-stage pretraining strategy: general text is first used to develop fundamental language understanding, followed by domain-specific text to strengthen task-specific capabilities. Our findings highlight the potential of expanding the applicability of text-based pretraining strategies.

The performance of Large Language Models on Wisconsin Card Sorting Task and an analysis of the answer

Daiki GOTO, Hayato IDEI, Yuji SHIOZUKA, Tetsuya OGATA

Session ID: 1B5-OS-41c-03
Published: 2025
Released on J-STAGE: July 01, 2025

DOI: https://doi.org/10.11517/pjsai.JSAI2025.0_1B5OS41c03

CONFERENCE PROCEEDINGS FREE ACCESS

[in Japanese]

JDERW: Japanese LLM Deduction benchmark requiring a world model

TAISEI OZAKI, Takumi MATSUSHITA, Tsuyoshi MIURA, Shohei TANIGUCHI, Yut ...

Session ID: 1B5-OS-41c-04
Published: 2025
Released on J-STAGE: July 01, 2025

DOI: https://doi.org/10.11517/pjsai.JSAI2025.0_1B5OS41c04

CONFERENCE PROCEEDINGS FREE ACCESS

Recent studies suggest that Large Language Models (LLMs) demonstrate capabilities beyond simple next-token prediction, leading to discussions about their potential acquisition of world models. This paper introduces Basic-JDERW, a deductive reasoning benchmark dataset that requires fundamental world understanding. The dataset comprises 103 QA tasks that necessitate the application of basic world models, ranging from physical phenomena comprehension to common sense reasoning and action planning, categorized into six types: causal reasoning, temporal reasoning, spatial reasoning, abstract concept reasoning, common sense reasoning, and planning. Through evaluation experiments with eight LLMs, we analyzed model performance across categories and examined correlations with existing benchmarks. Notably, llama3.3-70B-instruct demonstrated superior performance in categories requiring physical understanding, such as temporal and spatial reasoning. This research offers new perspectives on evaluating basic world understanding capabilities glimpsed through LLMs' reasoning abilities and aims to contribute to understanding the relationship between linguistic reasoning and world comprehension capabilities.

Perspective for Generative Emergent Communication

Is a large-scale language model a collective world model?

TADAHIRO TANIGUCHI, Ryo UEDA, Tomoaki NAKAMURA, Masahiro SUZUKI, Akira ...

Session ID: 1B5-OS-41c-05
Published: 2025
Released on J-STAGE: July 01, 2025

DOI: https://doi.org/10.11517/pjsai.JSAI2025.0_1B5OS41c05

CONFERENCE PROCEEDINGS FREE ACCESS

This study proposes a unified theoretical framework called "generative emergent communication" (generative EmCom) that connects emergent communication, world models, and large language models from the perspective of collective predictive coding. This framework not only allows us to understand symbol emergence as (decentralized) representation learning of external symbols within a machine learning framework but also enables us to interpret large language models as collective world models that integrate experiences from multiple agents. This research provides a novel perspective linking world models and symbol emergence, while discussing future research directions.

EMS construction including behavioral change model for EV users

Riho HOSHUYAMA, Masaki INOUE

Session ID: 1D3-OS-24a-01
Published: 2025
Released on J-STAGE: July 01, 2025

DOI: https://doi.org/10.11517/pjsai.JSAI2025.0_1D3OS24a01

CONFERENCE PROCEEDINGS FREE ACCESS

In this work, we focus on the utilization of EV storage batteries in energy management systems. EV storage batteries are expected to complement the electricity demand and supply from renewable energy sources. However, the most important role of EVs is as a means of transportation. In order to collect electricity in this context, EV users who plan to go out are given incentives to change their schedules for using the vehicle as transportation. We modeled this behavioral change and devised an energy management system that compensates for power shortages at the lowest possible cost.

Relationship Modeling between Time Preference and Information Seeking Behavior in Job Hunting

Yuumi GOTO, Shuhei YAMAMOTO

Session ID: 1D3-OS-24a-02
Published: 2025
Released on J-STAGE: July 01, 2025

DOI: https://doi.org/10.11517/pjsai.JSAI2025.0_1D3OS24a02

CONFERENCE PROCEEDINGS FREE ACCESS

[in Japanese]

Predicting Consumption Behavior with LLMs Considering Behavioral Economics Characteristics

Hikaru UMADA, Shuhei YAMAMOTO

Session ID: 1D3-OS-24a-03
Published: 2025
Released on J-STAGE: July 01, 2025

DOI: https://doi.org/10.11517/pjsai.JSAI2025.0_1D3OS24a03

CONFERENCE PROCEEDINGS FREE ACCESS

[in Japanese]

Intervention Strategies to Control Defamatory Posts for Athletes on SNS Considering Social Preferences

OSUKE HAYASHI, Shuhei YAMAMOTO

Session ID: 1D3-OS-24a-04
Published: 2025
Released on J-STAGE: July 01, 2025

DOI: https://doi.org/10.11517/pjsai.JSAI2025.0_1D3OS24a04

CONFERENCE PROCEEDINGS FREE ACCESS

[in Japanese]

Generating Exercise-Promoting Messages by Utilizing Event Information through RAG

Ren TOMIYAMA, Yoshiaki TAKIMOTO, Takeshi KURASHIMA, Hiroyuki TODA

Session ID: 1D3-OS-24a-05
Published: 2025
Released on J-STAGE: July 01, 2025

DOI: https://doi.org/10.11517/pjsai.JSAI2025.0_1D3OS24a05

CONFERENCE PROCEEDINGS FREE ACCESS

[in Japanese]

Composer Classification from Classical Scores Using Vision Transformer and Evaluation of Relation to Modern Music

Hayato YASUDA, Akiko TAKAHASHI, Yusuke FUKAZAWA

Session ID: 1D4-OS-24b-01
Published: 2025
Released on J-STAGE: July 01, 2025

DOI: https://doi.org/10.11517/pjsai.JSAI2025.0_1D4OS24b01

CONFERENCE PROCEEDINGS FREE ACCESS

In this study, we propose a model that treats the sheet music of classical composers as image data and uses image recognition techniques for composer classification. Furthermore, we predict the relevance of classical composers' styles in the sheet music of modern composers, quantifying which classical composer's style is related to modern composers.

Research Paper Recommender System by Considering Users’ Information Seeking Behaviors

Zhelin XU, Shuhei YAMAMOTO, Hideo JOHO

Session ID: 1D4-OS-24b-02
Published: 2025
Released on J-STAGE: July 01, 2025

DOI: https://doi.org/10.11517/pjsai.JSAI2025.0_1D4OS24b02

CONFERENCE PROCEEDINGS FREE ACCESS

With the rapid growth of scientific publications, researchers need to spend more time searching for papers that align with their research interests. To address this challenge, paper recommendation systems have been developed to help researchers effectively identify relevant papers. One of the leading approaches to paper recommendation is content-based filtering, which recommends papers based on the overall similarity of papers. However, studies on user information seeking behaviors indicate that, in addition to evaluating the overall similarity, researchers also pay attention to specific sections of a paper to assess their relevance to their interests. For instance, users may check the method section to determine whether a candidate paper utilizes a method they are interested in. In this paper, we propose a content-based filtering recommendation method that takes this information seeking behavior into account, aiming to provide users with more relevant papers. Specifically, in addition to considering the overall content of a paper, our approach also considers three specific sections (background, method, and results) and assigns weights to them to better reflect user preferences. We conduct offline evaluations on the DBLP dataset, and the results demonstrate that the proposed method outperforms six baseline methods in terms of precision@5, recall@5, MRR, and MAP.

Feature analysis for dynamic live streaming content recommendation considering the relationship between streamers and audience

Makoto TAKEUCHI

Session ID: 1D4-OS-24b-03
Published: 2025
Released on J-STAGE: July 01, 2025

DOI: https://doi.org/10.11517/pjsai.JSAI2025.0_1D4OS24b03

CONFERENCE PROCEEDINGS FREE ACCESS

The content recommendation problem for live-streaming platforms presents several unique challenges not present in other media platforms. First, the recommended content is ad-hoc CGM content generated by the live streamer. Because it is dynamic, the audience is always restricted to choosing from the content being streamed when they visit the platform. In addition, the content is diverse and changes in real time, influenced by the communication between the audience and the live streamer, so there is also the problem of obtaining the streamed content's features. Furthermore, it has been pointed out that audience behavior patterns on live-streaming platforms include exploring new streamers and exploiting known streamers. Understanding and appropriately capturing the dynamics of these viewer states is considered essential for live-streaming content recommendation but has not been sufficiently studied. In this study, to investigate appropriate approaches to these problems specific to live-streaming platforms, we analyzed viewing behavior using user logs of a live-streaming platform.

Evaluation Functions of Algorithmic Recourse Incorporating Feature Selection Based on Relevance to Decision Criteria

Takumi ITO, Tomu TOMINAGA, Takeshi KURASHIMA

Session ID: 1D4-OS-24b-04
Published: 2025
Released on J-STAGE: July 01, 2025

DOI: https://doi.org/10.11517/pjsai.JSAI2025.0_1D4OS24b04

CONFERENCE PROCEEDINGS FREE ACCESS

Algorithmic recourse provides counterfactual action plans (recourse) for users to overturn negative AI decisions. It typically assumes that minimizing an objective function, which measures the distance between a user’s current and desired state, generates acceptable recourse. However, recent studies question this assumption, highlighting the need to revisit the objective function. In this study, we propose a novel objective function that excludes the influence of features irrelevant to AI decisions. These features are identified based on their correlation with decision outcomes, their importance in predictions, or users’ self-reports that they are irrelevant to decision outcomes. The proposed approach ensures such features remain unchanged in recourse. Using experimental data from a user study with a loan application scenario, we confirmed that minimizing the proposed objective function improves recourse acceptability. User self-reports were particularly effective in identifying irrelevant features. Based on these results, we discussed future directions for enhancing user-centered algorithmic recourse generation incorporating users’ prior knowledge.

Learning Generative Model of ECG from PPG and Application to Activity Classification Task

Yuta NAMBU, Masahiro KOHJIMA, Ryuji YAMAMOTO

Session ID: 1D4-OS-24b-05
Published: 2025
Released on J-STAGE: July 01, 2025

DOI: https://doi.org/10.11517/pjsai.JSAI2025.0_1D4OS24b05

CONFERENCE PROCEEDINGS FREE ACCESS

Heart information, especially ECG, is often used to estimate a person's internal state and behavior. However, accurate ECG measurement requires specialized equipment and the cooperation of medical staff, and although wearable devices that can measure ECG have appeared, there are some limitations. In contrast, since PPG measures blood flow in the wrist, it can only indirectly obtain heart information, but it can be measured more affordably and continuously with smartwatches and other devices. Therefore, this study focuses on generating ECG signals from easily measurable PPG signals. The method utilizes Rectified Flow to learn the shortest path between two distributions, enhancing ECG signal quality by incorporating peak position information from both signals. Experiments on a dataset in which ECGs and PPGs were measured simultaneously showed that our approach yields higher-quality ECGs than traditional diffusion models. Additionally, using generated ECGs as training data enhances activity classification performance compared to using PPGs.

Analysis of Features Related to Compensation Amounts in Civil Litigation for Marital Infidelity

Ryohei OGAWA, Hitoshi SUZUKI, Yusuke FUKAZAWA

Session ID: 1D5-OS-24c-01
Published: 2025
Released on J-STAGE: July 01, 2025

DOI: https://doi.org/10.11517/pjsai.JSAI2025.0_1D5OS24c01

CONFERENCE PROCEEDINGS FREE ACCESS

This study constructed a classification model for compensation amounts in civil litigation involving marital infidelity using machine learning. Features were designed based on prior knowledge and extracted from legal case documents using generative AI. SHAP analysis revealed that factors such as the duration of marriage, period of infidelity, and presence of children were significantly related to the compensation amounts.

Behavioral Analysis and Intervention Optimization of β-δ Discounting Agents in Progress-based Tasks

Yasunori AKAGI, Naoki MARUMO, Takeshi KURASHIMA

Session ID: 1D5-OS-24c-02
Published: 2025
Released on J-STAGE: July 01, 2025

DOI: https://doi.org/10.11517/pjsai.JSAI2025.0_1D5OS24c02

CONFERENCE PROCEEDINGS FREE ACCESS

[in Japanese]

Predicting the Number of Visitors to Stores Using Meta-Learning

Mimura TOMOHIRO, Ryo YAMADA

Session ID: 1D5-OS-24c-03
Published: 2025
Released on J-STAGE: July 01, 2025

DOI: https://doi.org/10.11517/pjsai.JSAI2025.0_1D5OS24c03

CONFERENCE PROCEEDINGS FREE ACCESS

[in Japanese]

Acquiring Cooperative Behavior Through Rewards in Multi-Agent Path Finding

HARUTO SUGAWARA, HIROYUKI TODA

Session ID: 1D5-OS-24c-04
Published: 2025
Released on J-STAGE: July 01, 2025

DOI: https://doi.org/10.11517/pjsai.JSAI2025.0_1D5OS24c04

CONFERENCE PROCEEDINGS FREE ACCESS

[in Japanese]

Optimization Method for Processes with Adjustment, by Multilayer Regression Networks with Measured Feedback

Yukinari HANDA, Taisuke SHIRAISHI

Session ID: 1E3-GS-10-01
Published: 2025
Released on J-STAGE: July 01, 2025

DOI: https://doi.org/10.11517/pjsai.JSAI2025.0_1E3GS1001

CONFERENCE PROCEEDINGS FREE ACCESS

In the manufacturing industry, an adjustment mechanism is sometimes provided in the design of a product with a large degree of individual difference. The term “individual difference” refers to the variation in the final performance of a product caused by the accumulation of intersecting parts and variations in the assembly process. By providing an adjustment mechanism, it is possible to absorb such individual differences by making adjustments during the final performance test. However, such adjustment processes depend on the skill of the operator, because the initial conditions vary for each individual. Especially when there are multiple adjustment parameters and final performance parameters, the adjustment process becomes very complicated and the adjustment patterns increase exponentially. In this study, we propose a modeling method that combines a regression algorithm, causal inference, mathematical optimization, and measured value feedback to represent individual differences for the purpose of optimizing the adjustment process. The measured value feedback is a process to compensate for the bias by computing the assignment of measured values to the regression equation. This method can learn efficiently from a small number of machine samples, and succeeds in creating a model that can predict the optimal point in the adjustment of a machine with an unknown individual difference.

Quantification of Traffic Accident Risk Based on Traffic Shockwave Propagation Using Marked Hawkes Process

Ryoma NAKAMURA, Masaki MATSUDAIRA, Daisuke OKUYA

Session ID: 1E3-GS-10-02
Published: 2025
Released on J-STAGE: July 01, 2025

DOI: https://doi.org/10.11517/pjsai.JSAI2025.0_1E3GS1002

CONFERENCE PROCEEDINGS FREE ACCESS

In order to contribute to the reduction of traffic accidents, we are researching a method to capture the signs of a traffic accident and quantify the risk of an accident based on these conditions. Based on the quantified risk of accidents, we believe that this method will be useful in preventing traffic accidents by alerting drivers. Previous research has shown that traffic shockwave propagation caused by multiple vehicles braking in chain reaction is highly associated with the occurrence of traffic accidents. In this paper, we propose a risk quantification method using a marked Hawkes process with the coefficient of variation of speed as a mark, focusing on the fact that the speed in that space changes violently in a time series when traffic shockwave propagation occurs, and report the results of applying the proposed method to actual traffic probe data. As a result of evaluation on real data, we confirmed that our method can appropriately quantify accident risk according to the occurrence of traffic shockwave propagation. We also confirmed that in some cases where accidents actually occurred, the risk increased before the occurrence of the accident by capturing traffic shockwave propagation.

Proposal for a Method to Determine Important Findings to Reduce Oversights in Medical Records

Satoshi KAMEGAI, Tatsuo KITAHASHI, Ryoichi TANAKA

Session ID: 1E3-GS-10-03
Published: 2025
Released on J-STAGE: July 01, 2025

DOI: https://doi.org/10.11517/pjsai.JSAI2025.0_1E3GS1003

CONFERENCE PROCEEDINGS FREE ACCESS

[in Japanese]

Music Generation Reflecting Valence and Arousal

Shuta TANABE, Kenichi FUKUI, Noriko OTANI, Masayuki NUMAO

Session ID: 1E3-GS-10-04
Published: 2025
Released on J-STAGE: July 01, 2025

DOI: https://doi.org/10.11517/pjsai.JSAI2025.0_1E3GS1004

CONFERENCE PROCEEDINGS FREE ACCESS

[in Japanese]

Consistency of Ordinal Annotations in Textual Embeddings

YO EHARA

Session ID: 1E3-GS-10-05
Published: 2025
Released on J-STAGE: July 01, 2025

DOI: https://doi.org/10.11517/pjsai.JSAI2025.0_1E3GS1005

CONFERENCE PROCEEDINGS FREE ACCESS

[in Japanese]

A Study of Extracting Hierarchical Knowledge Structures from Textbooks in High School Information Science “Information I”

Kazuya KIKUTANI, Munehiko SASAJIMA

Session ID: 1E4-OS-3a-01
Published: 2025
Released on J-STAGE: July 01, 2025

DOI: https://doi.org/10.11517/pjsai.JSAI2025.0_1E4OS3a01

CONFERENCE PROCEEDINGS FREE ACCESS

[in Japanese]

Completion of ambiguous expressions in cooking recipes using large language models

Akane UEDA, Kazushi OKAMOTO, Kei HARADA, Atsushi SHIBATA, Koki KARUBE

Session ID: 1E4-OS-3a-02
Published: 2025
Released on J-STAGE: July 01, 2025

DOI: https://doi.org/10.11517/pjsai.JSAI2025.0_1E4OS3a02

CONFERENCE PROCEEDINGS FREE ACCESS

User-generated recipes posted on recipe websites often contain expressions that can be interpreted in various ways by readers or that hinder the accurate reproduction of the dish. Such ambiguous expressions contribute to the difficulty of understanding recipes. The aim of this study is to clarify ambiguous expressions specific to cooking recipes and to provide clear alternatives that can be easily understood by readers. Specifically, we identified and categorized ambiguous expressions using a questionnaire, designed prompts for large language models based on the results, and proposed a method for complementing recipes using retrieval-augmented generation. Analysis of the questionnaire results reveals that ambiguous expressions include omissions and modifiers, among others. Furthermore, cooking experiments demonstrated that the readability of recipes complemented by the proposed method is improved, although the overall validity of the complemented recipes remains uncertain.

Zero-shot Automated Essay Scoring via Pairwise Comparisons with Large Language Models

Takumi SHIBATA, Yuichi MIYAMURA

Session ID: 1E4-OS-3a-03
Published: 2025
Released on J-STAGE: July 01, 2025

DOI: https://doi.org/10.11517/pjsai.JSAI2025.0_1E4OS3a03

CONFERENCE PROCEEDINGS FREE ACCESS

[in Japanese]

Using LLM for Profile Generation Towards Recommender System Based on Virtual Users

YASUFUMI TAKAMA, Hiroki SHIBATA

Session ID: 1E4-OS-3a-04
Published: 2025
Released on J-STAGE: July 01, 2025

DOI: https://doi.org/10.11517/pjsai.JSAI2025.0_1E4OS3a04

CONFERENCE PROCEEDINGS FREE ACCESS

This paper discusses the applicability of large language models (LLMs) for generating virtual user profiles. Recommender systems require users’ personal information, such as their tastes and interaction histories, which could raise privacy concerns. To realize recommendation without collecting users’ personal information, this paper proposes the concept of an explainable recommendation interface using virtual user profiles. By examining which users gave high or low ratings to target items, users can determine whether to accept or reject recommendations. This paper discusses the feasibility of this concept based on the profiles generated by an LLM and a pilot study with a prototype interface.

Generation and evaluation of product usage scenarios with large language models

Madoka HAGIRI, Kazushi OKAMOTO, Kei HARADA, Atsushi SHIBATA, Koki KARU ...

Session ID: 1E4-OS-3a-05
Published: 2025
Released on J-STAGE: July 01, 2025

DOI: https://doi.org/10.11517/pjsai.JSAI2025.0_1E4OS3a05

CONFERENCE PROCEEDINGS FREE ACCESS

Recommender systems are widely used in e-commerce to enhance user convenience. Complementary recommendation is a technology that suggests combinations of products intended to improve convenience when used together. However, complementary relationships can be ambiguous, making it difficult to provide a clear definition. Therefore, we aim to develop a complementary recommendation system based on product usage scenarios using a large language model (LLM). By incorporating product usage scenarios, it is expected that the complementary recommendations will be supported by clear evidence. In this study, we conducted an experiment in which we input only the names of product categories into an LLM (GPT-4o-mini), which then generated usage scenarios for these categories. The generated scenarios were subsequently evaluated manually. The experimental results confirm that approximately 85% of the generated scenarios were considered valid.

Creation of Purpose-Oriented Word Summaries for Text Analysis Using ChatGPT

Kota NAGAO, Wataru SUNAYAMA, Shun HATTORI

Session ID: 1E5-OS-3b-01
Published: 2025
Released on J-STAGE: July 01, 2025

DOI: https://doi.org/10.11517/pjsai.JSAI2025.0_1E5OS3b01

CONFERENCE PROCEEDINGS FREE ACCESS

[in Japanese]

Analysis of the relationship between number of likes and uttered words in video contents

Kaname TAKIOKA, Megumi YASUO, Junjie SHAN, YOKO NISHIHARA

Session ID: 1E5-OS-3b-02
Published: 2025
Released on J-STAGE: July 01, 2025

DOI: https://doi.org/10.11517/pjsai.JSAI2025.0_1E5OS3b02

CONFERENCE PROCEEDINGS FREE ACCESS

[in Japanese]

Analysis of Viewer Feedback on Video Streaming: The Relationship Between Chat Features and Peak Viewership

Shunsuke ITOH, Kuon TANAKA, Haruka MATSUKURA, Yuji NOZAKI, MAKI SAKAMO ...

Session ID: 1E5-OS-3b-03
Published: 2025
Released on J-STAGE: July 01, 2025

DOI: https://doi.org/10.11517/pjsai.JSAI2025.0_1E5OS3b03

CONFERENCE PROCEEDINGS FREE ACCESS

[in Japanese]

Extraction of Viewpoint-Oriented Major Opinions from Comment Sets Using ChatGPT

Atsuya KOMORI, Wataru SUNAYAMA, Shun HATTORI

Session ID: 1E5-OS-3b-04
Published: 2025
Released on J-STAGE: July 01, 2025

DOI: https://doi.org/10.11517/pjsai.JSAI2025.0_1E5OS3b04

CONFERENCE PROCEEDINGS FREE ACCESS

On websites where many people can post comments, such as video-sharing platforms, analyzing the diverse range of submitted opinions can potentially serve as a basis for decision-making. However, as the quantity and variety of comments increase, summarizing these opinions manually becomes impractical. Moreover, the appropriate method of summarization may differ depending on the specific decision at hand. Therefore, this study proposes a method that uses ChatGPT to extract major opinions from a set of comments for each perspective deemed necessary for the analysis. Through experiments, we verified that the proposed method can generate different summaries tailored to each perspective.

Self-Disclosure in VTuber's Free Talk Stream: Extraction and Analysis with Large Language Models

Kuon TANAKA, Shunsuke ITOH, Haruka MATSUKURA, Yuji NOZAKI, Maki SAKAMO ...

Session ID: 1E5-OS-3b-05
Published: 2025
Released on J-STAGE: July 01, 2025

DOI: https://doi.org/10.11517/pjsai.JSAI2025.0_1E5OS3b05

CONFERENCE PROCEEDINGS FREE ACCESS

In recent years, VTubers (Virtual YouTubers) have become more popular, with the number of VTubers exceeding 20,000. However, few methods exist for finding favorite VTubers, and understanding the appeal of a new VTuber takes time. This study aimed to facilitate the quick discovery of a VTuber's appeal by extracting self-disclosure from their chatting streams. First, we collected two chatting streams from each of 96 randomly selected VTubers. Next, based on previous research and qualitative analysis by LLMs, we developed 31 self-disclosure items, including "reflection on experience" and "current goal." We then used GPT-4o-mini to classify whether transcriptions of the chatting streams contained these self-disclosure items. We compared some of the results with human annotations for validation. As a result, over 80% of VTuber chatting streams contained self-disclosure. Additionally, while items related to "goal" and "VTuber activity" were extracted with high accuracy, items such as "interest" and "personality" had lower accuracy. In conclusion, LLMs have made it possible to analyze VTuber chatting streams.

Prospects for AI Technologies Required for a Human-Centred Future Society

Daisuke KAJI, Heishiro TOYODA, Sawa TAKAMUKU, Kiyosumi KIDONO, Hiroyuk ...

Session ID: 1F3-OS-40a-01
Published: 2025
Released on J-STAGE: July 01, 2025

DOI: https://doi.org/10.11517/pjsai.JSAI2025.0_1F3OS40a01

CONFERENCE PROCEEDINGS FREE ACCESS

In the society of the future, AI technology will be required not only to pursue convenience and efficiency but also to respond to changes in individual well-being and societal demands. In this paper, we propose three basic requirements that AI systems must satisfy in order to increase the happiness of diverse individuals and realize a rich, people-centered future society: "co-evolution between humans and AI", "continuous improvement activities" built by humans and AI to support co-evolution, and "AI governance/AI alignment" for these efforts to function properly. We then examine the requirements for each layer. Furthermore, we focus on "plausibility" as a keyword that plays an important role in each of these layers and improves the acceptability of these activities, and we discuss its meaning and role.

Dynamic Origin-Destination Distribution Estimation with Extended LDA Model

Hideki HAYASHI, Hirotaka KAJI, Kazushi IKEDA

Session ID: 1F3-OS-40a-02
Published: 2025
Released on J-STAGE: July 01, 2025

DOI: https://doi.org/10.11517/pjsai.JSAI2025.0_1F3OS40a02

CONFERENCE PROCEEDINGS FREE ACCESS

Today, accurate forecasting of urban transportation demand is becoming increasingly important. Although large volumes of mobility data are required for such forecasting, privacy considerations often mean that only aggregated statistics are available. If a generative model of the mobility data can be estimated from these statistics, it becomes possible to generate pseudo mobility data, which can be beneficial for transportation demand forecasting. In this study, we propose a method for estimating model parameters from statistical data under the assumption that mobility data are generated by Latent Dirichlet Allocation (LDA). The effectiveness of the proposed method was demonstrated through experiments using real data from a bike-sharing system.

Trajectory Prediction of Multi-pedestrians Interacting in Shared Spaces with Vehicles: A Particle-Based GNN Model

Wei MANMAN, Onishi MASAKI, Yin YINGJIE

Session ID: 1F3-OS-40a-03
Published: 2025
Released on J-STAGE: July 01, 2025

DOI: https://doi.org/10.11517/pjsai.JSAI2025.0_1F3OS40a03

CONFERENCE PROCEEDINGS FREE ACCESS

[in Japanese]

Tailoring AI Systems to Meet Requirements

Fuyuki ISHIKAWA

Session ID: 1F3-OS-40a-04
Published: 2025
Released on J-STAGE: July 01, 2025

DOI: https://doi.org/10.11517/pjsai.JSAI2025.0_1F3OS40a04

CONFERENCE PROCEEDINGS FREE ACCESS

In software engineering, principles and techniques have been discussed to analyze the requirements that each system must meet, or the risks it should avoid, and to carry out activities such as development, quality assurance, and operation based on those analyses. Similarly, in AI systems, it is essential not only to meet general requirements but also to address perspectives that are critical to the system's stakeholders through activities such as testing and performance tuning. This presentation will introduce efforts in testing and performance tuning for systems such as autonomous driving systems, focusing on an approach called Search-Based Software Engineering, which utilizes metaheuristic optimization.

Design of Serious Games for Understanding Resilient and Responsible AI

Hirotaka KAJI, Jun KURIBAYASHI, Hikaru OTANI, Arisa EMA

Session ID: 1F3-OS-40a-05
Published: 2025
Released on J-STAGE: July 01, 2025

DOI: https://doi.org/10.11517/pjsai.JSAI2025.0_1F3OS40a05

CONFERENCE PROCEEDINGS FREE ACCESS

As AI becomes more widespread, the importance of responsible AI and AI governance is increasing in response to various risks. AI developers and users need to acquire the knowledge to understand and practice them. In this study, we focus on serious games, which are a promising way to understand social issues, and we design a card game that allows players to experience the necessity of responsible AI and resilience. As members of an AI strategy team in a company or local government, players manage AI-related resources to improve operational efficiency. At the same time, they learn to balance performance and governance by dealing with incidents and crises that occur. We consider the similarities and differences between the corporate and local government versions in the design.

A Framework for Developing Reliable Machine Learning Systems Including Risk Analysis and its Application to an Automobile

Toshiya OKUBO, Jati Hiliamsyah HUSEN, Nobukazu YOSHIOKA, Naoyasu UBAYA ...

Session ID: 1F4-OS-40b-01
Published: 2025
Released on J-STAGE: July 01, 2025

DOI: https://doi.org/10.11517/pjsai.JSAI2025.0_1F4OS40b01

CONFERENCE PROCEEDINGS FREE ACCESS

[in Japanese]

