Journal of Advanced Computational Intelligence and Intelligent Informatics

Regular Papers

Pedestrian Re-Recognition Based on Spatiotemporal Transformer Skeleton Contrastive Learning and Feature Optimization

Yanru Jia, Yuanyuan Zhang, Yilun Gao

Article type: Research Paper
2025, Volume 29, Issue 6, Pages 1249-1261
Published: November 20, 2025
Released on J-STAGE: November 20, 2025

DOI: https://doi.org/10.20965/jaciii.2025.p1249

JOURNAL OPEN ACCESS

Person re-identification is an important task in computer vision, aimed at achieving cross-camera identity confirmation by identifying and matching the same pedestrian under different cameras. Traditional image-based methods are affected by factors such as lighting changes, occlusion, and changes in viewing angle, which makes the advantages of skeleton data increasingly apparent. Existing methods typically design skeleton descriptors from primitive body joints or learn skeleton sequence representations, but they often cannot simultaneously model the relationships between different body components, and they rarely model skeleton information in both the temporal and spatial dimensions. Therefore, in this paper, we propose a universal skeleton contrastive learning method based on the spatiotemporal Transformer (Space-time Transformer, StFormer). First, the method adopts the Space-time Attention (S-T Attention) mechanism and models the relationships among spatiotemporal features by stacking multiple S-T Attention blocks. Second, a Feature Refinement Box (FR Box) is proposed to improve the model's ability to extract important cues from the data features. Finally, we propose a unique prompt learning mechanism (P-Study), which uses the spatiotemporal context of graph nodes to prompt skeleton graph reconstruction and help capture more valuable patterns and graph semantics.

Enhanced Stance Detection for Arabic Tweets

Abeer Almasoudi, Muhammad Arif, Ahlam Hashem, Esraa Samkari

Article type: Research Paper
2025, Volume 29, Issue 6, Pages 1262-1272
Published: November 20, 2025
Released on J-STAGE: November 20, 2025

DOI: https://doi.org/10.20965/jaciii.2025.p1262

JOURNAL OPEN ACCESS

Social media platforms are becoming increasingly integrated into daily life, enabling individuals to express their beliefs and perspectives. Stance detection is an automated process for determining the viewpoint of a text on a particular topic; this is in high demand because of the increasing number of texts on social media. Most stance detection research has focused on the English language. Recent efforts have been made to generate datasets for stance detection in languages other than English. However, no comparable initiatives exist in Arabic. This study utilized the MAWQIF dataset and sequential multi-task learning (SMTL), which combines sarcasm detection and sentiment analysis tasks to enhance stance detection performance. In our SMTL, task dependency modeling is employed to establish a flow of information from the sarcasm task to the sentiment task, and then from these two tasks to the stance detection task, ensuring that the stance detection task benefits from the information derived from sarcasm and sentiment. Many experiments have been conducted to investigate the performance of multi-target classifiers in comparison to target-specific classifiers, as well as the impact of training order on the task. State-of-the-art performance is achieved by the multi-target SMTL model, which utilizes a hierarchical task weighting technique. This model was initially trained on the sarcasm task and then further trained on sentiment. The average F1 score on the testing dataset was 88.3%, which was better than the published results. Our study highlights the importance of multi-task learning in stance detection and investigates the relationship between sentiment, sarcasm, and stance.

An Improved Byte Pair Encoding Method for Tibetan

Kalzang Gyatso, Sonam Tshering, Tashi Norbu, Nyima Tashi, Tong Xiao, J ...

Article type: Research Paper
2025, Volume 29, Issue 6, Pages 1273-1282
Published: November 20, 2025
Released on J-STAGE: November 20, 2025

DOI: https://doi.org/10.20965/jaciii.2025.p1273

JOURNAL OPEN ACCESS

Byte pair encoding (BPE) plays a crucial role in natural language processing tasks by effectively reducing vocabulary redundancy and alleviating the out-of-vocabulary problem. However, when applied to Tibetan language tasks, the standard BPE method fails to fully exploit its advantages due to the unique characteristics of the Tibetan script. As a result, some subwords in the vocabulary violate standard Tibetan orthographic conventions, introducing noise into the model and degrading downstream task performance. To address this issue, this paper investigates the agglutinative nature of Tibetan words and proposes an improved BPE approach specifically designed for Tibetan. We apply the method to a Tibetan-Chinese machine translation system and evaluate its effectiveness through a series of experiments. The results demonstrate that the proposed method not only corrects malformed subwords and enhances translation quality, but also significantly reduces vocabulary size, laying a solid foundation for future research in Tibetan word representation and downstream natural language processing applications. Our method achieves consistent improvements in BLEU scores across most test sets, with gains exceeding 2 points in the best case.
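
For orientation, the standard BPE merge step that the abstract above takes as its starting point can be sketched as follows. This is a minimal illustration of plain BPE only, not the authors' Tibetan-specific procedure; the function names and the toy Latin-script vocabulary are hypothetical and chosen purely for readability.

```python
from collections import Counter


def most_frequent_pair(vocab):
    """Return the adjacent symbol pair with the highest total frequency.

    `vocab` maps a word, stored as a tuple of symbols, to its corpus count.
    """
    pairs = Counter()
    for symbols, freq in vocab.items():
        for left, right in zip(symbols, symbols[1:]):
            pairs[(left, right)] += freq
    return max(pairs, key=pairs.get) if pairs else None


def merge_pair(vocab, pair):
    """Rewrite every word so that each occurrence of `pair` becomes one symbol."""
    merged = {}
    for symbols, freq in vocab.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged


# Toy vocabulary (hypothetical): words split into characters, with counts.
toy_vocab = {("l", "o", "w"): 5, ("l", "o", "t"): 4, ("n", "e", "w"): 6}
best = most_frequent_pair(toy_vocab)
toy_vocab_after = merge_pair(toy_vocab, best)
print(best, toy_vocab_after)
```

Repeating these two steps a fixed number of times yields the merge table. Plain BPE applies no orthographic check to the merged symbols, which is the gap the abstract describes for Tibetan.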

A Method for Recognizing Entities in Power News Texts Based on Dependency Syntactic Parsing

Yun Wu, Xinru Liu, Yan Du, Jieming Yang, Zhenhong Liu, Kai Yang, Ziyi ...

Article type: Research Paper
2025, Volume 29, Issue 6, Pages 1283-1291
Published: November 20, 2025
Released on J-STAGE: November 20, 2025

DOI: https://doi.org/10.20965/jaciii.2025.p1283

JOURNAL OPEN ACCESS

News texts in the power field often contain numerous professional terms, and many new terms are generated every year; these are difficult to identify accurately using general named entity recognition methods. To address this challenge, this paper proposes an entity recognition model for power texts based on dependency syntactic analysis (SYN-BiLSTM-CRF). This model first generates power text word vectors and inputs them into a forward LSTM for feature extraction. Simultaneously, dependency syntactic parsing is performed on the power text, and the syntactic information vectors are fused with the output of the forward LSTM before being input into a backward LSTM. This enhances the model’s ability to learn inter-word dependency relations by incorporating additional syntactic features. Finally, CRF is employed to obtain the predicted NER labels. The experiments demonstrate that the proposed SYN-BiLSTM-CRF model achieves an F1-score of 85.36% on power-related texts, an improvement of 2.78 percentage points over the baseline BiLSTM-CRF model (82.58%). Additionally, it attains a recall of 89.06%, outperforming the BERT model’s recall (87.59%). These results indicate that the proposed method significantly enhances entity recognition accuracy in this specialized domain.

Color Visual Expression in Product Packaging Design Based on Feature Fusion Network

Zemei Liu

Article type: Research Paper
2025, Volume 29, Issue 6, Pages 1292-1304
Published: November 20, 2025
Released on J-STAGE: November 20, 2025

DOI: https://doi.org/10.20965/jaciii.2025.p1292

JOURNAL OPEN ACCESS

To improve the effectiveness of visual representation schemes for product packaging colors and enhance the competitiveness and attractiveness of products, this study proposed constructing a color intent dataset based on multiple fusion algorithms. On this basis, a product packaging color visual expression model based on a conditional deep convolution generative adversarial network was constructed. The empirical analysis of the model showed that its accuracy was 94.36% and its running time was 50.2 seconds, indicating better accuracy and computing speed than the comparative models. In addition, user satisfaction was rated, and the average satisfaction score of the model was 9.2, higher than those of the comparative models. The color scheme provided by this model better met user needs than those of the other models and had great potential for application, providing a certain theoretical basis for product packaging color design.

Vehicle Traffic Prediction and Analysis Using Hybrid Deep Learning Technique

Betty Paulraj, Shilpi Sharma, Narayan C. Debnath, Ramzi A. Haraty

Article type: Research Paper
2025, Volume 29, Issue 6, Pages 1305-1310
Published: November 20, 2025
Released on J-STAGE: November 20, 2025

DOI: https://doi.org/10.20965/jaciii.2025.p1305

JOURNAL OPEN ACCESS

The main objective of this study is to predict road traffic in unconditional situations in real time. Advances in machine learning techniques pave the way for predicting traffic well in advance. The system is trained entirely on a dataset of vehicle services with pre-scheduled timings. Such advance prediction improves the travel experience at large. Because the system has to operate on time-based data in an unconditional and unplanned environment, its effectiveness is evaluated using deep learning models. The results obtained after testing are presented, together with a comparative analysis of the effectiveness of each model in terms of accuracy and correctness.

Dual-Branch Residual Network for Enhanced Steel Plate Fault Detection

Hao Chen, Jiaxin Lu

Article type: Research Paper
2025, Volume 29, Issue 6, Pages 1311-1318
Published: November 20, 2025
Released on J-STAGE: November 20, 2025

DOI: https://doi.org/10.20965/jaciii.2025.p1311

JOURNAL OPEN ACCESS

Steel plate fault detection plays a crucial role in industrial manufacturing. However, the inherent complexity of steel plate fault data and the redundancy of certain features pose significant challenges for effective feature extraction. To address these challenges, we propose a dual-branch residual network model (DRNM), which utilizes a two-branch architecture. The first branch processes the original data through a convolutional neural network to capture local feature details, and the second branch leverages feature mapping to extract the spatial relationships within the data. To enhance feature extraction depth and model performance, residual networks are integrated into both branches, allowing for deeper network training and the capture of richer feature representations. The proposed dual feature extraction mechanism significantly improves the model’s representational power and fault-detection accuracy. Experimental results on a public dataset demonstrate that DRNM achieves state-of-the-art performance, with average recall and F1 score of 90.11% and 90.79%, respectively, substantially outperforming existing methods.

Linear Transformer Based U-Shaped Lightweight Segmentation Network

Hongli He, Changhao Sun, Zhaoyuan Wang, Yongping Dan

Article type: Research Paper
2025, Volume 29, Issue 6, Pages 1319-1328
Published: November 20, 2025
Released on J-STAGE: November 20, 2025

DOI: https://doi.org/10.20965/jaciii.2025.p1319

JOURNAL OPEN ACCESS

The widespread development and application of embedded medical devices necessitate the corresponding research in lightweight, energy-efficient models. Although transformer-based segmentation models have shown promise in various visual tasks, inherent challenges, including the lack of inductive bias and an overreliance on extensive training data, emerge when striving for optimal model efficiency. By contrast, convolutional neural networks (CNNs), with their intrinsic inductive biases and parameter-sharing mechanisms, enable a reduction in the number of parameters and a focus on capturing local features, thereby lowering computational costs. However, reliance solely on transformers does not meet the practical demands of lightweight model efficiency. Hence, the integration of CNNs with transformers presents a promising research trajectory for constructing efficient and lightweight networks. This hybrid approach leverages the strengths of CNNs in feature extraction and the ability of transformers to model global dependencies, achieving a balance between model performance and efficiency. In this paper, we propose MobileViTv2s, a novel lightweight segmentation network that integrates CNNs with a linear transformer. The proposed network efficiently extracts local features via CNNs, whereas transformers adeptly manage complex feature relationships, thereby facilitating precise segmentation in intricate contexts such as medical imaging. The model demonstrates significant potential and applicability in the advancement of lightweight deep learning models. Experimental results revealed that the proposed model achieved up to a 14.34-fold improvement in efficiency, a 9.91-fold reduction in the number of parameters, and comparable or superior segmentation accuracy, while achieving a markedly lower Hausdorff distance.

A Fast Depression Detection Method Based on AKRCC-KNN Model

Jing Kan, Wei Tong, Bichen Wu, Yongchun Ma, Kewei Chen

Article type: Research Paper
2025, Volume 29, Issue 6, Pages 1329-1341
Published: November 20, 2025
Released on J-STAGE: November 20, 2025

DOI: https://doi.org/10.20965/jaciii.2025.p1329

JOURNAL OPEN ACCESS

To enable fast detection of depression from EEG (electroencephalogram) signals, this study proposes the AKRCC-KNN model for automatic and accurate diagnosis. Starting from pre-processed multi-channel EEG signals, the approach focuses on feature extraction, in which the PLI (phase lag index) of the EEG signals is calculated as the feature. The feature selection algorithm (AKRCC) integrates the AKRC (altered Kendall’s rank correlation coefficient) method for feature re-arrangement with convergence determination for feature selection, in order to improve the accuracy of the selected features at limited computational expense. The detection process is as follows. First, the PLI of the EEG signals is computed to obtain their functional connectivity networks. The AKRCC algorithm is then applied to rank the PLI matrix elements by their discriminative power and to determine the optimal feature dimensionality by monitoring the convergence of classification accuracy. Finally, the selected multidimensional features are input into a KNN classifier for automatic classification. Extensive experiments on the MODMA dataset (24 patients with major depressive disorder, 29 healthy controls) demonstrate the model’s superior performance. With 1-second full-band EEG features, the AKRCC-KNN model achieves a state-of-the-art identification accuracy of 97.65% (specificity: 96.95%, sensitivity: 98.54%), surpassing existing methods. This indicates that the proposed model can achieve intelligent and rapid depression detection, providing an efficient, accurate, and diverse solution for clinical depression detection.
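
As an illustration of the feature named in this abstract, the phase lag index between two channels is commonly defined as the absolute value of the time-averaged sign of the sine of their instantaneous phase difference. The sketch below assumes the instantaneous phases have already been extracted (for example with a Hilbert transform, which is not shown). The function name is hypothetical, and this is not the authors' implementation.

```python
import math


def phase_lag_index(phase_a, phase_b):
    """Phase lag index of two equally long sequences of phases (radians).

    PLI = | mean over time of sign(sin(phase_a - phase_b)) |, in [0, 1].
    """
    if len(phase_a) != len(phase_b) or len(phase_a) == 0:
        raise ValueError("phase sequences must be non-empty and equally long")
    total = 0
    for a, b in zip(phase_a, phase_b):
        s = math.sin(a - b)
        total += (s > 0) - (s < 0)  # sign of s as -1, 0, or +1
    return abs(total) / len(phase_a)


# Synthetic phases (assumed data): channel B lags channel A by a constant 0.5 rad.
phases_a = [0.1 * k for k in range(100)]
phases_b_lagged = [p - 0.5 for p in phases_a]
# Channel C alternates between leading and lagging A by 0.5 rad.
phases_c_mixed = [p + (0.5 if k % 2 else -0.5) for k, p in enumerate(phases_a)]

pli_consistent = phase_lag_index(phases_a, phases_b_lagged)
pli_mixed = phase_lag_index(phases_a, phases_c_mixed)
print(pli_consistent, pli_mixed)
```

A consistent lead or lag pushes the index toward 1, while a phase difference that is symmetric around zero pushes it toward 0. Computing the index for every channel pair gives the PLI matrix whose elements the abstract describes ranking.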

Resource-Constrained and Time-Aware Reinforcement Learning Framework for Sustainable Fertilization Strategies

Muhammad Alkaff, Abdullah Basuhail, Yuslena Sari, Kamal Jambi

Article type: Research Paper
2025, Volume 29, Issue 6, Pages 1342-1357
Published: November 20, 2025
Released on J-STAGE: November 20, 2025

DOI: https://doi.org/10.20965/jaciii.2025.p1342

JOURNAL OPEN ACCESS

Achieving sustainable fertilization is critical for balancing crop productivity with environmental stewardship and resource efficiency. However, conventional fertilization methods rely on fixed schedules and generalized routines, resulting in inefficient nitrogen use and environmental risks owing to over- or under-application. Sustainable fertilization requires adaptive strategies that optimize resource use while preserving long-term soil health and productivity. Reinforcement learning (RL) offers a promising alternative by continuously adapting fertilization strategies based on real-time data, such as soil conditions, crop growth stages, and weather patterns. This study introduced the time-aware, idle-biased, Lagrangian-based, and resource-constrained approach with proximal policy optimization (TILARC-PPO), a novel RL framework designed to adaptively optimize fertilization. TILARC-PPO integrates (1) idle-biased action selection to prevent unnecessary fertilization, (2) time-awareness to optimize decision timing, and (3) Lagrangian-based resource constraints to dynamically regulate nitrogen applications. Experimental results show that TILARC-PPO maintains a comparable grain yield with only a slight reduction of 7.93%, while reducing nitrogen consumption by 32% when compared to expert fertilization. Additionally, it achieved the highest nitrogen use efficiency (30.8 kg grain per kg N), surpassing both the expert-based and vanilla proximal policy optimization (PPO) approaches. TILARC-PPO further improved training stability and policy convergence by learning effective fertilization strategies within 300,000 timesteps. These findings highlight TILARC-PPO as a scalable, intelligent solution for sustainable precision agriculture, aligned with global efforts to enhance resource efficiency, maintain soil health, and promote sustainable food production.

Finite Element Simulation of Hydraulic System Based on TFY-YH Parking Anti-Slip Device

Qunyan Xing, Yuchen Shi, Yongle Ju

Article type: Research Paper
2025, Volume 29, Issue 6, Pages 1358-1368
Published: November 20, 2025
Released on J-STAGE: November 20, 2025

DOI: https://doi.org/10.20965/jaciii.2025.p1358

JOURNAL OPEN ACCESS

Anti-slip devices are essential for ensuring railway parking safety in various settings, including arrival and departure lines, intermediate stations, and locomotive depot access routes. With the advancement of railway technology, anti-slip systems have progressed significantly in automation and intelligence and have become a critical component of parking safety measures. The Hydraulic Anti-Slip Device for Railway Arrival-Departure Tracks by Signal & Communication Research Institute (TFY-YH) is hydraulically driven, which enables higher braking than other types of anti-slip systems. This study develops a finite element simulation model of the TFY-YH anti-slip system to systematically analyze the displacement, arrival time, and speed of the piston rod, as well as the variations in the accumulator oil pressure during the braking and releasing processes. The pressure-holding performance of the system is evaluated, and methods for improvement are proposed. Furthermore, the relationship between the stopping position of the train and the braking/releasing time is established, and a method for determining the stopping position is introduced. These observations provide a theoretical foundation for the intelligent control and monitoring of anti-slip systems, as well as insights for enhancing railway safety and operational efficiency.

Research on the Knowledge Representation Method of Power News Text Based on Time Hyperplane

Yun Wu, Ziyi Wang, Yan Du, Jieming Yang, Xinru Liu, Zhenhong Liu, Kai ...

Article type: Research Paper
2025, Volume 29, Issue 6, Pages 1369-1376
Published: November 20, 2025
Released on J-STAGE: November 20, 2025

DOI: https://doi.org/10.20965/jaciii.2025.p1369

JOURNAL OPEN ACCESS

Static knowledge graphs cannot evolve over time, which leads to conflicts between entities and relations during knowledge representation. To address this problem, this paper combines the temporal hyperplane with the translation model in knowledge representation and proposes a knowledge representation method based on the temporal hyperplane for power news texts. First, multiple temporal hyperplanes are established and the temporal factor is added to the scoring function of the translation model; then, the entities and relations of the power news are projected onto the temporal hyperplanes, and the optimal knowledge representation is determined according to the loss function. Taking power news text as an example, the algorithm effectively resolves the time-related conflicts in the text, and the comprehensive indexes are significantly improved on the time-related triplets.
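
One common way to formalize this kind of projection is a hyperplane-based translation model in the style of HyTE. The abstract does not give the authors' exact scoring function, so the notation below is an illustrative assumption: each timestamp $\tau$ is associated with a unit normal vector $\mathbf{w}_\tau$, and $\mathbf{h}$, $\mathbf{r}$, $\mathbf{t}$ are the embeddings of the head entity, relation, and tail entity.

```latex
% Projection of an embedding e onto the hyperplane of timestamp \tau
P_\tau(\mathbf{e}) = \mathbf{e} - \left(\mathbf{w}_\tau^{\top}\mathbf{e}\right)\mathbf{w}_\tau,
\qquad \lVert \mathbf{w}_\tau \rVert_2 = 1

% Translation-style score of a time-stamped triple (h, r, t, \tau); lower is more plausible
f_\tau(h, r, t) = \left\lVert P_\tau(\mathbf{h}) + P_\tau(\mathbf{r}) - P_\tau(\mathbf{t}) \right\rVert
```

Under this formulation the same entity pair can satisfy different relations at different times, because each triple is scored only on the hyperplane of its own timestamp.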

Research on the Application of Intelligent Object Recognition System in Classroom Attendance and Student Behavior Analysis in Universities

Lin Yang, Gai Hang, Xuehui Zhang

Article type: Research Paper
2025, Volume 29, Issue 6, Pages 1377-1389
Published: November 20, 2025
Released on J-STAGE: November 20, 2025

DOI: https://doi.org/10.20965/jaciii.2025.p1377

JOURNAL OPEN ACCESS

Analyzing student behavior in the classroom is extremely important for understanding the overall learning status of students, evaluating classroom attendance in universities, and promoting the high-quality development of higher education. Existing research on student behavior recognition primarily focuses on identifying individual students, with insufficient attention given to their interactions with surrounding objects. To more accurately detect the required targets within a classroom, this paper proposes a multi-target detection method based on an improved YOLOv5s model. First, to address the issue that small-scale targets such as mobile phones and pens in the classroom scene have limited extractable features, the network structure was optimized. Second, because irrelevant information such as classroom backgrounds and varying student attire in real classroom environments makes it difficult for the network to extract effective features, the triplet attention mechanism was introduced to enhance the network’s feature extraction capability. Finally, experiments were conducted on both a self-constructed dataset and a public dataset. The experimental results show that the mAP values of the improved network increased by 4.5 and 3.2 percentage points, respectively, verifying the effectiveness of the improvements.

Anomaly Detection in Borehole Strain Data with CNN and Frequency-Aware VAE

Xiaolong Wei, Qingjie Liu, Zhian Pan

Article type: Research Paper
2025, Volume 29, Issue 6, Pages 1390-1401
Published: November 20, 2025
Released on J-STAGE: November 20, 2025

DOI: https://doi.org/10.20965/jaciii.2025.p1390

JOURNAL OPEN ACCESS

Earthquakes pose significant threats to human life and property, often triggering secondary disasters such as landslides, mudslides, and collapses. Although earthquake prediction remains a challenging global scientific problem, the study of seismic precursors plays a critical role. Earthquakes occur due to the instability and fracturing of underground rock layers when concentrated stress exceeds their strength limit. Borehole strainmeters provide direct observations of crustal strain, making pre-earthquake strain anomaly detection essential for precursor studies. In recent decades, variational autoencoders (VAEs) have been widely adopted for anomaly detection due to their powerful denoising capabilities. However, traditional VAE-based methods face difficulties in capturing both long-term heterogeneous patterns and fine-grained short-term trends simultaneously. To address this, we propose a new approach combining convolutional neural networks (CNN) with frequency-aware conditional VAE frameworks. The CNN extracts spatial dependencies among strain observation components, while frequency analysis improves temporal feature capture. By incorporating a target attention mechanism, our model selects the most relevant frequency-domain information to enhance reconstruction of both long-term and short-term trends. Experimental results on borehole strain data show that our model outperforms state-of-the-art methods. These findings confirm the practical value of our approach in overcoming current VAE-based detection limitations and emphasize the importance of integrating spatial and frequency representations in seismic precursor studies.

Posture Estimation and Obstacle Detection by Embedding Distance-Measuring Sensors in a Spherical Mobile Robot

Ryota Nakagawa, Yuki Ueno

Article type: Research Paper
2025, Volume 29, Issue 6, Pages 1402-1409
Published: November 20, 2025
Released on J-STAGE: November 20, 2025

DOI: https://doi.org/10.20965/jaciii.2025.p1402

JOURNAL OPEN ACCESS

In this study, we developed a method for designing spherical mobile robots that can detect obstacles and estimate their posture using laser-ranging sensors embedded in a spherical shell. A mobile robot used in commercial facilities must be safe for humans and must also be able to detect and avoid obstacles. Spherical mobile robots are considered suitable for purposes such as operating near humans. However, the installation of external measurement sensors in spherical mobile robots can reduce their mobility. In this study, we developed a novel installation method for embedding external laser-ranging measurement sensors in a spherical shell. This method installs the sensors without compromising the robot’s capabilities, such as its mobility. In addition, we proposed a posture estimation method that uses only the embedded laser-ranging sensors. Moreover, we proposed a method for classifying point-cloud data into floors or obstacles. The validity of these methods was verified by simulations, which demonstrated that the methods could detect obstacles and estimate the robot’s posture, even in the presence of sensor noise.

Type-2 Fuzzy Robust Regression with Two-Step Construction

Yoshiyuki Yabuuchi

Article type: Research Paper
2025, Volume 29, Issue 6, Pages 1410-1416
Published: November 20, 2025
Released on J-STAGE: November 20, 2025

DOI: https://doi.org/10.20965/jaciii.2025.p1410

JOURNAL OPEN ACCESS

Regression models that describe the relationship between independent and dependent variables are widely used owing to their simple structure, ease of handling, and ease of interpretation. One such model is the interval fuzzy regression model that uses fuzzy sets. This model represents the possibility distribution of an analysis target in terms of interval predictions. Generally, the vagueness of a dependent variable is represented by the intervals of type-1 fuzzy sets. However, these observations contain errors, and the interval predictions are considered vague. Therefore, research has been conducted on fuzzy regression using type-2 fuzzy sets. A type-2 fuzzy regression model has been proposed to illustrate possibility distribution of an analysis target through possibilistic and necessity regressions. To investigate reliable and robust fuzzy regression models, this study constructs a type-2 fuzzy robust regression model for possibilistic regression, which illustrates the possibility distribution of the analyte, and a fuzzy robust regression model, which illustrates the robust possibility of the analyte. Numerical examples are used to confirm the characteristics of the proposed model and identify future research directions.

Interactive Image Caption Generation Reflecting User Intent from Trace Using a Diffusion Language Model

Satoko Hirano, Ichiro Kobayashi

Article type: Research Paper
2025, Volume 29, Issue 6, Pages 1417-1426
Published: November 20, 2025
Released on J-STAGE: November 20, 2025

DOI: https://doi.org/10.20965/jaciii.2025.p1417

JOURNAL OPEN ACCESS

This study proposes an image captioning method designed to incorporate user-specific explanatory intentions into the generated text, as signaled by the user’s trace on the image. We extract areas of interest from dense sections of the trace, determine the order of explanations by tracking changes in the pen-tip coordinates, and assess the degree of interest in each area by analyzing the time spent on them. Additionally, a diffusion language model is utilized to generate sentences in a non-autoregressive manner, allowing control over sentence length based on the temporal data of the trace. In the actual caption generation task, the proposed method achieved higher string similarity than conventional methods, including autoregressive models, and successfully captured user intent from the trace and faithfully reflected it in the generated text.

Generating Natural Language Sentences Explaining Trends and Relationships of Two Time-Series Data

Yukako Nakano, Ichiro Kobayashi

Article type: Research Paper
2025, Volume 29, Issue 6, Pages 1427-1442
Published: November 20, 2025
Released on J-STAGE: November 20, 2025

DOI: https://doi.org/10.20965/jaciii.2025.p1427

JOURNAL OPEN ACCESS

We propose a method for generating natural language explanations that describe trends and relationships between two time series. To address this task, it is essential to analyze the dynamic behavior of both time series and generate textual explanations based on the analytical outcomes. We developed a model that extended the vanilla Transformer architecture to better capture the temporal features relevant to explanation generation. To train the model, we constructed a synthetic, domain-agnostic dataset that simulated time-series patterns and interactions. We conducted two experiments to evaluate the effectiveness of the proposed approach using the synthesized datasets. The first experiment focused on generating explanations for the time-series trends. The results demonstrated that our model could generate accurate and coherent explanations. The second experiment addressed more complex scenarios in which the model was required to answer questions regarding the relationship between two interacting time series. Although the model initially struggled to achieve high accuracy in this task, we observed that step-by-step training significantly improved its performance. These findings highlight both the potential and current limitations of Transformer-based approaches for interpretable time-series analysis.

Mobile-YOLO: A Lightweight YOLO for Road Crack Detection on Mobile Devices

Anjun Yu, Yixiang Gao, Yonghua Xiong, Wei Liu, Jinhua She

Article type: Research Paper
2025, Volume 29, Issue 6, Pages 1443-1453
Published: November 20, 2025
Released on J-STAGE: November 20, 2025

DOI: https://doi.org/10.20965/jaciii.2025.p1443

JOURNAL OPEN ACCESS

Road crack detection is critical for ensuring road traffic safety, extending the service life of roads, and improving the efficiency of road maintenance management. However, the traditional YOLOv8 model, when applied on mobile devices, faces challenges, such as high network complexity, significant computational resource demands, and slow inference speeds owing to limited computational resources. To address these issues, this paper proposes a model tailored for mobile terminals—Mobile-YOLO. By incorporating the universal inverted bottleneck module and the multi-query attention mechanism, the model significantly reduces network complexity while enhancing computational efficiency for mobile deployment, making it well suited for real-time detection requirements in embedded systems and vehicle-mounted patrol devices. Experimental results showed that Mobile-YOLO improves detection accuracy by 4.1%, mAP50 by 2.76%, and mAP50-95 by 2.56% compared with the baseline YOLOv8, achieving an inference speed of 113 fps, outperforming other lightweight models. Experiments on the NVIDIA Jetson Nano platform further validated its excellent inference performance and low false positive rate, providing an efficient solution for real-world road crack detection in resource-constrained environments.

An Empirical Study of Consumer Purchase Behavior in Live Commerce

Lu Jiang, Yukio Kodono

Article type: Research Paper
2025 Volume 29 Issue 6 Pages 1454-1463
Published: November 20, 2025
Released on J-STAGE: November 20, 2025

DOI: https://doi.org/10.20965/jaciii.2025.p1454

JOURNAL OPEN ACCESS

In recent years, live commerce has rapidly developed into a major force in China’s digital economy, offering consumers real-time interaction and a convenient shopping experience. Despite its popularity and economic impact, concerns over product quality, streamer professionalism, and platform governance highlight the need for a deeper understanding of what drives consumer behavior in this context. This study examines how influencer-led live commerce affects consumer purchase intentions through the stimulus-organism-response model. Specifically, it explores the influence of content characteristics, streamer attributes, product features, and platform-related factors. Using a sample of 212 respondents, this research employs seemingly unrelated regression analysis to test the hypotheses. The results indicate that the entertainment value of live commerce content, streamer interactivity, and streamer fame significantly enhance perceived utilitarian value, perceived trust, and hedonic value. In turn, perceived trust and hedonic value strongly drive consumer purchasing behavior, while perceived utilitarian value has limited impact. Interestingly, factors such as product quality, product features, platform regulation, and after-sales service exhibit weaker or inconsistent effects. These findings offer practical insights for live commerce platforms, streamers, and marketers aiming to optimize engagement and enhance consumer purchasing intentions.

Design Adaptive Non-Linear PID Control Using Reinforcement Learning for Optimal Autonomous Greenhouse Microclimate Regulation

Hayder M. Abbood, Seyed Hamed Seyed Alagheband, Amer Matrood Imran, Sa ...

Article type: Research Paper
2025 Volume 29 Issue 6 Pages 1464-1483
Published: November 20, 2025
Released on J-STAGE: November 20, 2025

DOI: https://doi.org/10.20965/jaciii.2025.p1464

JOURNAL OPEN ACCESS

A greenhouse (GH) system is a multi-input/multi-output (MIMO), dynamic, and energy-intensive environment that requires precise control to achieve optimal plant growth while minimizing energy consumption. The energy consumed by a GH system has indirect effects on overall profitability. Determining optimal setpoints for a GH environment is challenging for traditional proportional–integral–derivative (PID) controllers, particularly when the energy consumption of a MIMO system must also be reduced. A hybrid approach combining reinforcement learning (RL) with a radial basis function neural network (RBFNN), called neuro-tuner optimization (NTO), is proposed to control the GH climate and maximize energy efficiency. Herein, RL was implemented using the popular Q-learning algorithm and exhibited high performance, with a root mean square error of 0.013 in the testing phase and a correlation coefficient of 1. To validate the effectiveness of the proposed NTO system, it was compared with another optimal control strategy. The proposed NTO system enhanced energy efficiency by 19.7% (average), whereas the optimal control strategy improved energy efficiency by 3.6% (average). These results demonstrate the ability of the proposed NTO system to handle non-linear dynamic systems and enhance their overall performance. Thus, the proposed NTO system met the study objectives by improving the PID performance of a dynamic system while maximizing its energy efficiency.
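
The Q-learning component named in this abstract can be illustrated with the standard tabular update rule. The following is a generic sketch, not the authors' NTO implementation (which also involves an RBFNN): the state and action labels, the reward, and the values of `alpha` and `gamma` are hypothetical.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning update and return the new Q(state, action)."""
    # Greedy estimate of the value of the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    # Move the current estimate a fraction alpha toward the TD target.
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical greenhouse-style labels, for illustration only.
actions = ["heat_on", "heat_off"]
q_table = defaultdict(float)  # unseen (state, action) pairs start at 0.0
first = q_update(q_table, "cold", "heat_on", reward=1.0, next_state="ok", actions=actions)
second = q_update(q_table, "cold", "heat_on", reward=1.0, next_state="ok", actions=actions)
```

Repeating the same rewarded transition moves the estimate toward the reward in steps that shrink with each update, which is the behavior the learning rate `alpha` controls.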

Multiscale Attention-Based Model for Image Enhancement and Classification

Mingyu Guo, Tomohito Takubo

Article type: Research Paper
2025 Volume 29 Issue 6 Pages 1484-1499
Published: November 20, 2025
Released on J-STAGE: November 20, 2025

DOI: https://doi.org/10.20965/jaciii.2025.p1484

JOURNAL OPEN ACCESS

Fine-grained image classification plays a crucial role in various applications, such as agricultural disease detection, medical diagnosis, and industrial inspection. However, achieving high classification accuracy while maintaining computational efficiency remains a significant challenge. To address this issue, this study developed enhanced DetailNet (EDNET), a convolutional neural network (CNN) model designed to balance fine-detail preservation and global context understanding. EDNET integrates multiscale attention mechanisms and self-attention modules, enabling it to capture both local and global information simultaneously. Extensive ablation studies were conducted to evaluate the contribution of each module, and EDNET was compared with the mainstream benchmark models ResNet50, EfficientNet, and vision transformers. The results demonstrate that EDNET achieves highly competitive performance in terms of accuracy, F1-score, and area under the receiver operating characteristic curve, while maintaining an optimal balance between parameter count and inference efficiency. In addition, EDNET was tested in both a high-performance graphics processing unit environment (NVIDIA RTX 3090) and a resource-constrained environment (Jetson Nano simulation). The results confirm that EDNET is deployable on edge devices, achieving an inference efficiency comparable to that of EfficientNet, while outperforming traditional CNN models in fine-grained classification tasks.

Intelligent Prediction of Uniaxial Compressive Strength Based on Multi-Source Information Fusion

Quanxin Li, Hongbo Dong, Youzhen Zhang, Jun Fang, Wangnian Li

Article type: Research Paper
2025 Volume 29 Issue 6 Pages 1500-1506
Published: November 20, 2025
Released on J-STAGE: November 20, 2025

DOI: https://doi.org/10.20965/jaciii.2025.p1500

JOURNAL OPEN ACCESS

Uniaxial compressive strength (UCS) is a fundamental indicator of formation hardness and plays a vital role in evaluating geomechanical properties during the drilling process. Accurate UCS prediction enables real-time assessment of formation conditions, contributing to improved drilling safety and efficiency. This study proposes a multi-source data fusion approach that integrates vibration data with conventional drilling parameters to enhance UCS prediction accuracy. To address the inconsistency in time scales between the two data sources, a piecewise cubic Hermite interpolation method is applied for temporal alignment. The fused dataset is then used to retrain an extreme learning machine model. Experimental validation is conducted using data collected from a surface drilling test site. The results demonstrate that the proposed method significantly outperforms single-source prediction models, highlighting the effectiveness of vibration-assisted data fusion in real-time UCS estimation.
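
The temporal alignment step can be sketched as resampling a slowly sampled signal onto the timestamps of a faster one with a piecewise cubic Hermite interpolant. The sketch below assumes the monotone (PCHIP-style) harmonic-mean rule for interior slopes and simple one-sided slopes at the ends; the abstract does not state which slope rule the authors use, and the function names and sample values are hypothetical.

```python
from bisect import bisect_right


def pchip_slopes(x, y):
    """Estimate node derivatives with the weighted harmonic-mean rule."""
    n = len(x)
    h = [x[i + 1] - x[i] for i in range(n - 1)]
    delta = [(y[i + 1] - y[i]) / h[i] for i in range(n - 1)]
    d = [0.0] * n
    d[0], d[-1] = delta[0], delta[-1]  # one-sided end slopes (an assumption)
    for i in range(1, n - 1):
        # Slope stays 0 at local extrema; otherwise use the harmonic mean.
        if delta[i - 1] * delta[i] > 0:
            w1 = 2 * h[i] + h[i - 1]
            w2 = h[i] + 2 * h[i - 1]
            d[i] = (w1 + w2) / (w1 / delta[i - 1] + w2 / delta[i])
    return d


def hermite_resample(x, y, targets):
    """Evaluate the cubic Hermite interpolant of (x, y) at each target time."""
    d = pchip_slopes(x, y)
    out = []
    for t in targets:
        # Locate the interval [x[k], x[k+1]] containing t (clamped at the ends).
        k = min(max(bisect_right(x, t) - 1, 0), len(x) - 2)
        h = x[k + 1] - x[k]
        s = (t - x[k]) / h
        h00 = 2 * s**3 - 3 * s**2 + 1
        h10 = s**3 - 2 * s**2 + s
        h01 = -2 * s**3 + 3 * s**2
        h11 = s**3 - s**2
        out.append(h00 * y[k] + h10 * h * d[k] + h01 * y[k + 1] + h11 * h * d[k + 1])
    return out


# Hypothetical data: a drilling parameter logged once per second,
# resampled onto the 0.5 s grid of a faster vibration channel.
slow_t = [0.0, 1.0, 2.0, 3.0]
slow_v = [0.0, 1.0, 4.0, 9.0]
fast_t = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
aligned = hermite_resample(slow_t, slow_v, fast_t)
```

The interpolant passes through every original sample, and with this slope rule it does not overshoot between samples of monotone data, which is one reason such schemes are attractive for sensor alignment.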

Multi-Task Prediction Method for User Behavior Utilizing Transformers

Ke Li, Huan Fang, Chifeng Shao, Yifei Xu

Article type: Research Paper
2025 Volume 29 Issue 6 Pages 1507-1516
Published: November 20, 2025
Released on J-STAGE: November 20, 2025

DOI: https://doi.org/10.20965/jaciii.2025.p1507

JOURNAL OPEN ACCESS

Pattern recognition of user behavior plays an important role in extracting user portraits, and the extraction of critical event sequences is also valuable. To address these two issues, this paper investigates a Transformer-based multi-task approach to user behavior prediction, named the LogSeqTrans model, which aims to enhance the accuracy of predicting user actions and to extract critical event sequences. After user behavior data are serialized and information entropy is employed to identify key events, the proposed LogSeqTrans model processes the data through an embedding layer, an encoding layer, and an output layer. The embedding layer converts events and their temporal information into high-dimensional vectors. The encoding layer leverages a multi-head self-attention mechanism to capture sequence dependencies, while the output layer simultaneously predicts behavior types, event occurrence times, and remaining durations. Experimental results demonstrate that the proposed model surpasses other models across three open datasets. Specifically, the average accuracy of the LogSeqTrans model on the next-activity prediction task is significantly higher than that of the alternative models. Similarly, in the tasks of predicting the next activity occurrence time and the remaining time, the mean absolute errors of the LogSeqTrans model are lower than those of the comparative models. These results indicate that LogSeqTrans is highly effective in multi-task prediction and in capturing complex sequence patterns.
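
One way to picture entropy-based key-event identification is to measure, for each event type, the Shannon entropy of the events that follow it. This scoring rule is an assumption made for illustration; the abstract does not specify how LogSeqTrans applies information entropy, and the event names and traces below are hypothetical.

```python
import math
from collections import Counter, defaultdict


def shannon_entropy(counts):
    """Shannon entropy (in bits) of a Counter of observed outcomes."""
    total = sum(counts.values())
    return sum(-(c / total) * math.log2(c / total) for c in counts.values() if c)


def successor_entropy(traces):
    """Map each event to the entropy of the distribution of its next event."""
    followers = defaultdict(Counter)
    for trace in traces:
        for current, nxt in zip(trace, trace[1:]):
            followers[current][nxt] += 1
    return {event: shannon_entropy(c) for event, c in followers.items()}


# Hypothetical serialized user-behavior logs.
traces = [
    ["login", "browse", "buy"],
    ["login", "browse", "logout"],
    ["login", "search", "buy"],
    ["login", "search", "buy"],
]
scores = successor_entropy(traces)
```

Under this rule, events whose continuation is uncertain (here `login` and `browse`) score higher than events that are always followed by the same action (here `search`).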

A PID Control System for Lower-Limb Rehabilitation Robot with a Function for Pedal Torque Estimation

Yue Jing, Zewen Wang, Qiwei Wu, Jinhua She, Seiichi Kawata

Article type: Research Paper
2025 Volume 29 Issue 6 Pages 1517-1529
Published: November 20, 2025
Released on J-STAGE: November 20, 2025

DOI: https://doi.org/10.20965/jaciii.2025.p1517

JOURNAL OPEN ACCESS

This article presents a proportional–integral–derivative (PID) control system for a lower-limb rehabilitation robot that not only features satisfactory control performance for the pedal angle but also provides a function for pedal torque estimation. Nonlinear state feedback simplifies the stability analysis and control system design. The stability condition of the closed-loop system is derived based on a Lyapunov function. The PID controller ensures that the pedal angle tracks the reference trajectory. The equivalent input disturbance (EID) method in the control system was compared with the disturbance observer (DOB) and extended state observer (ESO) methods in terms of pedal torque estimation performance. The simulation results indicated that the EID method achieved a root mean square error of 0.37 N·m, an improvement of 47.6% and 51.8% over the DOB and ESO methods, respectively.
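
The reference-tracking role of the PID controller can be sketched with a discrete-time loop. The plant below is a hypothetical first-order system chosen only to keep the example short; it is not the robot model from the paper, and the gains, time step, and function name are likewise assumptions. The EID-based torque estimation is not modeled here.

```python
def simulate_pid(kp, ki, kd, reference=1.0, dt=0.01, steps=1000):
    """Track a constant reference with a discrete PID on the plant x' = -x + u."""
    x = 0.0                      # plant output (stand-in for the pedal angle)
    integral = 0.0
    prev_error = reference - x   # avoids a derivative kick on the first step
    for _ in range(steps):
        error = reference - x
        integral += error * dt
        derivative = (error - prev_error) / dt
        u = kp * error + ki * integral + kd * derivative
        x += dt * (-x + u)       # forward-Euler step of the hypothetical plant
        prev_error = error
    return x


final_angle = simulate_pid(kp=5.0, ki=10.0, kd=0.1)
```

The integral term is what removes the steady-state error for a constant reference; with the proportional term alone, this plant would settle below the target.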

Does Robot Clothing Really Help? User Preferences and Effects in Simulated Domestic Scenarios

Kazunari Yoshiwara, Kazuki Kobayashi

Article type: Research Paper
2025 Volume 29 Issue 6 Pages 1530-1540
Published: November 20, 2025
Released on J-STAGE: November 20, 2025

DOI: https://doi.org/10.20965/jaciii.2025.p1530

JOURNAL OPEN ACCESS

This study investigated the impact of clothing on robot appearance, particularly in scenarios where a single robot performs multiple tasks. Clothing conveys an individual’s role and capabilities to others. Applying this effect to robot appearance design can enable an individual robot to express roles and capabilities suitable for multiple tasks, which makes it a potentially effective design approach. Our experiments first investigated the user acceptance of robots wearing clothing. Subsequently, we investigated the impact of robot attire on user behavior and impressions in a shared workspace. Our results indicate that users prefer robots to wear clothing only during cooking. In addition, in scenarios wherein robots and users share a workspace while performing different tasks, robot clothing is associated with negative user impressions. These observations indicate that even when users express a preference for clothed robots, the actual effect may not be positive and can vary depending on the task and context of use. Therefore, the decision to clothe a robot requires cautious consideration.

Inclusion–Exclusion Integral Neural Networks: A Framework for Explainable AI with Non-Additive Measures

Yoshihiro Fukushima, Katsushige Fujimoto, Simon James, Aoi Honda

Article type: Development Report
2025 Volume 29 Issue 6 Pages 1541-1551
Published: November 20, 2025
Released on J-STAGE: November 20, 2025

DOI: https://doi.org/10.20965/jaciii.2025.p1541

JOURNAL OPEN ACCESS

This study introduces the inclusion–exclusion integral neural network (IEINN) and its accompanying open-source Python library, a novel framework designed to enhance the interpretability of neural networks by leveraging non-additive monotone measures and polynomial operations. The proposed architecture integrates the inclusion–exclusion integral into the network structure, enabling direct extraction of structured information from the learned parameters. We develop a Python-based IEINN library, implemented using PyTorch, to facilitate efficient model training and integration. The library includes several preprocessing methods for parameter initialization, such as normalization based on minimum and maximum values, percentiles, and standard deviations, which enhance training stability and convergence. Additionally, the framework supports various computational operations, including t-norms and t-conorms, allowing flexible modeling of interactions among input variables. The proposed framework is publicly available as an open-source library on GitHub (AoiHonda-lab/IEI-NeuralNetwork), facilitating further research and practical applications in explainable AI.
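
The t-norms, t-conorms, and min-max normalization mentioned in this abstract are standard constructions and can be sketched in plain Python. The function names below are hypothetical and are not the API of the IEINN library; the sketch only shows the mathematical operations on values in [0, 1].

```python
def t_min(a, b):
    """Minimum t-norm."""
    return min(a, b)


def t_product(a, b):
    """Product (algebraic) t-norm."""
    return a * b


def t_lukasiewicz(a, b):
    """Lukasiewicz t-norm."""
    return max(0.0, a + b - 1.0)


def dual_conorm(t_norm):
    """Build the t-conorm dual to a t-norm: S(a, b) = 1 - T(1 - a, 1 - b)."""
    return lambda a, b: 1.0 - t_norm(1.0 - a, 1.0 - b)


def min_max_normalize(values):
    """Rescale values to [0, 1]; assumes max(values) > min(values)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


s_max = dual_conorm(t_min)                  # maximum t-conorm
s_prob = dual_conorm(t_product)             # probabilistic sum
s_bounded = dual_conorm(t_lukasiewicz)      # bounded sum
scaled = min_max_normalize([2.0, 4.0, 6.0])
```

Normalizing inputs to [0, 1] first matters because t-norms and t-conorms are only defined on that interval.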

On User’s Reception of Local Explanation: An Argumentation Analysis

Nguyen Duy Hung, Thanaruk Theeramunkong, Van-Nam Huynh

Article type: Research Paper
2025 Volume 29 Issue 6 Pages 1552-1564
Published: November 20, 2025
Released on J-STAGE: November 20, 2025

DOI: https://doi.org/10.20965/jaciii.2025.p1552

JOURNAL OPEN ACCESS

A local explanation (LE) method in explainable artificial intelligence (XAI) is basically a two-step procedure: first, construct a naively explainable model approximating the black-box model in need of explanations; then, extract an explanation from the approximate model. Since an expert user knows that the extracted explanation aims to be just analogous to the target/ideal explanation, the expert user has to use analogical arguments to transfer certain properties observed on the former to the latter. In this paper, assuming an expert user whose knowledge satisfies certain conjectures, we reconstruct the “reason therefore conclusion” structures of these analogical arguments and study conditions for ensuring the truth of the reason, conditions for ensuring that the conclusion follows necessarily from the reason, and the counter-arguments the user has to consider. It is argued that the presented findings shed light on the internal reasoning of an expert user at the end of a User-LE dialogue. Broadly speaking, the paper suggests a promising direction for extending existing explanation methods, which are system-centered (focusing on generating explanations), to user-centered XAI, which must attend to the user’s reception as well.

Revised Margin-Maximization Method for Nearest Prototype Classifier Learning

Yoshifumi Kusunoki, Tomoharu Nakashima

Article type: Research Paper
2025 Volume 29 Issue 6 Pages 1565-1576
Published: November 20, 2025
Released on J-STAGE: November 20, 2025

DOI: https://doi.org/10.20965/jaciii.2025.p1565

JOURNAL OPEN ACCESS

This paper proposes a revised margin-maximization method for training nearest prototype classifiers (NPCs), which are known as explainable supervised learning models. The margin-maximization method of our previous study formulates NPC training as a difference-of-convex (DC) programming problem solved via the convex-concave procedure. However, it suffers from hyperparameter sensitivity and cannot simultaneously optimize both classification and clustering performance. To overcome these drawbacks, the revised method directly solves the margin-maximization problem using sequential second-order cone programming, without DC programming reduction. Furthermore, it integrates the clustering loss of the k-means method into the objective function to enhance prototype placement in dense data regions. We prove that the revised method is a descent algorithm, that is, the objective function decreases in each update of the solution. A numerical study confirms that the revised method addresses the drawbacks of the previous method.
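
The quantities this abstract refers to can be made concrete with a small sketch of an NPC: prediction by nearest prototype, a per-sample margin, and a k-means-style clustering loss. The margin used here is the common hypothesis-margin definition and is an assumption, since the abstract does not give the paper's exact formulation; the prototypes and samples are hypothetical, and no second-order cone solver is involved.

```python
import math


def predict(prototypes, x):
    """Return the label of the prototype closest to x (Euclidean distance)."""
    return min(prototypes, key=lambda p: math.dist(p[0], x))[1]


def hypothesis_margin(prototypes, x, label):
    """Half the gap between the nearest other-class and same-class distances."""
    d_same = min(math.dist(p, x) for p, lab in prototypes if lab == label)
    d_other = min(math.dist(p, x) for p, lab in prototypes if lab != label)
    return (d_other - d_same) / 2.0


def clustering_loss(prototypes, samples):
    """Mean squared distance from each sample to its nearest prototype."""
    total = sum(min(math.dist(p, x) ** 2 for p, _ in prototypes) for x in samples)
    return total / len(samples)


# Hypothetical two-class problem with one prototype per class.
prototypes = [((0.0, 0.0), "a"), ((4.0, 0.0), "b")]
samples = [(1.0, 0.0), (3.0, 0.0)]
label = predict(prototypes, samples[0])
margin = hypothesis_margin(prototypes, samples[0], "a")
loss = clustering_loss(prototypes, samples)
```

A positive margin means the sample is classified correctly; a training method that maximizes margins while also lowering the clustering loss pushes prototypes toward dense regions of their own class.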
