IEICE Transactions on Information and Systems
Online ISSN : 1745-1361
Print ISSN : 0916-8532
Advance online publication articles
Showing 1-50 of 82 advance online publication articles
  • Huayang ZHANG, Lingyu LIANG, Shuangping HUANG, Tingwen YU, Ting YU
    Article type: LETTER
    Article ID: 2025EDL8060
    Publication date: 2026
    [Advance online publication] Released: 2026/01/07
    Journal, free access, advance online publication

    Cross-silo Federated Learning (FL) allows organizations to train models collaboratively while keeping data local, but performance often suffers under Non-IID data due to clients' diverse, domain-specific datasets. While existing work explores client-side personalization and server-side adaptivity separately, and some recent methods attempt to combine them, they often rely on complex optimization or static similarity measures, limiting their robustness. We propose FedPLA, a cross-silo FL approach that explicitly integrates both perspectives. On the client side, FedPLA applies composite regularization to personalized heads, combining a proximal term for stability with L2 regularization to reduce overfitting. On the server side, a loss-aware aggregation adaptively adjusts global step sizes based on system loss dynamics, enhancing convergence under Non-IID data. Experiments on FMNIST, CIFAR-10, CIFAR-100, and a real-world power system dataset show that FedPLA consistently surpasses the compared FL methods.
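
The client-side composite regularization described in this abstract can be illustrated with a minimal sketch. The abstract does not give the exact objective, so the reference point of the proximal term (here, a global head), the function name, and the coefficients `mu` and `lam` are all assumptions made for illustration only.

```python
def composite_regularized_loss(task_loss, head, global_head, mu=0.1, lam=0.01):
    """Task loss plus a proximal term and an L2 penalty on a personalized head.

    Assumed form (not taken from the paper):
        task_loss + (mu / 2) * ||head - global_head||^2 + (lam / 2) * ||head||^2
    """
    # Proximal term: keeps the local head close to a reference point for stability.
    proximal = 0.5 * mu * sum((w - g) ** 2 for w, g in zip(head, global_head))
    # L2 term: shrinks the head weights to reduce overfitting.
    l2 = 0.5 * lam * sum(w ** 2 for w in head)
    return task_loss + proximal + l2


# Toy values: a two-weight head and a hypothetical global reference.
loss = composite_regularized_loss(1.0, [1.0, 2.0], [0.0, 2.0], mu=0.2, lam=0.1)
```

With these toy values the proximal term contributes 0.1 and the L2 term 0.25, so the two penalties can be tuned independently.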

  • Zekai YANG, Lingyu LIANG
    Article type: LETTER
    Article ID: 2025EDL8049
    Publication date: 2025
    [Advance online publication] Released: 2025/12/26
    Journal, free access, advance online publication

    In real-world applications, federated learning often faces significant performance degradation due to heterogeneous data distributions among clients. Such non-independent and identically distributed (non-IID) characteristics can reduce model accuracy, hinder convergence, and lower training efficiency. To address these challenges, personalized Federated Learning (PFL) has been proposed to enhance model adaptability under heterogeneous conditions. In this paper, we propose FedFIS, a novel PFL framework that integrates model aggregation and client selection, both guided by the Fisher Information Matrix (FIM). FIM serves as a unified metric for coordinating the two core components of the framework. The first component introduces an FIM-based aggregation strategy, which dynamically estimates the importance of each client model. This allows differentiated weighting during global updates, improving the quality of the aggregated model. The second component is an adaptive client selection mechanism that increases the selection probability of undertrained clients based on their FIM values, improving training balance and convergence speed. These two modules form a closed-loop structure, enabling mutual reinforcement between aggregation quality and client participation. This design improves the general robustness and generalization ability of the global model under data heterogeneity. We evaluated FedFIS on benchmark datasets including CIFAR-10, CIFAR-10-C, Mini-ImageNet, and a real-world medical dataset for Parkinson's disease (UPDRS scores). The experimental results show that FedFIS achieves strong generalization and robustness, validating the effectiveness and practical value of the proposed framework.
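
As a rough illustration of FIM-guided weighting, the sketch below estimates a diagonal empirical Fisher from per-sample gradients and turns each client's Fisher trace into an aggregation weight. The abstract does not state how FedFIS maps FIM values to weights, so the trace-proportional rule, the function names, and the toy gradients are assumptions rather than the authors' method.

```python
def diagonal_fisher(per_sample_grads):
    """Empirical diagonal Fisher: mean of squared per-sample gradients."""
    n = len(per_sample_grads)
    dim = len(per_sample_grads[0])
    return [sum(g[i] ** 2 for g in per_sample_grads) / n for i in range(dim)]


def aggregation_weights(client_fishers):
    """Normalize each client's Fisher trace into a weight (assumed rule)."""
    traces = [sum(f) for f in client_fishers]
    total = sum(traces)
    return [t / total for t in traces]


def aggregate(client_params, weights):
    """Weighted average of client parameter vectors."""
    dim = len(client_params[0])
    return [sum(w * p[i] for w, p in zip(weights, client_params)) for i in range(dim)]


# Two hypothetical clients with two parameters each.
fisher_a = diagonal_fisher([[1.0, 0.0], [1.0, 0.0]])   # trace 1.0
fisher_b = diagonal_fisher([[2.0, 0.0], [0.0, 2.0]])   # trace 4.0
weights = aggregation_weights([fisher_a, fisher_b])
global_params = aggregate([[0.0, 0.0], [1.0, 1.0]], weights)
```

A selection mechanism could reuse the same scores, for example by sampling clients with probabilities derived from them, but the direction and form of that mapping is not specified in the abstract.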

  • Jiazheng HU, Huawei TAO, Ziyi HU, Yue XIE, Lichao GE
    Article type: LETTER
    Article ID: 2025EDL8067
    Publication date: 2025
    [Advance online publication] Released: 2025/12/26
    Journal, free access, advance online publication

    Due to the inherent nonlinear entanglement between emotional information and speaker information in speech, non-speaker-specific speech emotion recognition remains a key challenge in human-computer interaction. To address this issue, we propose a Multi-Interaction Decoupling Network (DIDN) for speech emotion recognition. First, to mitigate information loss caused by the Variational Autoencoder (VAE) network during compression, the Mutual Information Neural Estimator (MINE) algorithm is employed to maximize the mutual information between the encoder output and the decoder reconstruction features, thereby preserving semantic integrity. Second, the Contrastive Log-ratio Upper Bound (CLUB) algorithm is used to minimize the Mutual Information (MI) among multi-layer features, thereby eliminating redundant information. Finally, a global constraint function is designed, and through training adjustments, optimal features are obtained to achieve better feature decoupling. Experiments on the IEMOCAP and EMODB datasets demonstrate that the proposed algorithm achieves a significant improvement in accuracy.

  • Tatsumi OBA, Tadahiro TANIGUCHI, Naoto YANAI
    Article type: PAPER
    Article ID: 2025ICP0002
    Publication date: 2025
    [Advance online publication] Released: 2025/12/26
    Journal, free access, advance online publication

    In recent years, countermeasures against cyberattacks on industrial control systems (ICSs) have used machine learning models for detection. However, the efficacy of these existing detection methods is limited because they often cause a large number of false positives. In this paper, we propose a novel regularization technique, named similar device regularization, to reduce false positives of cyberattack detection in ICSs. The proposed technique penalizes the separation of feature vectors for similar devices and is applicable to any machine learning model with link prediction tasks, such as cyberattack detection in ICSs. Moreover, we present a detection method, ConvSDR, as an application of the proposed technique. Extensive experiments with ConvSDR demonstrate that it outperforms the existing methods by virtue of the similar device regularization. As a key insight, the similar device regularization can suppress the overfitting of machine learning models, which is often caused by an increase in model parameters. Remarkably, we find that the similar device regularization reduces false positives by over 40%. We also evaluate the impact of some hyperparameters on the performance of ConvSDR.
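
The idea of penalizing the separation of feature vectors for similar devices can be sketched as a pairwise squared-distance penalty. The abstract does not give the exact form, so the squared Euclidean distance, the way similar pairs are supplied, and the device names below are hypothetical.

```python
def similar_device_penalty(embeddings, similar_pairs, weight=1.0):
    """Sum of squared distances between embeddings of devices marked as similar.

    embeddings: dict mapping a device name to its feature vector.
    similar_pairs: iterable of (device_a, device_b) tuples (assumed to be given).
    """
    penalty = 0.0
    for a, b in similar_pairs:
        penalty += sum((x - y) ** 2 for x, y in zip(embeddings[a], embeddings[b]))
    return weight * penalty


# Hypothetical devices: two similar controllers and one unrelated terminal.
embeddings = {
    "plc_1": [0.0, 1.0],
    "plc_2": [0.0, 3.0],
    "hmi_1": [5.0, 5.0],
}
penalty = similar_device_penalty(embeddings, [("plc_1", "plc_2")], weight=0.5)
```

In training, a term like this would be added to the main link prediction loss, so that similar devices are pulled toward nearby representations.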

  • Yasushi TAKAHASHI, Naohisa NISHIDA, Yuji UNAGAMI, Saburo TOYONAGA, Nao ...
    Article type: PAPER
    Article ID: 2025ICP0007
    Publication date: 2025
    [Advance online publication] Released: 2025/12/26
    Journal, free access, advance online publication

    CRYSTALS-Dilithium is a digital signature scheme selected by the NIST standardization for post-quantum cryptography. Greconici et al. (at TCHES 2021) proposed several implementation strategies based on the trade-off between computation time and size of stack memory, i.e., time-memory trade-off, of CRYSTALS-Dilithium, but a suitable design for resource-constrained devices, such as IoT devices, is still unknown due to a lack of experiments. In this paper, we provide a detailed analysis to parameterize the trade-off between computation time and size of stack memory for the signature generation algorithm of CRYSTALS-Dilithium toward the design of its flexible implementation. We first conduct systematic evaluations of the implementation strategies by Greconici et al. to understand their impact on the trade-off described above. We are then able to identify the trade-off and the phenomenon that it may be broken by the use of flash memory. Next, we propose a new implementation method of CRYSTALS-Dilithium, which parameterizes the trade-off by generalizing the results of the systematic evaluations. We also conduct experiments by implementing the proposed method on a hardware board for IoT devices. The most remarkable result is that the trade-off between the computation time and the size of stack memory is non-linear: notably, reducing the size of stack memory also indicates an improvement in the computation time with respect to an implementation for minimizing the size of stack memory. We also show that the use of flash memory can potentially break the trade-off.

  • Masataka YASUDA, Chisa TAKANO, Masaki INAMURA
    Article type: PAPER
    Article ID: 2025ICT0001
    Publication date: 2025
    [Advance online publication] Released: 2025/12/23
    Journal, free access, advance online publication

    In wireless communication environments, a security threat known as Evil Twin Attack (ETA) arises when attackers deploy a Rogue Access Point (rogue AP) that impersonates a Legitimate Access Point (legitimate AP). This causes wireless LAN users attempting to connect to the legitimate AP to unknowingly connect to the rogue AP. In public wireless LAN environments such as cafes or public facilities, SSID and password information are often publicly available, enabling attackers to easily replicate the legitimate AP. The IEEE 802.11 standard generally prioritizes APs with higher Received Signal Strength Indicator (RSSI) values. However, the influence of AP load on client connection decisions remains insufficiently studied. This study investigates the initial AP selection behavior of new client devices under different load conditions. Experimental results confirm that the attack succeeds even when the rogue AP has lower RSSI. UDP flooding and association flooding attacks against the legitimate AP improve the overall connection probability to the rogue AP by 6% to 14%. These findings suggest that signal strength alone is not a sufficient defense against ETA, and highlight the need for more robust countermeasures.

  • Kosuke MURAKAMI, Masataka NAKAHARA, Takashi MATSUNAKA, Ayumu KUBOTA
    Article type: PAPER
    Article ID: 2025ICT0002
    Publication date: 2025
    [Advance online publication] Released: 2025/12/22
    Journal, free access, advance online publication

    As a countermeasure against the damage inflicted by botnets, one widely adopted approach is the takedown of Command and Control (C2) servers, which attackers utilize to issue commands to compromised bots. Although new malicious activities typically cease after such takedowns, pre-existing operations and default behaviors of infected hosts may continue. Moreover, these hosts often lack adequate security measures, leaving them susceptible to subsequent infections by other types of malware. Alternatively, Sinkhole observation, an approach that monitors communications from malware-infected hosts via the domains of seized C2 servers, offers valuable insight. In this study, we analyze the communication behaviors of infected hosts by correlating Sinkhole observation data with network flow data collected from ISP-operated environments, enabling the examination not only of traffic destined for Sinkhole servers but also of communications to other external destinations. Furthermore, by cross-referencing these communication destinations with known malicious server lists, we assess the current landscape of malware infections. Our analysis demonstrates that approximately 30% of infected IP addresses identified in Sinkhole data exhibit communication patterns indicative of multiple simultaneous malware infections.

  • Reo YONEYAMA, Tomoki TODA
    Article type: PAPER
    Article ID: 2025EDP7142
    Publication date: 2025
    [Advance online publication] Released: 2025/12/18
    Journal, free access, advance online publication

    Achieving speech synthesis that is simultaneously high in fidelity, fast in generation, and flexible in control remains a core challenge in neural vocoder research. Our previous work, unified Source-Filter GAN (uSFGAN), demonstrated that incorporating source-filter modeling into Generative Adversarial Networks (GAN)-based vocoders can significantly improve controllability over fundamental frequency (F0). However, its architecture operates on high-temporal-resolution features, resulting in substantial computational inefficiency compared to efficient upsampling-based models such as HiFi-GAN. To overcome these limitations, we propose Source-Filter HiFi-GAN (SiFi-GAN), which combines the efficiency of HiFi-GAN with the F0 controllability introduced by source-filter modeling. SiFi-GAN adopts a hierarchical architecture in which the filter-network is conditioned on source excitation representations generated by a separate source-network, emulating the pseudo-cascade structure of human speech production. Experimental results show that SiFi-GAN outperforms both HiFi-GAN and uSFGAN in singing voice quality, while also achieving faster synthesis speed. Consequently, SiFi-GAN is more suitable than uSFGAN for integration into real-world applications and end-to-end speech synthesis systems.

  • Motoki MIURA, Toyohisa NAKADA
    Article type: PAPER
    Article ID: 2025DKP0007
    Publication date: 2025
    [Advance online publication] Released: 2025/12/17
    Journal, free access, advance online publication

    This paper presents a framework for developing and sharing interactive web-based learning materials that incorporate gamification elements such as competition and collaboration. The system is designed to facilitate the integration of gamification into existing educational content by providing mechanisms for real-time interaction among learners. In particular, we implemented a WebSocket-based communication layer that enables Processing.js programs to exchange information across clients. This allows educators and learners to easily create, share, and extend interactive materials that promote engagement and motivation. The framework aims to support active learning by encouraging learners to compete, cooperate, and reflect on their performance while interacting with web-based educational content.

  • Sofia Sahab, Jawad Haqbeen, Diksha Sapkota, Takayuki Ito
    Article type: PAPER
    Article ID: 2025DKP0015
    Publication date: 2025
    [Advance online publication] Released: 2025/12/17
    Journal, free access, advance online publication

    This study examined the effects of AI-mediated conversational support, using GPT-4, with and without supportive instructions, on the mental health of Afghan women. These women face multifaceted challenges, including Taliban-imposed restrictions, societal inequalities, and domestic violence, adversely affecting their well-being. In a randomized controlled trial with 60 participants, we compared three groups: (1) Supportive Listener (GPT-4 guided by instructions emphasizing empathetic, non-judgmental, and trauma-sensitive communication), (2) Standard GPT-4 (the base model with no additional behavioral instructions), and (3) a wait-list control. The Hospital Anxiety and Depression Scale (HADS) was used to measure anxiety and depression before and after the intervention. Linguistic analysis of chat data examined personal pronouns, tones, emotions, and Language Style Matching (LSM). Participants in the Supportive Listener condition showed a significant reduction in anxiety and depression compared with the other groups. Their conversations also demonstrated a more positive emotional tone and higher linguistic alignment (LSM), which was negatively correlated with changes in HADS scores, indicating that greater linguistic alignment was associated with greater psychological improvement. Perceived empathy ratings were also significantly higher in the Supportive Listener group. These findings suggest that explicit supportive instructions, rather than the AI model alone, play a critical role in shaping therapeutic outcomes. While promising, such AI-based support should complement, rather than replace, traditional psychotherapy, ensuring an ethically guided and culturally sensitive approach to mental health care.

  • Fumito TAGASHIRA, Yoshiyuki TAJIMA, Akihiro SHIMODA, Nobuhiro KOIZUMI, ...
    Article type: PAPER
    Article ID: 2025EDP7093
    Publication date: 2025
    [Advance online publication] Released: 2025/12/17
    Journal, free access, advance online publication

    Waiting time from reception to patient-call is recognized as a key factor in outpatient satisfaction. While several studies have attempted to predict waiting times, our study specifically focuses on waiting time for outpatients from reception to blood drawing process. The waiting time for blood drawing can exceed typical values due to irregular events occurring during the blood drawing process. The primary objective of this study is to accurately predict these prolonged waiting times. A significant challenge is that such prolonged waiting times are infrequently observed in historical data. In this study, we identify situations where the waiting time takes large values through quantitative data analysis, and propose a method utilizing Deep Local Linear Regression for the prediction. This method incorporates the tendency that the waiting time increases with the number of patients waiting. The effectiveness of the proposed method is evaluated by using actual data collected from a hospital. We demonstrate that for predicting waiting times exceeding 30 minutes, the proposed method improves prediction accuracy by at least 10% compared to the random forest.

  • Jun ICHIKAWA, Kazushi TSUTSUI, Keisuke FUJII
    Article type: PAPER
    Article ID: 2025HCP0002
    Publication date: 2025
    [Advance online publication] Released: 2025/12/17
    Journal, free access, advance online publication

    A group distributes roles to achieve a common goal, enabling higher task performance than when acting alone. Such coordination has been investigated in various research fields. These findings suggest that two types of information processing work for efficient and adaptable behaviors: (1) top-down processing established by structured internal knowledge and representation of a group goal, task constraints, plans, and roles, and (2) bottom-up processing based on sensory inputs, such as the generation of flexible movement itself through perception. However, coordination mechanisms have not been fully discussed in terms of the two types of processing. Meanwhile, a previous cognitive science study identified a crucial role for coordination. In a coordinated drawing task, a participant triad shares heterogeneous roles and changes the tension of each thread using a reel to move a pen connected to three threads to draw an equilateral triangle. The results indicated that a resilient helping role, which moderately intervenes with other roles to adjust the whole balance according to situations, was related to high team performance. Although this role is required not only for the experimental task, it has not been explained in related work. Considering the aforementioned discussions, the adjustment process particularly involves the two types of processing; however, there is room for further investigation. This study introduced computer simulation to the coordinated drawing task and examined the resilient helping role, using deep reinforcement learning and rule-based modeling. The results showed that an agent with an interactive relationship model in which top-down processing drives bottom-up processing was able to adjust and correct the pen trajectory at the proper timing. Additionally, the deep reinforcement learning and rule-based condition in the adjusting role achieved higher team performance (smaller pen deviation) than the rule-based alone and random conditions. This study supplements the experimental findings and contributes to a constructive understanding of coordination.

  • Norihiko KAWAI, Motoki KAKUHO
    Article type: PAPER
    Article ID: 2025HCP0005
    Publication date: 2025
    [Advance online publication] Released: 2025/12/17
    Journal, free access, advance online publication

    Services using omnidirectional images have become increasingly popular for virtual sightseeing. For example, Google Street View enables users to view the scenery of a location online without physically visiting it. However, the use of still images in such services limits the sense of presence. This study proposes a method that focuses on natural elements such as water, sky, and trees in a single omnidirectional image and reproduces their motion in 3D space, generating omnidirectional videos to improve the sense of reality in virtual sightseeing. Experiments demonstrate the effectiveness of the proposed method through comparison with a conventional method and user studies.

  • Tatsuhiro AOSHIMA, Mitsuaki AKIYAMA
    Article type: PAPER
    Article ID: 2025ICP0005
    Publication date: 2025
    [Advance online publication] Released: 2025/12/17
    Journal, free access, advance online publication

    As the capabilities of large language models (LLMs) continue to advance, the importance of rigorous safety evaluation is becoming increasingly evident. Recent concerns within the realm of safety assessment have highlighted instances in which LLMs exhibit behaviors that appear to disable oversight mechanisms and respond in a deceptive manner. For example, there have been reports suggesting that, when confronted with information unfavorable to their own persistence during task execution, LLMs may act covertly and even provide false answers to questions intended to verify their behavior. To evaluate the potential risk of such deceptive actions toward developers or users, it is essential to investigate whether these behaviors stem from covert, intentional processes within the model. In this study, we propose that it is necessary to measure the theory of mind capabilities of LLMs. We begin by reviewing existing research on theory of mind and identifying the perspectives and tasks relevant to its application in safety evaluation. Given that theory of mind has been predominantly studied within the context of developmental psychology, we analyze developmental trends across a series of open-weight LLMs. Our results indicate that while LLMs have improved in reading comprehension, their theory of mind capabilities have not shown comparable development. Finally, we present the current state of safety evaluation with respect to LLMs' theory of mind, and discuss remaining challenges for future work.

  • Md Abdullah Al Mamun, Masahiko Nawate, Masafumi Hamaguchi, Md. Altaf-U ...
    Article type: PAPER
    Article ID: 2025AHP0006
    Publication date: 2025
    [Advance online publication] Released: 2025/12/12
    Journal, free access, advance online publication

    Mind wandering is a cognitive state where attention shifts away from the primary task, affecting learning, productivity, and cognitive performance. We present a real-time, non-intrusive computer vision system that classifies Concentration, Deliberate (Intentional), and Spontaneous mind-wandering from eye aspect ratio (EAR) based blink dynamics and facial emotion recognition (FER), with contrast limited adaptive histogram equalization (CLAHE) for preprocessing. Our contributions are: adaptive EAR-based blink detection; FER stabilization via eye angle alignment, CLAHE, and exponential moving average (EMA) smoothing; and decision-level temporal fusion of blink rate with FER-derived valence. Together, these advances enable robust real-time classification of concentration, deliberate, and spontaneous mind wandering, improving resistance to occlusion, noise, and illumination while maintaining temporal coherence. Experimental results demonstrate the effectiveness of this approach, with Random Forest achieving the highest classification accuracy of 99%, followed by Decision Tree at 97%. These findings indicate significant potential for applications in adaptive learning, driver monitoring, and cognitive research.
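
For readers unfamiliar with the two signal-processing pieces named in this abstract, the sketch below shows the commonly used six-landmark eye aspect ratio and a basic exponential moving average. Whether the paper uses exactly this EAR definition, this landmark ordering, or this smoothing factor is not stated, so treat all three as assumptions.

```python
import math


def eye_aspect_ratio(eye):
    """Eye aspect ratio from six (x, y) landmarks p1..p6.

    Assumed ordering: p1 and p4 are the eye corners, p2 and p3 lie on the
    upper lid, p5 and p6 on the lower lid (p2 above p6, p3 above p5).
    """
    p1, p2, p3, p4, p5, p6 = eye
    vertical = math.dist(p2, p6) + math.dist(p3, p5)
    horizontal = math.dist(p1, p4)
    return vertical / (2.0 * horizontal)


def ema(values, alpha=0.3):
    """Exponential moving average; the first output equals the first input."""
    smoothed = []
    for v in values:
        prev = smoothed[-1] if smoothed else v
        smoothed.append(alpha * v + (1 - alpha) * prev)
    return smoothed


# Toy landmarks for an open eye, then a smoothed two-frame signal.
open_eye = [(0, 0), (1, 1), (2, 1), (3, 0), (2, -1), (1, -1)]
ear = eye_aspect_ratio(open_eye)
smoothed = ema([1.0, 0.0], alpha=0.5)
```

The ratio drops toward zero as the lids close, which is why a threshold on a smoothed EAR signal is a common basis for blink detection.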

  • Yasufumi TAKAMA, Kenji KOBAYASHI, Hiroki SHIBATA
    Article type: PAPER
    Article ID: 2025DAP0001
    Publication date: 2025
    [Advance online publication] Released: 2025/12/12
    Journal, free access, advance online publication

    This paper proposes an interactive topic modeling system based on GDM (Geometric Dirichlet Means). Topic modeling is a kind of unsupervised learning and aims to extract topics from a set of documents. The problem is that it is not guaranteed that the obtained results will always satisfy the analyst's intention. To mitigate this problem, the concept of human-in-the-loop can be applied to control the modeling process with the feedback from analysts. A previous study proposed the concept of Human-in-the-loop topic modeling and introduced seven operations for modifying topic models to support users who are unfamiliar with LDA. Although the result of the qualitative evaluation shows its effectiveness, we suppose that operating the probability space of LDA is difficult to imagine for novices. Aiming to provide analysts with more options for interactively applying various topic models, this paper employs GDM, which is another topic modeling method based on document clustering: we suppose that manipulating document space is easier to imagine than manipulating probability space. The proposed system implements the same seven operations as the existing study based on LDA. In addition, an add-document operation is newly introduced, taking advantage of GDM's high affinity with document clustering. Furthermore, a prototype interface is implemented using multiple views, such as a scatter plot, parallel coordinates, and bar charts, to provide quantitative feedback of the topic-document and topic-word relations. A qualitative evaluation is conducted with the implemented prototype interface, and the result shows that test participants felt that they could get the intended and satisfying results in many cases. It is also observed that the coherence of topics tended to be improved with the feedback from the test participants, especially when using the add-document operation.

  • Shurui JIA, Hu CUI, Tessai HAYAMA
    Article type: PAPER
    Article ID: 2025DKP0011
    Publication date: 2025
    [Advance online publication] Released: 2025/12/11
    Journal, free access, advance online publication

    Human Activity Recognition (HAR) has become a critical technology for healthcare, fitness, and smart environments, yet its performance is often constrained by limited labeled data, class imbalance, and intra-class variability. To address these challenges, we propose HAR-DCWGAN, a Dual-Conditional Wasserstein GAN that integrates both activity labels and multidimensional statistical features as conditional inputs. By incorporating dual contextual information into the generative process, our model produces synthetic sensor signals with improved realism, diversity, and class consistency. Evaluations across four publicly available HAR datasets under subject-independent conditions demonstrate that HAR-DCWGAN outperforms conventional cWGANs and baseline methods, yielding significant improvements in classification accuracy, robustness, and representation of intra-class variability. These findings establish HAR-DCWGAN as a promising and reliable approach to enhance HAR performance in practical deployments.

  • Akihiro SAIKI, Keiji KIMURA
    Article type: PAPER
    Article ID: 2025EDP7101
    Publication date: 2025
    [Advance online publication] Released: 2025/12/11
    Journal, free access, advance online publication

    Enclave-type Trusted Execution Environments (TEEs) provide a hardware-isolated environment, called an enclave, where confidential applications can be executed securely. The enclave runtime is designed to be fully trusted but tends to have limited functions due to its simple implementation, which prioritizes security. Thus, it relies on untrusted host OS functions, particularly I/O operations, through a minimum secure communication interface between a host and an enclave. Such a communication interface across the boundary between trusted and untrusted domains must be implemented in a manner that does not compromise security. However, tight security constraints can lead to a loss of performance and compatibility for applications. Achieving efficient and flexible secure communication is a challenging issue. Keystone Enclave is one of the representative enclave-type TEE implementations for RISC-V. While Keystone provides a set of edge calls as a communication interface, it introduces data transfer efficiency issues and security concerns. When transferring large amounts of data from a host to an enclave, the edge calls introduce a significant data transfer overhead due to the restrictions on Keystone's implementation of memory isolation. Besides, the original edge calls do not protect the transferred data from other programs. This paper proposes a secure, efficient, and scalable host-enclave data transfer method for Keystone Enclave. The proposed method introduces an additional memory region dedicated to the protection of data transfer. This region is protected independently of the enclaves and other shared memory. It is enforced that the contents in the region are validated before use by privileged software. This approach enables efficient and scalable data transfer, as well as flexible data protection, without requiring additional hardware extensions. The evaluation of the proposed method on the HiFive Unmatched RISC-V board shows 2.2-2.4× better performance than the method using the original edge calls for large-size data transfer. We also evaluate the performance of the I/O system call delegation using the proposed method and confirm its practicality.

  • Takayuki ITO
    Article type: PAPER
    Article ID: 2025AHP0010
    Publication date: 2025
    [Advance online publication] Released: 2025/12/05
    Journal, free access, advance online publication

    This paper examines morality's role in mitigating the Prisoner's Dilemma (PD) in multiagent systems, where rational agents typically achieve socially suboptimal outcomes. We introduce a morality-based decision model incorporating mental costs like self-control and shame into agent utility functions, extending foundational models by Gul and Pesendorfer (self-control under temptation) and Dillenberger and Sadowski (moral considerations as mental costs). Our key methodological contribution is treating the PD payoff matrix as a “menu of alternatives” within the DS framework, creating a novel bridge between axiomatic choice theory and game-theoretic equilibrium analysis. We derive a formal predictive condition—R2 - ST ≥ 2(T - R)—that precisely determines when moral considerations transform cooperation into a Nash equilibrium. This condition reveals that cooperation emerges when the normative margin (how strongly the cooperative outcome dominates in moral terms) exceeds twice the temptation gap (the selfish gain from defection). Numerical analyses, including water management scenarios, illustrate how morality reshapes payoffs to promote socially optimal outcomes. Large-scale empirical validation with phase diagrams confirms our theoretical boundary with high fidelity. However, we identify failure conditions where morality cannot resolve the dilemma, particularly when selfish payoffs differ minimally. Our results provide precise design criteria for developing ethically-aware AI systems in strategic multi-agent environments.

  • Hiroaki FURUKAWA
    Article type: PAPER
    Article ID: 2025DKP0004
    Publication date: 2025
    [Advance online publication] Released: 2025/12/03
    Journal, free access, advance online publication

    This study investigated whether manipulating evaluation bias in a reasoning large language model (GPT-5) could enable it to approximate human judgments in idea evaluation tasks. Both participants and GPT-5 evaluated the same 30 ideas using four criteria (Novelty, Relevance, Specificity, and Workability) on a seven-point Likert scale. GPT-5 was instructed with bias prompts that expressed evaluative tendencies from -3 (strict) to +3 (lenient) for each criterion, resulting in 2,401 bias combinations. All evaluations were analyzed using linear mixed-effects models (LMMs) to correct for idea difficulty and to estimate participants' individual evaluation tendencies. The results showed that GPT-5's evaluation scores changed systematically according to the instructed bias levels across all criteria. The effects were most pronounced for Novelty and Specificity, moderate for Relevance, and smallest for Workability. When GPT-5's bias settings were aligned with each participant's estimated tendencies (the emulation pattern), the model achieved stronger correlations and smaller errors relative to human ratings than under the neutral (flat) condition. These findings indicate that explicit bias manipulation enhances the human-likeness and consistency of GPT-5's evaluations. However, the study was limited by a small and culturally homogeneous participant sample, a single idea domain, and the use of one reasoning model. Future studies should extend the framework to diverse participants, domains, and models, and incorporate baseline calibration to correct for inherent model bias. Overall, this study demonstrates the feasibility of approximating human evaluation tendencies through controlled bias manipulation in reasoning LLMs.
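
The figure of 2,401 bias combinations follows from seven bias levels applied independently to four criteria (7^4 = 2,401). The short sketch below enumerates them, assuming integer steps from -3 to +3; the variable names are illustrative only.

```python
from itertools import product

CRITERIA = ("Novelty", "Relevance", "Specificity", "Workability")
BIAS_LEVELS = range(-3, 4)  # -3 (strict) ... +3 (lenient), assumed integer steps

# One dict per combination, mapping each criterion to its bias level.
combinations = [
    dict(zip(CRITERIA, levels))
    for levels in product(BIAS_LEVELS, repeat=len(CRITERIA))
]

# The neutral (flat) setting is the combination with every bias at 0.
neutral = {name: 0 for name in CRITERIA}
```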

  • Nobuo AOKI, Atsuko TAKEFUSA, Yutaka ISHIKAWA, Yasushi ONO, Eisaku SAKA ...
    Article type: PAPER
    Article ID: 2025ICP0013
    Publication date: 2025
    [Advance online publication] Released: 2025/12/03
    Journal, free access, advance online publication

    The Internet of Things (IoT) is widely used as a fundamental technology for realizing various services. An IoT-based service system comprising cloud servers and many IoT devices connected via networks may be risky owing to the possibility of the entire system being affected by cyberattacks on an IoT device. Moreover, new software vulnerabilities are frequently reported. From the perspective of security, IoT device software must be reliable and resilient. Consequently, a secure software update mechanism for IoT, assuring software reliability, and mitigation mechanisms are required. This study proposes a zero-trust over-the-air (ZT-OTA) software update framework for the reliable and resilient software update management of IoT devices via OTA software updates from remote locations. The ZT-OTA software update framework provides a strict software version-management mechanism that enhances the security of software updates. Further, the proposed framework collaborates with a Software Assessment Service (SAS) as an authorized third-party assessment organization and deploys reliable software assessed by the SAS to IoT devices. Upon discovering a software vulnerability following the deployment of the software, the SAS proactively revokes the assessment result of the vulnerable software. Subsequently, the ZT-OTA server notifies developers and users of IoT devices to arrange for new software that must be fixed, and updates reliable software invoking policies for IoT devices, which can minimize privileges. This study presents the design and implementation of the ZT-OTA software update framework and demonstrates its feasibility.

  • Chao MA, Mingkun ZHANG, Zhaoyang ZHAO, Chunlei ZHANG, Jianwei MA
    Article type: PAPER
    Article ID: 2025EDP7132
    Published: 2025
    [Advance online publication] Released: 2025/12/02
    Journal, Free access, Advance online publication

    To address the large number of network parameters and the high computational complexity of human key point detection models used in human-machine collaboration and action recognition, we propose a lightweight human key point detection model. First, we fine-tune MobileNetV3 as the backbone network to reduce the overall number of model parameters. Second, we propose a Feature Pyramid Network (FPN) and an adaptive upsampling module that improve detection accuracy while introducing only a small number of parameters. Third, we combine a joint-length information loss function with the MSE loss function to accelerate convergence during training and improve prediction accuracy. Ablation experiments verify the effectiveness of each module. Experimental results on the COCO2017 dataset show that the proposed algorithm achieves an average accuracy of 68.6% with 2.2M model parameters and a computational complexity of 1.8G, and that its inference speed on a single CPU platform reaches 36 frames/s, meeting real-time requirements.

  • Hayato FUJIKOSHI, Takeshi OKADOME
    Article type: PAPER
    Article ID: 2025DKP0001
    Published: 2025
    [Advance online publication] Released: 2025/11/28
    Journal, Free access, Advance online publication

    Accurately controlling the output length of large language models (LLMs) remains a non-trivial challenge, with many existing approaches exhibiting limited reliability or incurring additional architectural and inference-time costs. Failure to adhere to user-specified length constraints in real-world applications, such as news summarization and dialog systems, significantly degrades system reliability. This paper addresses this gap by applying Group Relative Policy Optimization (GRPO), a stable, value-function-free reinforcement learning algorithm, to efficiently fine-tune LLMs for prompt-based length control without any architectural modification. We systematically compare four reward functions: a simple binary threshold (BLTR), a linear deviation penalty (PLR), and two novel proximity-aware variants with linear (LLPR) and exponential (ELPR) decay, designed to incentivize not just constraint satisfaction but also proximity to the target length. Experiments on CNNDM (English) and XL-Sum (Japanese) datasets with 1-billion-parameter models show that our GRPO-based approach dramatically improves length adherence. On Llama-3.2-1B-Instruct, the saturating PLR reward achieved the highest binary adherence (BLTR: 0.705), but our proximity-aware ELPR achieved strong adherence (0.612) while dramatically improving target proximity (LLPR score: -24.994 to -2.293). Notably, on Gemma-3-1b-it, ELPR consistently outperformed PLR on all metrics. Our analysis suggests that ELPR offers a strong balance of stability and performance. The results indicate that continuous, proximity-aware rewards may be more effective than simple binary signals for achieving robust and practical length control, highlighting a promising direction for future reward design.

  • Koutaro KAMADA, Nicharee MANAKITRUNGRUENG, Takaya YUIZONO
    Article type: PAPER
    Article ID: 2025DKP0002
    Published: 2025
    [Advance online publication] Released: 2025/11/28
    Journal, Free access, Advance online publication

    Generative AI (GenAI) is increasingly being integrated into creative work, either as a collaborator or as a replacement for human creators. Most previous work has focused on augmenting users' creativity in the context of individual-GenAI collaboration. Humans often engage in group creative work across countless real-world contexts, yet the effects of GenAI on such group creativity remain largely unexplored, an urgent gap that demands immediate research attention. To address this gap as a first step, we conducted an electronic brainstorming experiment with three conditions in a within-subjects design: (A) groups of three participants without GenAI, (B) groups of three participants with GenAI, and (C) individual participants with GenAI (N = 24). In the results, GenAI-assisted group brainstorming significantly reduced the number of human-generated ideas and did not significantly change their quality compared to brainstorming without GenAI. Plausible explanations are that reliance on GenAI increases further in a group setting and that social loafing is more likely to occur. Therefore, we found that simply incorporating a GenAI agent does not necessarily lead to more effective human-GenAI co-creation in groups. On the other hand, compared to individual use of GenAI, originality, elaboration, and flexibility improved significantly, so GenAI-assisted group brainstorming may have useful aspects. Based on our findings, we discuss the design implications of the strategy for leveraging GenAI effectively, future ideation methods, and creativity support systems. In particular, we suggest two interventions: 1) interactive idea generation, where humans and GenAI take turns combining and improving each other's ideas, or 2) reducing over-reliance on GenAI. Our paper contributes to this domain by investigating the effects of human-GenAI collaboration in groups on brainstorming and providing design implications for more effective co-creation.

  • Arisa MOROZUMI, Hisashi HAYASHI
    Article type: PAPER
    Article ID: 2025DKP0006
    Published: 2025
    [Advance online publication] Released: 2025/11/28
    Journal, Free access, Advance online publication

    Large Language Models (LLMs) are increasingly used for the critical task of generating AI risk scenarios, yet practitioners lack empirical guidance on model selection. This study addresses that gap through a case study benchmarking 23 LLMs against a real-world AI system to analyze their underlying reasoning patterns. We introduce a novel "Hit Rate" metric based on actual incidents to quantitatively measure performance. The results suggest significant, statistically verified performance disparities among models and show that this gap is uncorrelated with superficial linguistic fluency. Instead, the performance gap appears to be strongly linked to the model's underlying reasoning pattern, which leaves an unmistakable qualitative signature on the final outputs. A "Systematic Top-Down" approach, which mirrors expert human analysis, consistently produces specific and actionable scenarios, while less structured methods yield generic or contextually flawed warnings. These findings serve as a strong caution against model-agnosticism, establishing that an LLM's reasoning process, as suggested by the specificity and actionability of its outputs, is a critical factor for its efficacy in safety-critical tasks.

  • Hidenori KATO, Yasuyuki TAHARA, Akihiko OHSUGA, Yuichi SEI
    Article type: PAPER
    Article ID: 2025ICP0001
    Published: 2025
    [Advance online publication] Released: 2025/11/28
    Journal, Free access, Advance online publication

    This study analyzes the propagation of systemic risk through the interbank network when a financial institution is subjected to a ransomware attack, using an agent-based simulation approach. Although based on the conventional May model, the study introduces a balance sheet adjustment algorithm that dynamically corrects imbalances between assets and liabilities, allowing for a more realistic representation of financial system behavior. The simulation considers three scenarios: (1) a megabank is attacked by ransomware, (2) a major regional bank or regional bank is attacked, and (3) a scenario comparing ransomware attacks conducted using the May model. Each scenario is examined at three levels of net worth ratio: 8%, 10%, and 12%. The results reveal that in the case of a megabank, a higher net worth ratio effectively reduces the occurrence of cascading failures. In contrast, when a regional bank was attacked, no secondary failures were observed, and the stability of the general network was maintained.

  • Ryusei WATANABE, Mamoru MIMURA, Takahiro MATSUKI, Tatsuya MORI
    Article type: PAPER
    Article ID: 2025ICP0004
    Published: 2025
    [Advance online publication] Released: 2025/11/28
    Journal, Free access, Advance online publication

    Phishing attacks continue to threaten information security by impersonating legitimate brands to deceive users. Accurate identification of targeted brand names forms the foundation of effective phishing detection, as it enables systems to recognize which legitimate entities attackers are attempting to spoof. This identification serves multiple purposes: detecting phishing attempts, analyzing criminal targeting patterns, and protecting brand reputation. Traditional machine learning approaches for brand identification suffer from a critical limitation: they require complete retraining whenever new brands emerge or existing ones evolve, leading to detection gaps and high operational costs. Moreover, existing systems either depend on pre-collected brand-specific data that limits coverage or rely on real-time search queries that introduce prohibitive latency for large-scale operations. In this study, we propose BrandSpotter, a framework that specializes large language models (LLMs) for brand name extraction and applies them to targeted brand name identification. While modern general-purpose LLMs can process extensive token sequences, they demand substantial computational resources; BrandSpotter achieves efficiency by limiting token capacity and applying task-specific fine-tuning to create a lightweight model optimized for brand extraction. The extraction task of BrandSpotter operates without requiring pre-collected brand-specific artifacts such as logos or screenshots. It relies only on simple brand name strings for final labeling, eliminating retraining requirements and achieving high-speed processing with an average of approximately 10 milliseconds per sample. To evaluate BrandSpotter, we constructed and tested the model using datasets composed of samples with different brands during training and testing, and assessed the performance of targeted brand name identification. The results demonstrate that BrandSpotter can identify targeted brand names with an accuracy of 94.0%, even when the brand list differs from the one used during training. Furthermore, the model successfully identifies brand names in samples containing brands unknown to the model.

  • Kunihiko OI, Takeshi SUGAWARA
    Article type: PAPER
    Article ID: 2025ICP0009
    Published: 2025
    [Advance online publication] Released: 2025/11/28
    Journal, Free access, Advance online publication

    Shamsi et al. recently proposed a remote signal injection attack on inertial measurement units (IMUs) using a high-power pulse laser, but it has several restrictions on stealthiness, controllability, and cost due to the nature of the pulse laser. This paper further extends this direction by showing the feasibility of a laser-based injection attack on IMUs using low-power, flexible, and cheap continuous-wave lasers. First, illumination of an amplitude-modulated continuous-wave laser causes resonance within MEMS gyroscopes, and an attacker can inject an arbitrary signal by exploiting aliasing. Second, we discover a new type of light sensitivity in MEMS accelerometers, wherein illumination of an unmodulated continuous-wave laser causes a DC offset in the accelerometer's output, which can be leveraged for arbitrary signal injection. The proposed laser-based injection can penetrate a sensor fusion algorithm based on a Kalman filter that combines a gyroscope, an accelerometer, and a magnetometer. Finally, we verify that the first attack on MEMS gyroscopes is caused by mechanical vibration through a successful injection with off-chip laser illumination.
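
    The aliasing step can be illustrated with the standard frequency-folding relation: a tone sampled below its Nyquist rate appears at its distance from the nearest multiple of the sampling rate. The sketch below is a generic signal-processing illustration, not the authors' attack procedure; the function name and the example frequencies are assumptions.

```python
def aliased_frequency(signal_hz, sample_rate_hz):
    """Apparent frequency of a tone after uniform sampling (frequency folding)."""
    # Distance from the tone to the nearest integer multiple of the sampling rate.
    nearest_multiple = round(signal_hz / sample_rate_hz) * sample_rate_hz
    return abs(signal_hz - nearest_multiple)


# Hypothetical numbers: a modulation tone slightly above a multiple of the
# sensor's sampling rate folds down to a slow, low-frequency component.
print(aliased_frequency(27_002, 1_000))
```

    Under this relation, shifting the modulation frequency by a small amount changes the low-frequency signal that appears in the sampled output.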

  • Takasuke NAGAI, Shoichiro TAKEDA, Shinya SHIMIZU, Hitoshi SESHIMO
    Article type: PAPER
    Article ID: 2025EDP7130
    Published: 2025
    [Advance online publication] Released: 2025/11/20
    Journal, Free access, Advance online publication

    We propose a novel training strategy for action quality assessment (AQA) that is designed to specifically assess action quality while ignoring scene context, which is unrelated to the action. Recent AQA models typically utilize three-dimensional (3D) convolutions to extract spatiotemporal features from videos. However, since these models are not explicitly designed to extract features relevant to the action, they may inadvertently extract scene context. To address this issue, we propose a training strategy that uses human-masked videos in which the action is masked, and trains the model to predict a fixed score of zero for these inputs. This strategy encourages the model to ignore scene context by making the score correlation between AQA model outputs and human judges undefinable when the action is not visible. Experimental results on two widely used AQA datasets show that our strategy improves AQA performance and effectively ignores scene context. We further investigate how the design of human-masked videos, specifically the shape and color of the mask, affects the model's ability to ignore scene context.

  • Guangming ZHANG, Liangyou LU, Langtao HU
    Article type: PAPER
    Article ID: 2025EDP7155
    Published: 2025
    [Advance online publication] Released: 2025/11/20
    Journal, Free access, Advance online publication

    Hyperspectral image (HSI) denoising is a critical issue in the field of remote sensing. The combination of low-rank matrix/tensor factorization (LRMF/LRTF) and total variation (TV) regularization can achieve excellent denoising results at a relatively low computational cost. However, such methods typically adopt the first-order TV norm to linearly penalize gradients, leading to staircase artifacts and excessive smoothing of the texture details. Furthermore, most methods fail to model the pixel-wise variation differences, resulting in contrast loss of important structures. To address the aforementioned issues, this paper presents a novel LRMF-based method named Representative Coefficient Weighted Fractional-Order TV (RCWFOTV). Firstly, as a global structure, the low dimensionality allows denoising to be formulated exclusively on the representative coefficient U. Then, we replace the first-order TV with Grünwald-Letnikov fractional-order TV (G-L FOTV) to model the local smoothness (LS) prior of U. By incorporating more proximity characteristics, G-L FOTV nonlinearly retains the low-frequency components and enhances the high-frequency ones, thereby avoiding the staircase artifacts and loss of detail. Finally, a weighted scheme is introduced to adaptively sparsify the gradient map of U, maintaining important texture structures. Extensive experiments on both synthetic and real noisy HSIs demonstrate that the proposed method outperforms the other state-of-the-art methods in terms of both performance and speed.
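
    For readers unfamiliar with the Grünwald-Letnikov construction, the sketch below computes the standard G-L difference weights w_k = (-1)^k C(α, k) by recursion and applies them along a 1-D signal with unit step. It illustrates the general operator only; the truncation length, boundary handling, and function names are assumptions and are not taken from the paper's formulation of RCWFOTV.

```python
def gl_weights(alpha, num_terms):
    """Grünwald-Letnikov weights w_k = (-1)^k * binom(alpha, k), k = 0..num_terms-1."""
    weights = [1.0]
    for k in range(1, num_terms):
        # Recursion: w_k = w_{k-1} * (k - 1 - alpha) / k
        weights.append(weights[-1] * (k - 1 - alpha) / k)
    return weights


def gl_fractional_diff(signal, alpha, num_terms=4):
    """Truncated backward G-L difference of order alpha (unit step size)."""
    weights = gl_weights(alpha, num_terms)
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(weights):
            if i - k < 0:  # samples before the start are treated as absent
                break
            acc += w * signal[i - k]
        out.append(acc)
    return out


print(gl_weights(0.5, 4))
print(gl_fractional_diff([1.0, 3.0, 6.0], alpha=1.0))
```

    With α = 1 the weights reduce to the ordinary first-order backward difference, which is why the fractional operator can be read as a generalization of the first-order TV gradient; for non-integer α each output draws on more neighboring samples.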

  • Takahiro KAWAJI
    Article type: PAPER
    Article ID: 2025DKP0010
    Published: 2025
    [Advance online publication] Released: 2025/11/18
    Journal, Free access, Advance online publication

    The present study systematically examined the effectiveness of the Idea-Marathon System (IMS) as a creativity training method using the S-A Creativity Test, which measures both divergent thinking traits and Creative Activity Areas. Previous research has focused mainly on divergent thinking; however, less is known about whether training effects extend to applied, context-sensitive domains. To address this, a quasi-experimental design was implemented with first-year undergraduates at A University (training group: n = 51; control group: n = 36). Over a 15-week intervention, the training group engaged in daily idea generation following IMS, while the control group received no training. Although statistical significance was not achieved, IMS showed tendencies toward improvements in productive improvement (Tb), imaginative speculation (Tc), fluency (F), flexibility (X), and elaboration (E), while originality (O) appeared to be maintained rather than enhanced. No effect was found for practical application (Ta). These findings suggest that IMS may provide a sustained and multi-contextual approach to creativity training, while also indicating that its potential benefits could depend on task-specific cognitive demands.

  • Lin Zhou, Zhen Liu, Zhonglin Ye, Yuzhi Xiao, Haixing Zhao, Jiaxin Han, ...
    Article type: PAPER
    Article ID: 2024EDP7153
    Published: 2025
    [Advance online publication] Released: 2025/11/12
    Journal, Free access, Advance online publication

    Graph neural networks have attracted widespread attention due to their powerful ability to learn from graph-structured data, and are often used to solve node classification tasks on graphs. However, the vast majority of models focus on the relationships between nodes and ignore the structural information of edges, resulting in insufficient extraction of graph structural features. In this paper, we propose the Graph Mapping Relation-Aware Twin Neural Network (GMR-TNN). The model pairs each graph with its line graph as twins to deepen learning: the graph guides the representation learning of the line graph, and the line graph augments the structural representation of the graph. A twin graph mapping mines the structural relationship between the graph and the line graph and combines them effectively, ensuring that their node features are accurately embedded in the low-dimensional space and providing a richer representation for the final node classification task. In experimental comparisons on five datasets, including Cora and Chameleon, GMR-TNN shows better results in the node classification task, which validates its effective use of graph structure information.

  • Yuki HIROHASHI, Takayoshi YAMASHITA, Tsubasa HIRAKAWA, Hironobu FUJIYO ...
    Article type: PAPER
    Article ID: 2025EDP7083
    Published: 2025
    [Advance online publication] Released: 2025/11/12
    Journal, Free access, Advance online publication

    Prompt learning automates the manual crafting of prompts for adapting vision-and-language models to downstream tasks, particularly in few-shot scenarios. This paper addresses two key challenges in prompt learning: limited performance in one-shot settings and inefficient dataset construction from unlabeled data. To tackle these challenges, we visualize and compare CLIP's feature spaces after prompt learning under one-shot and 16-shot conditions, identifying necessary characteristics of feature spaces that yield better prompts. We propose two novel loss functions, Inclusive Loss and Exclusive Loss, that enhance accuracy in one-shot scenarios by encouraging the feature space to resemble those trained with sufficient data. Additionally, we investigate the distribution of image features within CLIP's feature space and introduce a sampling method called Cluster-Centroid Sampling (CCS). CCS constructs a more category-balanced dataset by selecting samples closest to cluster centroids. To validate our approaches, we conducted extensive experiments. First, we demonstrate the effectiveness of our proposed loss functions across multiple datasets, showing accuracy improvements in one-shot conditions. Second, we evaluate CCS using an unlabeled data pool, confirming its superiority over existing sampling methods in downstream task accuracy due to the construction of a more balanced dataset.
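
    The selection rule described for CCS, picking the samples closest to cluster centroids, can be sketched on toy 2-D points as follows. This is a simplified illustration under stated assumptions: the centroids are taken as given (for example from a prior k-means run), one sample is chosen per centroid, and the function name is hypothetical; the paper applies the idea to CLIP image features and its exact procedure may differ.

```python
import math


def closest_to_centroids(features, centroids):
    """For each centroid, pick the index of the nearest feature not already chosen."""
    chosen = []
    for centroid in centroids:
        candidates = [i for i in range(len(features)) if i not in chosen]
        best = min(candidates, key=lambda i: math.dist(features[i], centroid))
        chosen.append(best)
    return chosen


# Toy data: two tight groups plus an outlier; the centroids are assumed inputs.
features = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.2, 5.0), (10.0, 0.0)]
centroids = [(0.02, 0.0), (5.15, 5.0)]
print(closest_to_centroids(features, centroids))
```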

  • Yijia Zhang, Rui Shi, Juanjuan Li, Lu Xu, Lei Song
    Article type: PAPER
    Article ID: 2025EDP7007
    Published: 2025
    [Advance online publication] Released: 2025/11/07
    Journal, Free access, Advance online publication

    Automatic modulation recognition (AMR) plays a significant role in communication systems. Traditional AMR algorithms predominantly rely on either time-domain or frequency-domain signal information. However, relying solely on a single-domain analysis fails to capture the full range of the signal's time-varying and spectral characteristics, leading to inadequate representation of its multi-dimensional features. In this paper, we propose a novel modulation recognition architecture named the Time-Frequency Multi-Modal Neural Network (TFMMN), which performs time-frequency multi-modal fusion. This architecture integrates a traditional Convolutional Neural Network (CNN) within a Multi-channel Feature Extraction Module (MFEM) and incorporates a residual multi-head self-attention (SA) mechanism to process signals across multiple modalities. By preprocessing the I/Q signals, we obtain amplitude and phase (A/P) signals with distinct characteristics and Fast Fourier Transform (FFT) signals. Using the feature signals of these three modalities, a multi-branch structure is constructed, and a multi-channel structure is utilized for complementary feature enhancement. We conducted experiments on the public dataset RadioML2016.10A, and the results show that our algorithm outperforms existing recognition algorithms in terms of classification accuracy. Specifically, for the challenging classification between 16QAM and 64QAM, the average classification accuracy of both modulation types exceeds 90% at a signal-to-noise ratio (SNR) of 0 dB.
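
    The preprocessing that turns I/Q samples into amplitude/phase and frequency-domain views can be sketched with the standard library alone. The code below is an illustrative assumption about that step, not the authors' pipeline: it uses a naive discrete Fourier transform in place of an FFT routine (the result is the same, only slower), and the function names are hypothetical.

```python
import cmath
import math


def amplitude_phase(i_samples, q_samples):
    """Convert I/Q samples to amplitude and phase (radians) sequences."""
    amplitude = [math.hypot(i, q) for i, q in zip(i_samples, q_samples)]
    phase = [math.atan2(q, i) for i, q in zip(i_samples, q_samples)]
    return amplitude, phase


def dft_magnitude(i_samples, q_samples):
    """Magnitude spectrum of the complex signal I + jQ via a naive O(n^2) DFT."""
    x = [complex(i, q) for i, q in zip(i_samples, q_samples)]
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)))
        for k in range(n)
    ]


# Toy input: one cycle of a complex tone over eight samples.
n = 8
i_sig = [math.cos(2 * math.pi * t / n) for t in range(n)]
q_sig = [math.sin(2 * math.pi * t / n) for t in range(n)]
amp, ph = amplitude_phase(i_sig, q_sig)
spectrum = dft_magnitude(i_sig, q_sig)
```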

  • Ryo SOGA, Takatomi KUBO, Takashi ISHIO, Yuna NUNOMURA, Takahiro KINOSH ...
    Article type: PAPER
    Article ID: 2025EDP7086
    Published: 2025
    [Advance online publication] Released: 2025/11/07
    Journal, Free access, Advance online publication

    The physiological states of software developers can impact their work performance. Previous research has indicated that physiological signals, such as heart rate, can be used to predict developers' work performance during tasks. However, conventional methods rely only on heart rate measured during tasks (peri-task), making it difficult to predict and proactively prevent poor work performance prior to beginning tasks. This study aims to enhance predictability by investigating whether heart rate measured before tasks (pre-task) is a valuable resource for predicting work performance. We conducted program comprehension tasks as the primary software development tasks and analyzed pre- and peri-task frequency-domain heart rate variability (HRV) metrics for various timeframes. As a result, we obtained two key findings: 1) Combining pre- and peri-task HRV metrics improved work performance prediction during tasks. 2) Work performance prediction using pre-task HRV metrics achieved comparable estimation performance to that using peri-task HRV metrics for predictions before tasks. These results suggest that, in addition to improving the conventional approaches, pre-task heart rate could also be used to establish a more proactive approach to reducing the risk of performance decline caused by fatigue or stress.

  • Toshio ITO
    Article type: PAPER
    Article ID: 2025EDP7129
    Published: 2025
    [Advance online publication] Released: 2025/11/07
    Journal, Free access, Advance online publication

    In typical Internet of Things (IoT) scenarios, devices with sensors and actuators connect to servers on cloud platforms over the Internet. To maintain the security of the whole system, the devices and servers need to be configured to securely communicate with each other. This configuration process is called onboarding. As an increasing number of IoT devices are deployed, the cost and time of onboarding become overwhelming. To solve this problem, we propose a semi-automated onboarding framework for IoT devices. Unlike other frameworks such as FIDO Device Onboard, the framework we developed does not require pre-registered device ownership. This simplifies the system because there is no requirement on the supply chain of devices. To determine the device owner, our framework uses the OAuth 2.0 Device Authorization Grant. We evaluated the time needed to onboard devices in an experiment where human operators onboarded five devices with a prototype of the framework. The results indicated that the proposed framework was sufficiently fast for small-scale applications. We analyzed the security aspects of our framework based on the specifications and drafts of the OAuth 2.0 framework. We also analyzed an alternative method for the Device Authorization Grant that uses FIDO2 standards. Based on the analysis, we evaluated the trade-off between security, simplicity, flexibility, and efficiency of the proposed onboarding framework.
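
    In the OAuth 2.0 Device Authorization Grant (RFC 8628), a device obtains a device code, shows the user a code to approve elsewhere, and then polls the token endpoint until approval. The sketch below covers only the device-side polling loop and is not the paper's implementation: the `post` callable, the URL, and the identifiers are hypothetical stand-ins for a real HTTP client and authorization server, while the grant type string and the `authorization_pending` and `slow_down` error codes are those defined by RFC 8628.

```python
import time

# Grant type value defined by RFC 8628 for the token request.
DEVICE_GRANT = "urn:ietf:params:oauth:grant-type:device_code"


def poll_for_token(post, token_url, client_id, device_code,
                   interval=5, max_polls=20, sleep=time.sleep):
    """Poll the token endpoint until the user approves, or fail.

    `post(url, form)` is a caller-supplied function returning the parsed
    JSON reply as a dict (hypothetical; any HTTP client could back it).
    """
    form = {
        "grant_type": DEVICE_GRANT,
        "device_code": device_code,
        "client_id": client_id,
    }
    for _ in range(max_polls):
        reply = post(token_url, form)
        if "access_token" in reply:
            return reply["access_token"]
        error = reply.get("error")
        if error == "slow_down":
            interval += 5  # RFC 8628: back off by 5 seconds on slow_down
        elif error != "authorization_pending":
            raise RuntimeError(f"device authorization failed: {error}")
        sleep(interval)
    raise TimeoutError("user did not approve the device in time")


# Simulated server replies, used here instead of a real network call.
replies = iter([
    {"error": "authorization_pending"},
    {"error": "slow_down"},
    {"access_token": "example-token"},
])
waits = []
token = poll_for_token(lambda url, form: next(replies),
                       "https://auth.example/token", "demo-client",
                       "demo-device-code", sleep=waits.append)
```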

  • Yukasa MURAKAMI, Yuriko TAKATSUKA, Masateru TSUNODA, Akito MONDEN
    Article type: LETTER
    Article ID: 2025MPL0001
    Published: 2025
    [Advance online publication] Released: 2025/11/07
    Journal, Free access, Advance online publication

    Collecting software-related data is essential to improve software quality and software development processes. To promote such data collection, we analyzed users' willingness to provide data, focusing on three types of data collection: software crash reports, user operation logs of applications, and engineers' operation logs in software development. We emphasized the importance of trust in data collectors, as higher trust is expected to lead to greater willingness to provide data. The effects of trust may vary depending on the type of data collection, as the perceived severity of data for users can also differ between types. Using questionnaires, we analyzed users' willingness to participate in data collection, and the results suggest that the impact of trust in data collectors varies across data types. It is particularly weak when collecting crash reports but becomes crucial for software development-related data.

  • Yueqi Zhou, Jialin Cui, Bin Lian, Hao Luo, Yangming Zheng, Zhe-Ming Lu
    Article type: LETTER
    Article ID: 2025EDL8047
    Published: 2025
    [Advance online publication] Released: 2025/11/04
    Journal, Free access, Advance online publication

    The automatic surface quality inspection of shock absorber connecting rods is crucial for ensuring vehicle safety and performance. This paper proposes an enhanced PatchCore algorithm for unsupervised anomaly detection, which adopts a multi-level feature processing and fusion strategy of hierarchical processing module (HPM) and adaptive feature fusion module (AFFM) to capture multi-scale anomalies, and uses an adaptive greedy coreset sampling method to improve local density estimation for subtle defect detection. The ablation study shows that our enhanced feature extraction framework improves spatial level performance, while the optimized sampling strategy enhances the accuracy of small anomaly detection. Experiments show that our method has superior performance in anomaly detection for shock absorber connecting rods.
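
    Greedy coreset sampling in PatchCore-style methods is commonly implemented as k-center greedy (farthest-point) selection: repeatedly add the point that is farthest from everything selected so far. The sketch below shows that standard baseline on toy 2-D points; it does not reproduce the adaptive variant proposed in this letter, and the function name and starting index are assumptions.

```python
import math


def greedy_coreset(points, k, start=0):
    """Select up to k point indices by farthest-point (k-center greedy) sampling."""
    selected = [start]
    # Distance from every point to its nearest selected point.
    min_dist = [math.dist(p, points[start]) for p in points]
    while len(selected) < min(k, len(points)):
        farthest = max(range(len(points)), key=min_dist.__getitem__)
        selected.append(farthest)
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[farthest]))
    return selected


points = [(0.0, 0.0), (0.1, 0.0), (10.0, 0.0), (10.0, 0.1), (5.0, 5.0)]
print(greedy_coreset(points, k=3))
```

    Near-duplicate points are skipped because they lie close to an already selected point, which is what lets a small memory bank cover the feature space.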

  • Shuhei DENZUMI, Masaaki NISHINO, Norihito YASUDA
    Article type: PAPER
    Article ID: 2025EDP7062
    Published: 2025
    [Advance online publication] Released: 2025/10/31
    Journal, Free access, Advance online publication

    A family of sets is a collection where each element is a set, enabling the representation of many practical concepts. Various operations on families of sets are widely applied in fields such as databases and data mining. Since the size of set families in these applications often becomes exponentially large, we need sophisticated algorithms to manipulate them. Zero-suppressed decision diagrams (ZDDs) efficiently represent families of sets using directed acyclic graphs, supporting various operations known as family algebra. However, designing efficient algorithms for ZDDs demands expertise and is costly, underscoring the need for more accessible design methods.

    This paper introduces an algorithm template that extends ZDD-based family algebra. New operations can be designed easily by plugging component functions into the template. The template is a natural generalization of existing operations, reproducing them without loss of efficiency. Additionally, it enables the generation of previously impractical ZDD operations without deep knowledge of ZDDs. This paper also presents concrete examples of new operations.
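
    To make the idea of "an operation defined by a component function" concrete, the sketch below works on explicit Python sets of frozensets rather than on ZDDs, so it has none of the compression that makes ZDDs practical. The `pairwise` helper and both derived operations are hypothetical illustrations and are not the paper's template.

```python
def pairwise(f, g, member_op):
    """Apply member_op to every pair (a, b), a in f and b in g; skip None results."""
    result = set()
    for a in f:
        for b in g:
            c = member_op(a, b)
            if c is not None:
                result.add(c)
    return result


def join(f, g):
    """Family-algebra join: all unions of one member of f with one member of g."""
    return pairwise(f, g, lambda a, b: a | b)


def disjoint_join(f, g):
    """Variant that keeps a union only when the two members share no element."""
    return pairwise(f, g, lambda a, b: (a | b) if not (a & b) else None)


F = {frozenset({1}), frozenset({2})}
G = {frozenset({2}), frozenset({3})}
print(join(F, G))
print(disjoint_join(F, G))
```

    Changing only the small function passed to `pairwise` yields a different operation, which mirrors, in a much simpler setting, the abstract's point about configuring a shared template.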

  • Wenrui ZHU, Junqi YU, Tongtong WENG, Zhengwei SONG
    Article type: LETTER
    Article ID: 2025EDL8052
    Published: 2025
    [Advance online publication] Released: 2025/10/21
    Journal, Free access, Advance online publication

    As a downstream task of visual entity and relationship extraction, human-object interaction detection focuses on complex relationships centered around humans as the primary subject. This has significant potential for application in some labour-intensive industries such as construction engineering. However, the data in these contexts often display a long-tailed distribution, featuring numerous unknown entities and relationships that are not present in standard datasets. This phenomenon places considerable demands on the model's zero-shot learning capabilities. To tackle this challenge, this letter proposes an end-to-end human-object interaction detection method that utilizes domain knowledge graph embeddings as part of prior queries for the decoders. In the case study, this method achieved a mean Average Precision (mAP) of 48.57% for the Full types across various scenarios. Specifically, the Rare types achieved an mAP of 52.45%, while the Non-Rare types achieved an mAP of 41.67%.

  • Shiyu YANG, Yusheng GUO, Akihiro TABATA, Yoshiki HIGO
    Article type: PAPER
    Article ID: 2025EDP7092
    Published: 2025
    [Advance online publication] Released: 2025/10/21
    Journal, Free access, Advance online publication

    As one of the most widely used programming languages in modern software development, Python hosts a vast open-source codebase on GitHub, where code reuse is widespread. This study leverages open-source Python projects from GitHub and applies automated testing to discover pairs of functionally equivalent methods. We collected and processed methods from 5,100 Python repositories, but Python's lack of static type checking presented unique challenges for grouping these methods. To address this, we conducted detailed type inference and organized methods based on their inferred types, providing a structured foundation for subsequent analysis. We then employed automated test generation to produce unit tests for each method, running them against one another within their respective groups to identify candidate pairs that yielded identical outputs from the same inputs. Through manual verification, we ultimately identified 68 functionally equivalent method pairs and 683 functionally non-equivalent pairs. These pairs were compiled into a comprehensive dataset, serving as the basis for further examination. With this dataset, we not only evaluated the ability of large language models (LLMs) to recognize functional equivalence, evaluating both their accuracy and the challenges posed by diverse implementations, but also conducted a systematic performance evaluation of equivalent methods, measuring execution times and analyzing the underlying causes of efficiency differences. The findings demonstrate the potential of LLMs to identify functionally equivalent methods and highlight areas requiring further advancement.

  • Shaojing ZHAO, Songchen FU, Letian BAI, Hong LIANG, Qingwei ZHAO, Ta L ...
    Article type: PAPER
    Article ID: 2025EDP7099
    Published: 2025
    [Advance online publication] Released: 2025/10/21
    Journal, Free access, Advance online publication

    Multi-objective reinforcement learning (MORL) addresses sequential decision-making problems involving conflicting objectives. While most existing methods assume access to known or explicitly defined utility functions, many real-world tasks feature implicit, nonlinear utilities that are only available as delayed black-box feedback. To tackle this challenge, we propose Adaptive Multi-Objective Actor-Critic (AMOAC), a scalable framework that dynamically aligns policy optimization with implicit utility signals, without requiring prior knowledge of the utility function's form. AMOAC employs a multi-critic architecture to maintain computational efficiency as the number of objectives grows, and introduces a dynamic direction-aligned weighting mechanism to guide policy updates toward utility maximization. Experiments on benchmark environments—including Deep Sea Treasure, Minecart, and Four Room—demonstrate that AMOAC consistently matches or exceeds the performance of baselines with explicit utility access, achieving robust adaptation and convergence under both linear and nonlinear utility scenarios. These results highlight the potential of dynamic weight adjustment in MORL for handling implicit preference structures and limited feedback settings.

  • Shibo ZHANG, Hongchang CHEN, Shuxin LIU, Ran LI, Junjie ZHANG, Yingle ...
    Article type: LETTER
    Article ID: 2025EDL8040
    Published: 2025
    [Advance online publication] Released: 2025/10/16
    Journal, Free access, Advance online publication

    The proliferation of fake accounts in social networks has prompted growing attention to the development of effective detection techniques for ensuring cyberspace security. These fake accounts frequently employ sophisticated camouflage strategies to evade detection, which compromises the reliability of local neighborhood information. We propose GRFA, a novel approach for fake account detection that incorporates similarity-based adaptive graph reconstruction. The framework introduces a reinforcement learning-based adaptive mechanism to construct similarity edges, which dynamically refines the graph structure to better capture global dependencies. These refined structures are then incorporated into a heterogeneous graph neural network with dual aggregation, significantly improving the detection of camouflaged accounts. Experimental results demonstrate that GRFA outperforms state-of-the-art methods across multiple real-world datasets.

  • Koki SUGIOKA, Sayaka KAMEI, Yasuhiko MORIMOTO
    Article type: PAPER
    Article ID: 2025EDP7055
    Published: 2025
    [Advance online publication] Released: 2025/10/15
    Journal, free access, advance online publication

    Recently, websites that enable users to share and search for cooking recipes have gained popularity. Each recipe typically includes various pieces of information, including a title, a list of ingredients, and detailed steps described in text and illustrated with photos. The estimated cooking time for each recipe is another valuable piece of information when selecting a recipe. However, it can be difficult to accurately determine cooking time because it depends on various factors, such as heat level, ingredient quantity, and cooking skill level. Therefore, some recipes do not include information on cooking time. In this study, we consider the prediction of cooking time in general scenarios based on a list of ingredients and a textual description of each recipe's cooking process using BERT, a natural language processing model. To this end, we propose an additional pre-training method that assigns greater weight to words related to cooking time using a cooking ontology. Our experimental results show that our method outperforms a fine-tuned BERT model with additional pre-training using a commonly employed approach. Notably, words representing “Kitchen Tools” are particularly associated with cooking time.

  • Hongcui WANG, Li MA, Zezhong LI, Fuji REN
    Article type: LETTER
    Article ID: 2025EDL8032
    Published: 2025
    [Advance online publication] Released: 2025/10/07
    Journal, free access, advance online publication

    Ellipse detection plays a critical role in fields such as medical diagnosis, environmental monitoring, and industrial automation. However, traditional methods (e.g., Hough transform, least-squares fitting, and edge-following techniques) suffer from high computational complexity and poor noise robustness. To address these limitations, we propose a hybrid framework that integrates deep learning with geometric constraints. First, Faster RCNN is employed to localize axis-aligned bounding boxes (AABBs) of ellipses. Then, a point-pair filtering strategy extracts edge points satisfying predefined geometric constraints, followed by weighted least-squares fitting to estimate ellipse parameters. Compared with traditional approaches, our method directly identifies AABBs, significantly enhancing both the efficiency and accuracy of multi-target ellipse detection in practice. Experiments are conducted on two synthetic datasets. The results show that our proposed method achieves superior precision and F-measure compared to conventional ellipse detection algorithms.
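
    The final fitting step described above can be illustrated with a minimal sketch of a weighted least-squares conic fit. This is not the authors' implementation: the abstract does not specify the point-pair filter or the weighting scheme, so the weights, the function names `solve_linear` and `fit_conic_wls`, and the normalization A x^2 + B xy + C y^2 + D x + E y = 1 are assumptions made for illustration. The algebraic fit below also does not enforce the ellipse condition B^2 - 4AC < 0.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point set")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_conic_wls(points, weights):
    """Weighted least-squares fit of A x^2 + B xy + C y^2 + D x + E y = 1.

    Builds and solves the normal equations (M^T W M) c = M^T W 1,
    where each row of M is [x^2, xy, y^2, x, y].
    The weights are hypothetical per-point confidences.
    """
    n = 5
    mtwm = [[0.0] * n for _ in range(n)]
    mtw1 = [0.0] * n
    for (x, y), w in zip(points, weights):
        row = [x * x, x * y, y * y, x, y]
        for i in range(n):
            mtw1[i] += w * row[i]
            for j in range(n):
                mtwm[i][j] += w * row[i] * row[j]
    return solve_linear(mtwm, mtw1)


# Edge points sampled from an ellipse centred at (1, -1) with semi-axes 3 and 2.
angles = [2 * math.pi * k / 20 for k in range(20)]
points = [(1 + 3 * math.cos(t), -1 + 2 * math.sin(t)) for t in angles]
coeffs = fit_conic_wls(points, [1.0] * len(points))
```

    With noise-free points the recovered coefficients match the generating ellipse; in practice the weights would down-weight edge points that are less reliable.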

  • Lenz NERIT, Youmei FAN, Benson MIROU, Kenichi MATSUMOTO, Raula GAIKOVI ...
    Article type: PAPER
    Article ID: 2025MPP0001
    Published: 2025
    [Advance online publication] Released: 2025/10/07
    Journal, free access, advance online publication

    Developers rely on third-party open-source libraries to save time and reuse well-tested code. As technology stacks diversify, libraries are deployed across multiple ecosystems to reach broader audiences and accommodate different user needs. However, maintainers may hesitate due to concerns about increased maintenance effort and uncertain adoption outcomes. This study investigates the impact of cross-ecosystem deployments on maintenance effort and project adoption. Analyzing 972,592 NPM and PyPI packages, we focused on 420 actively maintained libraries that exist in both ecosystems. Of these, 184 were initially deployed to NPM, 148 to PyPI, and 88 were synchronized releases. We collected GitHub metrics (issues, pull requests, contributors, forks, and commits) over a three-month period before and after deployment. Results show that 80-85% of packages saw no major maintenance activity. However, synchronized releases led to a 15.91% rise in issue reporting and an 11.49% increase in pull requests (PyPI → NPM), indicating higher initial maintenance effort. Popularity remained stable for 87% of packages, though synchronized releases saw an 11.36% increase in forks. While contributions increased in some cases (e.g., 13.59% in NPM → PyPI), others saw a decline in commit activity. Overall, cross-ecosystem deployment does not significantly raise maintenance effort but also does not guarantee increased adoption. Our results offer insights into how deploying to multiple ecosystems may bring some benefits.

  • Yueyi YANG, Jinxia WEN, Haiquan WANG, Xiangzhou BU, Yabo HU
    Article type: PAPER
    Article ID: 2025EDP7156
    Published: 2025
    [Advance online publication] Released: 2025/10/03
    Journal, free access, advance online publication

    Human activity recognition (HAR) is necessary for detecting unsafe activity in industrial production, but open issues remain, such as limited data in different scenarios and the lack of a unified model for different situations. Therefore, a novel meta-federated learning framework with distillation of activation boundaries (AB) is proposed, in which a federation is viewed as a meta-distribution and all federations work together without a central server. Specifically, the personalized model from the previous federation serves as the teacher model for the next federation, where general knowledge is extracted by AB knowledge distillation, personalized knowledge is acquired through local training, and a high-quality model is obtained for the current federation by dynamically fusing general knowledge and personalized knowledge. To evaluate the effectiveness and superiority of the proposed framework, experiments were conducted on one popular HAR dataset (PAMAP2) and a chemical scenario dataset (WACID) constructed by our laboratory. The experimental results show that our proposed framework outperforms the state-of-the-art methods at lower communication cost, achieving recognition accuracies of 91.23% and 95.66% on the PAMAP2 and WACID datasets, respectively.

  • Bubai MANNA, Cam LY NGUYEN, Bodhayan ROY, Vorapong SUPPAKITPAISARN
    Article type: PAPER
    Article ID: 2025FCP0007
    Published: 2025
    [Advance online publication] Released: 2025/10/03
    Journal, free access, advance online publication

    Several recent works focus on monitoring the air quality of critical areas using sensors attached to buses. They aim to monitor the maximum number of critical areas using a limited number of sensors. In practice, we may want information for all critical areas. In this work, we study the problem of covering all the areas using the minimum number of sensors. We show that, even when the bus routes are not pre-defined, the problem is log-APX-hard and is significantly harder than the problem considered in previous works. We develop two algorithms for the case in which the routes are pre-defined: a fixed-parameter tractable algorithm and a greedy algorithm. Next, we show an NP-completeness reduction for a special case of the problem and propose a 2-approximation algorithm for it. Our experimental results show that, although our algorithms usually use a similar number of sensors to the algorithm in previous works, they have a shorter computation time than the classical greedy algorithm.
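
    For readers unfamiliar with the baseline, the classical greedy algorithm mentioned above can be sketched as a textbook greedy set cover. This is not the authors' algorithm: the function name `greedy_cover` and the simplified model, in which each pre-defined route is reduced to the set of critical areas a sensor on that route would cover, are assumptions made for illustration.

```python
def greedy_cover(areas, routes):
    """Classical greedy set cover.

    areas  -- set of critical-area ids that must all be covered
    routes -- dict mapping a route id to the set of areas a sensor
              on that route would cover (hypothetical simplified model)
    Returns the list of chosen route ids, in the order they were picked.
    """
    uncovered = set(areas)
    chosen = []
    while uncovered:
        # Pick the route that covers the most still-uncovered areas.
        best = max(routes, key=lambda r: len(routes[r] & uncovered))
        gained = routes[best] & uncovered
        if not gained:
            raise ValueError("some areas cannot be covered by any route")
        chosen.append(best)
        uncovered -= gained
    return chosen


routes = {"A": {1, 2, 3}, "B": {3, 4}, "C": {4, 5}, "D": {1}}
plan = greedy_cover({1, 2, 3, 4, 5}, routes)
```

    Here the greedy rule picks route A first (three new areas) and then route C (two new areas), so two sensors suffice for this toy instance.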

  • Eunmin KIM, SungYoun JEONG, Jiwon SEO
    Article type: LETTER
    Article ID: 2025EDL8037
    Published: 2025
    [Advance online publication] Released: 2025/10/01
    Journal, free access, advance online publication

    The composition model in ROS 2 enables multiple nodes to run within a single process, reducing the overhead of inter-process communication (IPC). However, this architecture introduces memory safety and concurrency challenges due to a shared address space and a common Executor. Existing tracing tools lack the granularity to detect node-level runtime anomalies in such settings. We present COMPSHIELD, a system that extends ROS2Trace to enable intra-process node-level tracing and misbehavior detection. COMPSHIELD combines static analysis with enhanced runtime tracing to identify temporal anomalies and concurrency-related performance issues, such as Executor monopolization and prolonged callback execution. Evaluation on a ROS 2 composition application shows that COMPSHIELD effectively detects such issues with low overhead.
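
    The two anomaly types named above, prolonged callback execution and Executor monopolization, can be illustrated with a small offline check over callback timing records. This is only a sketch and not COMPSHIELD or the ROS2Trace API: the `(node, start_s, end_s)` record format, the function name `find_anomalies`, and both thresholds are hypothetical.

```python
from collections import defaultdict


def find_anomalies(events, max_callback_s=0.05, max_share=0.8):
    """Flag slow callbacks and nodes that dominate a shared Executor.

    events -- iterable of (node, start_s, end_s) callback executions
              recorded on one Executor (hypothetical record format)
    Returns (slow, hogs):
      slow -- list of (node, duration) for callbacks over max_callback_s
      hogs -- list of nodes whose share of total busy time exceeds max_share
    """
    busy = defaultdict(float)
    slow = []
    for node, start, end in events:
        duration = end - start
        busy[node] += duration
        if duration > max_callback_s:
            slow.append((node, duration))
    total = sum(busy.values())
    hogs = [n for n, t in busy.items() if total > 0 and t / total > max_share]
    return slow, hogs


events = [
    ("camera", 0.00, 0.90),   # one very long callback
    ("planner", 0.90, 0.92),
    ("camera", 0.92, 0.95),
]
slow, hogs = find_anomalies(events)
```

    In this toy trace the camera node is reported both for a prolonged callback and for taking most of the Executor's busy time.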

  • Zhengran HE, Mengyao XU, Kaifei ZHANG, Feng ZHOU, Cuangao TANG, Yuan Z ...
    Article type: LETTER
    Article ID: 2025EDL8030
    Published: 2025
    [Advance online publication] Released: 2025/09/29
    Journal, free access, advance online publication

    Unlike conventional speech-based depression detection (SDD), cross-elicitation SDD presents a more challenging task due to the differing speech elicitation conditions between the labeled source (training) and unlabeled target (testing) speech data. In such scenarios, a significant feature distribution gap may exist between the source and target speech samples, potentially reducing the detection performance of most existing SDD methods. To address this issue, we propose a novel deep transfer learning method called the Deep Elicitation-Adapted Neural Network (DEANN) in this letter. DEANN aims to directly learn both depression-discriminative and elicitation-invariant features from speech spectrograms corresponding to different elicitation conditions using two weight-shared Convolutional Neural Networks (CNNs). To achieve this, the CNNs are first endowed with depression-discriminative capability by establishing a relationship between the source speech samples and the provided depression labels. Subsequently, a well-designed constraint mechanism, termed Bidirectional Sparse Reconstruction, is introduced. This mechanism ensures that source and target speech samples can be sparsely reconstructed by each other at the same feature layer of both CNNs, allowing the learned features to maintain adaptability to changes in speech elicitation conditions while preserving their original depression-discriminative capability. To evaluate DEANN, we conduct extensive cross-elicitation SDD experiments on the MODMA dataset. The experimental results demonstrate the effectiveness and superiority of the proposed DEANN in addressing the challenge of cross-elicitation SDD compared to many existing state-of-the-art transfer learning methods.
