A family of sets is a collection where each element is a set, enabling the representation of many practical concepts. Various operations on families of sets are widely applied in fields such as databases and data mining. Since the size of set families in these applications often becomes exponentially large, we need sophisticated algorithms to manipulate them. Zero-suppressed decision diagrams (ZDDs) efficiently represent families of sets using directed acyclic graphs, supporting various operations known as family algebra. However, designing efficient algorithms for ZDDs demands expertise and is costly, underscoring the need for more accessible design methods. This paper introduces an algorithm template that extends ZDD-based family algebra. New operations can be designed easily by supplying component functions to the template. The template is a natural generalization of existing operations, reproducing them without loss of efficiency. Additionally, it enables the generation of previously impractical ZDD operations without deep knowledge of ZDDs. This paper also presents concrete examples of new operations.
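To make "family algebra" concrete, the following minimal sketch shows two standard operations on families of sets, join and meet, using explicit Python sets. This is not the paper's template and does not use ZDDs: the helper names `family`, `join`, and `meet` are assumptions for illustration, and explicit enumeration like this is exactly what becomes infeasible when families grow exponentially.

```python
from itertools import product


def family(*sets):
    """Build a family of sets (a set of frozensets) from iterables of elements."""
    return {frozenset(s) for s in sets}


def join(f, g):
    """Join: every pairwise union a | b with a in f and b in g."""
    return {a | b for a, b in product(f, g)}


def meet(f, g):
    """Meet: every pairwise intersection a & b with a in f and b in g."""
    return {a & b for a, b in product(f, g)}


F = family({1}, {2, 3})
G = family({1, 2}, {3})

print(join(F, G))  # the four pairwise unions
print(meet(F, G))  # the four pairwise intersections, including the empty set
```

Both functions visit every pair of member sets, so their cost grows with the product of the family sizes. A ZDD-based implementation would instead recurse over the shared graph structure.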
The physiological states of software developers can impact their work performance. Previous research has indicated that physiological signals, such as heart rate, can be used to predict developers’ work performance during tasks. However, conventional methods rely only on heart rate measured during tasks (peri-task), making it difficult to predict and proactively prevent poor work performance prior to beginning tasks. This study aims to enhance predictability by investigating whether heart rate measured before tasks (pre-task) is a valuable resource for predicting work performance. We conducted program comprehension tasks as the primary software development tasks and analyzed pre- and peri-task frequency-domain heart rate variability (HRV) metrics for various timeframes. As a result, we obtained two key findings: 1) Combining pre- and peri-task HRV metrics improved work performance prediction during tasks. 2) Work performance prediction using pre-task HRV metrics achieved comparable estimation performance to that using peri-task HRV metrics for predictions before tasks. These results suggest that, in addition to improving the conventional approaches, pre-task heart rate could also be used to establish a more proactive approach to reducing the risk of performance decline caused by fatigue or stress.
In typical Internet of Things (IoT) scenarios, devices with sensors and actuators connect to servers on cloud platforms over the Internet. To maintain the security of the whole system, the devices and servers need to be configured to securely communicate with each other. This configuration process is called onboarding. As more IoT devices are deployed, the cost and time of onboarding become overwhelming. To solve this problem, we propose a semi-automated onboarding framework for IoT devices. Unlike other frameworks such as FIDO Device Onboard, the framework we developed does not require pre-registered device ownership. This simplifies the system because it places no requirements on the device supply chain. To determine the device owner, our framework uses the OAuth 2.0 Device Authorization Grant. We evaluated the time needed to onboard devices in an experiment where human operators onboarded five devices with a prototype of the framework. The results indicated that the proposed framework was sufficiently fast for small-scale applications. We analyzed the security aspects of our framework based on the specifications and drafts of the OAuth 2.0 framework. We also analyzed an alternative method for the Device Authorization Grant that uses FIDO2 standards. Based on the analysis, we evaluated the trade-off between security, simplicity, flexibility, and efficiency of the proposed onboarding framework.
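For readers unfamiliar with the OAuth 2.0 Device Authorization Grant (RFC 8628), the device-side polling step can be sketched as follows. This is a generic illustration of the grant, not the framework's implementation: `poll_for_token` and the injected `request_token` callable (a stand-in for an HTTP POST to the token endpoint) are hypothetical names, and the earlier step in which the device obtains its device code and the user code is omitted.

```python
import time


def poll_for_token(request_token, client_id, device_code,
                   interval=5, max_attempts=20, sleep=time.sleep):
    """Poll the token endpoint until the user approves, denies, or the code expires.

    request_token is a hypothetical callable that sends the given form
    parameters to the token endpoint and returns the parsed JSON as a dict.
    """
    for _ in range(max_attempts):
        response = request_token({
            "grant_type": "urn:ietf:params:oauth:grant-type:device_code",
            "device_code": device_code,
            "client_id": client_id,
        })
        if "access_token" in response:
            return response["access_token"]
        error = response.get("error")
        if error == "authorization_pending":
            pass  # the user has not finished approving yet; keep polling
        elif error == "slow_down":
            interval += 5  # RFC 8628: lengthen the polling interval by 5 seconds
        else:
            # access_denied, expired_token, or any unexpected error
            raise RuntimeError(f"device authorization failed: {error}")
        sleep(interval)
    raise TimeoutError("gave up waiting for user approval")
```

Injecting `request_token` and `sleep` keeps the sketch testable without a network. A real device would replace them with an HTTPS client and the default `time.sleep`.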
Automatic modulation recognition (AMR) plays a significant role in communication systems. Traditional AMR algorithms predominantly rely on either time-domain or frequency-domain signal information. However, relying solely on a single-domain analysis fails to capture the full range of the signal’s time-varying and spectral characteristics, leading to inadequate representation of its multi-dimensional features. In this paper, we propose a novel modulation recognition architecture named the Time-Frequency Multi-Modal Neural Network (TFMMN). This architecture integrates a traditional convolutional neural network (CNN) within a Multi-channel Feature Extraction Module (MFEM) and incorporates a residual multi-head self-attention (SA) mechanism to process signals across multiple modalities. By preprocessing the I/Q signals, we obtain amplitude and phase (A/P) signals with distinct characteristics and Fast Fourier Transform (FFT) signals. A multi-branch structure is built on the feature signals of these three modalities, and a multi-channel structure is used for complementary feature enhancement. We conducted experiments on the public dataset RadioML2016.10A, and the results show that our algorithm outperforms existing recognition algorithms in terms of classification accuracy. Specifically, for the challenging classification between 16QAM and 64QAM, the average classification accuracy of both modulation types exceeds 90% at a signal-to-noise ratio (SNR) of 0 dB.
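The preprocessing step that turns I/Q samples into amplitude/phase and frequency-domain views can be sketched in a few lines. This is an illustrative sketch only: the function names are assumptions, any normalization used in the paper is not reproduced, and a naive O(n^2) DFT stands in for an FFT routine to keep the example dependency-free.

```python
import cmath
import math


def iq_to_amplitude_phase(i_samples, q_samples):
    """Convert I/Q samples to per-sample amplitude and phase (radians)."""
    z = [complex(i, q) for i, q in zip(i_samples, q_samples)]
    amplitude = [abs(s) for s in z]
    phase = [cmath.phase(s) for s in z]
    return amplitude, phase


def iq_to_spectrum(i_samples, q_samples):
    """Naive discrete Fourier transform of the complex baseband signal I + jQ."""
    z = [complex(i, q) for i, q in zip(i_samples, q_samples)]
    n = len(z)
    return [
        sum(z[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


# A complex tone that completes one cycle over four samples.
i_samples = [1, 0, -1, 0]
q_samples = [0, 1, 0, -1]
amplitude, phase = iq_to_amplitude_phase(i_samples, q_samples)
spectrum = iq_to_spectrum(i_samples, q_samples)
```

For this tone the amplitude is constant while all spectral energy falls in a single frequency bin, which illustrates why the three views carry complementary information.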
Graph neural networks have attracted widespread attention due to their powerful learning ability for graph-structured data, and are often used to solve node classification tasks on graphs. However, the vast majority of models focus on the relationships between nodes and ignore the structural information of edges, resulting in insufficient extraction of graph structural features. In this paper, we propose the Graph Mapping Relation-Aware Twin Neural Network (GMR-TNN). The model treats a graph and its line graph as twins to deepen learning: the graph guides representation learning on the line graph, and the line graph augments the structural representation of the graph. A twin graph mapping mines the structural relationship between the graph and its line graph so that the two can be combined effectively, ensuring that their node features are accurately embedded in the low-dimensional space and providing a richer representation for the final node classification task. In experimental comparisons on five datasets, including Cora and Chameleon, GMR-TNN achieves better node classification results, which validates that it makes full and effective use of graph structure information.
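The line graph used as the "twin" turns every edge of the original graph into a node, and connects two such nodes whenever the corresponding edges share an endpoint. A minimal sketch of that construction for an undirected graph follows. The function name `line_graph` is an assumption, and the sketch covers only the graph transformation, not the GMR-TNN model.

```python
from itertools import combinations


def line_graph(edges):
    """Return the edge set of the line graph of an undirected graph.

    Each node of the line graph is an edge (u, v) of the original graph,
    stored with u <= v. Two nodes are adjacent when the edges share an endpoint.
    """
    nodes = sorted({tuple(sorted(e)) for e in edges})
    return {(a, b) for a, b in combinations(nodes, 2) if set(a) & set(b)}


# A path 1 - 2 - 3 - 4 has three edges, and consecutive edges share a node.
path_edges = [(1, 2), (2, 3), (3, 4)]
print(line_graph(path_edges))
```

Because edges become first-class nodes, any node-level message-passing scheme applied to the line graph operates on edge-level structure of the original graph.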
Generative AI (GenAI) is increasingly being integrated into creative work, either as a collaborator or as a replacement for human creators. Most previous work has focused on augmenting users’ creativity in the context of individual-GenAI collaboration. Humans often engage in group creative work across countless real-world contexts, yet the effects of GenAI on such group creativity remain largely unexplored, an urgent gap that demands immediate research attention. As a first step toward addressing this gap, we conducted an electronic brainstorming experiment with three conditions in a within-subjects design: (A) groups of three participants without GenAI, (B) groups of three participants with GenAI, and (C) individual participants with GenAI (N = 24). GenAI-assisted group brainstorming significantly reduced the number of human-generated ideas and did not significantly change their quality compared to brainstorming without GenAI. Plausible explanations for these results are that reliance on GenAI is further increased in a group setting, and that social loafing is more likely to occur. Therefore, we found that simply incorporating a GenAI agent does not necessarily lead to more effective human-GenAI co-creation in groups. On the other hand, compared to individual use of GenAI, originality, elaboration, and flexibility improved significantly, so GenAI-assisted group brainstorming may have useful aspects. Based on our findings, we discuss design implications for strategies that leverage GenAI effectively, future ideation methods, and creativity support systems. In particular, we suggest two interventions: 1) interactive idea generation, where humans and GenAI take turns combining and improving each other’s ideas, and 2) reducing over-reliance on GenAI. Our paper contributes to this domain by investigating the effects of human-GenAI collaboration in groups on brainstorming and providing design implications for more effective co-creation.
The present study systematically examined the effectiveness of the Idea-Marathon System (IMS) as a creativity training method using the S-A Creativity Test, which measures both divergent thinking traits and Creative Activity Areas. Previous research has focused mainly on divergent thinking; however, less is known about whether training effects extend to applied, context-sensitive domains. To address this, a quasi-experimental design was implemented with first-year undergraduates at A University (training group: n = 51; control group: n = 36). Over a 15-week intervention, the training group engaged in daily idea generation following IMS, while the control group received no training. Although statistical significance was not achieved, IMS showed tendencies toward improvements in productive improvement (Tb), imaginative speculation (Tc), fluency (F), flexibility (X), and elaboration (E), while originality (O) appeared to be maintained rather than enhanced. No effect was found for practical application (Ta). These findings suggest that IMS may provide a sustained and multi-contextual approach to creativity training, while also indicating that its potential benefits could depend on task-specific cognitive demands.
Hyperspectral image (HSI) denoising is a critical issue in the field of remote sensing. The combination of low-rank matrix/tensor factorization (LRMF/LRTF) and total variation (TV) regularization can achieve excellent denoising results at a relatively low computational cost. However, such methods typically adopt the first-order TV norm to linearly penalize gradients, leading to staircase artifacts and excessive smoothing of the texture details. Furthermore, most methods fail to model the pixel-wise variation differences, resulting in contrast loss of important structures. To address these issues, this paper presents a novel LRMF-based method named Representative Coefficient Weighted Fractional-Order TV (RCWFOTV). First, exploiting the global low-dimensional structure, denoising is formulated exclusively on the representative coefficient U. Then, we replace the first-order TV with Grünwald-Letnikov fractional-order TV (G-L FOTV) to model the local smoothness (LS) prior of U. By incorporating more neighborhood information, G-L FOTV nonlinearly retains the low-frequency components and enhances the high-frequency ones, thereby avoiding the staircase artifacts and loss of detail. Finally, a weighted scheme is introduced to adaptively sparsify the gradient map of U, maintaining important texture structures. Extensive experiments on both synthetic and real noisy HSIs demonstrate that the proposed method outperforms other state-of-the-art methods in terms of both denoising performance and speed.
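To show how a Grünwald-Letnikov fractional-order difference draws on more neighbors than a first-order difference, the sketch below computes the standard G-L coefficients w_k = (-1)^k C(alpha, k) and applies them as a one-dimensional backward difference. The function names, the truncation length `num_terms`, and the zero boundary handling are assumptions for illustration. The discretization actually used in RCWFOTV is not given in the abstract.

```python
def gl_coefficients(alpha, num_terms):
    """G-L coefficients w_k = (-1)^k * C(alpha, k) for k = 0 .. num_terms - 1.

    Uses the recurrence w_0 = 1, w_k = w_{k-1} * (1 - (alpha + 1) / k).
    """
    coeffs = [1.0]
    for k in range(1, num_terms):
        coeffs.append(coeffs[-1] * (1.0 - (alpha + 1.0) / k))
    return coeffs


def gl_fractional_difference(signal, alpha, num_terms):
    """Truncated backward G-L difference: sum over k of w_k * signal[i - k].

    Samples before the start of the signal are treated as zero.
    """
    w = gl_coefficients(alpha, num_terms)
    return [
        sum(w[k] * signal[i - k] for k in range(min(num_terms, i + 1)))
        for i in range(len(signal))
    ]


print(gl_coefficients(1.0, 3))  # alpha = 1 recovers the first-order difference
print(gl_coefficients(0.5, 3))  # fractional alpha gives a longer, decaying stencil
```

With alpha = 1 only the first two coefficients are nonzero, so the operator reduces to the ordinary first-order difference. For non-integer alpha every earlier sample contributes with a decaying weight.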
Prompt learning automates the manual crafting of prompts for adapting vision-and-language models to downstream tasks, particularly in few-shot scenarios. This paper addresses two key challenges in prompt learning: limited performance in one-shot settings and inefficient dataset construction from unlabeled data. To tackle these challenges, we visualize and compare CLIP’s feature spaces after prompt learning under one-shot and 16-shot conditions, identifying necessary characteristics of feature spaces that yield better prompts. We propose two novel loss functions, Inclusive Loss and Exclusive Loss, that enhance accuracy in one-shot scenarios by encouraging the feature space to resemble those trained with sufficient data. Additionally, we investigate the distribution of image features within CLIP’s feature space and introduce a sampling method called Cluster-Centroid Sampling (CCS). CCS constructs a more category-balanced dataset by selecting samples closest to cluster centroids. To validate our approaches, we conducted extensive experiments. First, we demonstrate the effectiveness of our proposed loss functions across multiple datasets, showing accuracy improvements in one-shot conditions. Second, we evaluate CCS using an unlabeled data pool, confirming its superiority over existing sampling methods in downstream task accuracy due to the construction of a more balanced dataset.
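The selection rule behind Cluster-Centroid Sampling, picking the samples closest to cluster centroids, can be sketched as below. This covers only the final selection step under stated assumptions: the centroids are taken as given (the clustering algorithm and the number of samples per cluster are not specified in the abstract), Euclidean distance is assumed, and the name `nearest_to_centroids` is hypothetical.

```python
import math


def nearest_to_centroids(features, centroids):
    """For each centroid, return the index of the closest feature vector.

    Assumes Euclidean distance. The same index may be returned for two
    centroids if one sample is nearest to both.
    """
    selected = []
    for centroid in centroids:
        best = min(range(len(features)),
                   key=lambda i: math.dist(features[i], centroid))
        selected.append(best)
    return selected


# Two well-separated groups of 2-D features and one centroid per group.
features = [(0.0, 0.0), (1.0, 0.0), (10.0, 10.0), (9.0, 10.0)]
centroids = [(0.4, 0.0), (9.9, 10.0)]
print(nearest_to_centroids(features, centroids))
```

Drawing one sample per centroid spreads the selection across clusters, which is the intuition behind the category balance reported for CCS.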
We propose a novel training strategy for action quality assessment (AQA) that is designed to specifically assess action quality while ignoring scene context, which is unrelated to the action. Recent AQA models typically utilize three-dimensional (3D) convolutions to extract spatiotemporal features from videos. However, since these models are not explicitly designed to extract features relevant to the action, they may inadvertently extract scene context. To address this issue, we propose a training strategy that uses human-masked videos in which the action is masked, and trains the model to predict a fixed score of zero for these inputs. This strategy encourages the model to ignore scene context by making the score correlation between AQA model outputs and human judges undefinable when the action is not visible. Experimental results on two widely used AQA datasets show that our strategy improves AQA performance and makes the model effectively ignore scene context. We further investigate how the design of human-masked videos, specifically the shape and color of the mask, affects the model’s ability to ignore scene context.
Large Language Models (LLMs) are increasingly used for the critical task of generating AI risk scenarios, yet practitioners lack empirical guidance on model selection. This study addresses that gap through a case study benchmarking 23 LLMs against a real-world AI system to analyze their underlying reasoning patterns. We introduce a novel “Hit Rate” metric based on actual incidents to quantitatively measure performance. The results suggest significant, statistically verified performance disparities among models and show that this gap is uncorrelated with superficial linguistic fluency. Instead, the performance gap appears to be strongly linked to the model’s underlying reasoning pattern, which leaves an unmistakable qualitative signature on the final outputs. A “Systematic Top-Down” approach, which mirrors expert human analysis, consistently produces specific and actionable scenarios, while less structured methods yield generic or contextually flawed warnings. These findings serve as a strong caution against model-agnosticism, establishing that an LLM’s reasoning process, as suggested by the specificity and actionability of its outputs, is a critical factor for its efficacy in safety-critical tasks.
Accurately controlling the output length of large language models (LLMs) remains a non-trivial challenge, with many existing approaches exhibiting limited reliability or incurring additional architectural and inference-time costs. Failure to adhere to user-specified length constraints in real-world applications, such as news summarization and dialog systems, significantly degrades system reliability. This paper addresses this gap by applying Group Relative Policy Optimization (GRPO)—a stable, value-function-free reinforcement learning algorithm—to efficiently fine-tune LLMs for prompt-based length control without any architectural modification. We systematically compare four reward functions: a simple binary threshold (BLTR), a linear deviation penalty (PLR), and two novel proximity-aware variants with linear (LLPR) and exponential (ELPR) decay, designed to incentivize not just constraint satisfaction but also proximity to the target length. Experiments on CNNDM (English) and XL-Sum (Japanese) datasets with 1-billion-parameter models show that our GRPO-based approach dramatically improves length adherence. On Llama-3.2-1B-Instruct, the saturating PLR reward achieved the highest binary adherence (BLTR: 0.705), but our proximity-aware ELPR achieved strong adherence (0.612) while dramatically improving target proximity (LLPR score: -24.994 to -2.293). Notably, on Gemma-3-1b-it, ELPR consistently outperformed PLR on all metrics. Our analysis suggests that ELPR offers a strong balance of stability and performance. The results indicate that continuous, proximity-aware rewards may be more effective than simple binary signals for achieving robust and practical length control, highlighting a promising direction for future reward design.
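The interaction between a length reward and GRPO's group-relative advantage can be sketched as follows. The reward shapes here are illustrative assumptions, not the paper's definitions of BLTR, PLR, LLPR, or ELPR, whose exact formulas are not given in the abstract. The parameters `tolerance` and `scale` are likewise hypothetical. Only the advantage step, which standardizes each sampled response's reward against its group, reflects the general GRPO recipe.

```python
import math
from statistics import mean, pstdev


def binary_reward(length, target, tolerance):
    """Hypothetical binary reward: 1 inside the tolerance band, else 0."""
    return 1.0 if abs(length - target) <= tolerance else 0.0


def exponential_proximity_reward(length, target, scale):
    """Hypothetical proximity reward that decays exponentially with deviation."""
    return math.exp(-abs(length - target) / scale)


def group_relative_advantages(rewards, eps=1e-8):
    """Standardize rewards within one group of sampled responses.

    No learned value function is needed: the group mean acts as the baseline.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Lengths of three sampled responses to one prompt with a target of 50 tokens.
lengths = [50, 60, 100]
binary = [binary_reward(n, 50, tolerance=5) for n in lengths]
proximity = [exponential_proximity_reward(n, 50, scale=10) for n in lengths]
print(group_relative_advantages(binary))
print(group_relative_advantages(proximity))
```

Under the binary reward, the responses of length 60 and 100 receive the same advantage, whereas the proximity reward still ranks 60 above 100. This illustrates why a continuous signal can carry more information about closeness to the target.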
The automatic surface quality inspection of shock absorber connecting rods is crucial for ensuring vehicle safety and performance. This paper proposes an enhanced PatchCore algorithm for unsupervised anomaly detection. It adopts a multi-level feature processing and fusion strategy, built from a hierarchical processing module (HPM) and an adaptive feature fusion module (AFFM), to capture multi-scale anomalies, and uses an adaptive greedy coreset sampling method to improve local density estimation for subtle defect detection. The ablation study shows that our enhanced feature extraction framework improves spatial-level performance, while the optimized sampling strategy enhances the accuracy of small anomaly detection. Experiments show that our method has superior performance in anomaly detection for shock absorber connecting rods.
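As background for the sampling component, the sketch below shows plain greedy coreset selection (farthest-point, or k-center, selection), the general idea on which PatchCore-style memory banks are built. It is not the adaptive variant proposed in the paper, whose details are not given in the abstract. The function name `greedy_coreset`, the fixed `start` index, and the use of Euclidean distance are assumptions.

```python
import math


def greedy_coreset(points, size, start=0):
    """Select indices by repeatedly taking the point farthest from the current selection."""
    selected = [start]
    # Distance from every point to its nearest selected point so far.
    min_dist = [math.dist(p, points[start]) for p in points]
    while len(selected) < min(size, len(points)):
        nxt = max(range(len(points)), key=lambda i: min_dist[i])
        selected.append(nxt)
        min_dist = [min(d, math.dist(p, points[nxt]))
                    for d, p in zip(min_dist, points)]
    return selected


# Four patch features on a line; the selection spreads out to cover them.
patches = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (5.0, 0.0)]
print(greedy_coreset(patches, size=3))
```

Each step adds the feature that is currently worst covered, so a small subset approximates the spread of the full feature set.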