-
Melvin Charles Ortua DY
Session ID: 3K6-IS-2c-03
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
Dynamic ads that respond to search inputs and automatically combine text assets to maximize performance are now commonplace. As with traditional ad creation, automated generation of text assets can benefit greatly from some foreknowledge of how outputs might perform. This paper describes the development of such a performance prediction model, including an application of Kolmogorov-Arnold Networks; the best model overall achieved a Spearman's rank correlation coefficient of 0.41 on the validation dataset using asset texts alone.
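For readers unfamiliar with the reported metric, the following is a minimal sketch of how Spearman's rank correlation coefficient can be computed: it is the Pearson correlation of the two rank vectors, with tied values sharing their average rank. The function names are illustrative and are not taken from the paper.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(predicted, observed):
    """Spearman's rho: Pearson correlation computed on the rank vectors."""
    return pearson(average_ranks(predicted), average_ranks(observed))
```

Because only ranks matter, the coefficient rewards a model for ordering assets correctly even when its predicted scores are on a different scale from the observed performance.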
-
Jianshi WANG, Yukio OHSAWA
Session ID: 3K6-IS-2c-04
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
This paper introduces the Semantic Logic Field (SLF), a novel framework for semantic modeling and dimensionality reduction. Inspired by field theory, SLF integrates topological principles with dynamic semantic relationships to bridge the gap between discrete semantic features and continuous transformations. The objective is to address the limitations of traditional methods (e.g., PCA, t-SNE, UMAP) in preserving global semantic structures during dimensionality reduction. SLF quantifies logical features of text, such as topic consistency, semantic coherence, and concept relevance, by modeling semantics as a dynamic field. Experiments on the 20 Newsgroups dataset demonstrate SLF's superior performance in clustering and dimensionality reduction. SLF achieves a Silhouette score of 0.9797 and distance preservation of 0.9994, outperforming traditional methods. The results show that SLF effectively preserves both local and global semantic structures, making it ideal for tasks like text clustering, semantic search, and cross-domain adaptation. In conclusion, SLF provides a robust and interpretable framework for semantic analysis, with potential applications in natural language processing and machine learning. Future work will focus on optimizing SLF's parameters, improving scalability, and integrating it with deep learning architectures.
-
Yanni REN, Tenta SASAYA, Takahiro TAKIMOTO
Session ID: 3K6-IS-2c-05
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
The performance of machine learning classification tasks heavily depends on both the availability and quality of labeled data. This challenge is particularly significant in industrial fields dominated by non-natural images, where expert annotation is costly. Moreover, non-natural image datasets often exhibit severe class-wise imbalance. Active learning (AL) has emerged to improve labeled data quality by selecting informative samples within the same annotation budget. While some methods address imbalance, they fall short in scenarios with a limited annotation budget, leaving a critical gap in tackling both limited availability and imbalance of labeled data for practical industrial applications. To tackle this issue, we propose a dynamic selection mechanism based on a class-aware ranking strategy within the AL process, aiming to select more balanced labeled data under a limited annotation budget. Furthermore, we incorporate this mechanism into a two-stage semi-supervised learning (SSL) framework. We evaluated the proposed method on two non-natural image datasets for image classification tasks. Results showed that our method achieved the highest F1 scores, effectively addressing both imbalance and limited labeled data challenges in non-natural image classification.
-
Improving interoperability of Clinical Decision Support System with Information Extraction and Semantic Search through Generative AI
Yasuhiko MIYACHI, Osamu ISHII, Keijiro TORIGOE
Session ID: 3L1-GS-10-01
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
Background: Clinical Decision Support Systems (CDSS) are useful for improving diagnostic quality. However, their operation has pitfalls, such as fragmented workflows and a lack of interoperability. Objectives: This study proposes an improved method to overcome these issues. The proposed methods are 1) information extraction using Natural Language Processing, 2) semantic search for medical coding, and 3) EHR-CDSS real-time interoperability using HL7 FHIR. Method: Information extraction and semantic search use Google's public cloud services. Results and Discussion: The information extraction capability is comparable to that of experienced clinicians. The coding performance of semantic search is sufficiently practical for supporting the input of information such as symptoms with the granularity required by CDSS. Conclusion: This study has shown that information extraction, semantic search, and EHR-CDSS interoperability using HL7 FHIR are useful for improving CDSS usability. This method can also be applied to other CDSSs, making it easy to collaborate with various EHRs.
-
Hiroki MATSUZAKI, Ryo ITOH, Arief Fauzi AHMAD, Masahiro TOHMARU, Takay ...
Session ID: 3L1-GS-10-02
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
-
Takato ARAKI, Hiroyuki KITAJIMA
Session ID: 3L1-GS-10-03
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
-
Shota IWASAKI, Hiroyuki KITAJIMA
Session ID: 3L1-GS-10-04
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
-
Utoku KAKIYAMA, Kazunari HENMI, Kouhei MIYATA, Yosihito INOUE, Motoki ...
Session ID: 3L1-GS-10-05
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
The recent advancements in deep learning have given rise to a range of medical applications, including those that support diagnosis. One such application is the use of hysteroscopic images for diagnostic purposes. While high classification accuracy has been achieved, the robustness of these models against domain shift remains uncertain, posing a challenge for clinical implementation. This study focuses on chronic endometritis, a condition of persistent endometrial inflammation. We propose CLIP-MLP, a method that first predicts lesion areas in hysteroscopic images using a deep learning model. Then, a multimodal model classifies the condition by integrating the original image with explanatory texts generated from the predictions. Experimental results demonstrate that CLIP-MLP outperforms image-only models in classifying unseen datasets, improving generalization and robustness against domain shift. This approach enhances the reliability of deep learning-based hysteroscopic diagnosis, facilitating its clinical adoption.
-
Taihei FUNAKI, Mami HORIOKA, Takehide SOH
Session ID: 3L4-GS-1-02
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
-
Hajime HIRONAKA, Kei KIMURA, Akira SUZUKI, Makoto YOKOO
Session ID: 3L4-GS-1-03
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
This study considers combinatorial reconfiguration in constraint satisfaction on a 3-element set. Combinatorial reconfiguration is the problem of determining whether one solution can be transformed into another step by step while keeping every intermediate solution feasible. On a 2-element set, Gopalan et al. and Schwerdtfeger proved that the reconfiguration problem is solvable in polynomial time if the solution space of the constraint satisfaction problem is majority-closed. This study extends these results to constraint satisfaction on a 3-element set. In this case, the majority operation is not uniquely defined and returns an arbitrary value if all three arguments are different, but we show that the reconfiguration problem is solvable in polynomial time if the solution space is closed under some majority operation.
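To make the notion of majority-closure concrete, here is a small sketch of a majority operation on a 3-element set and a brute-force closure check. The `tiebreak` argument stands for the arbitrary choice made when all three arguments differ; it and the other names are illustrative assumptions, not the paper's notation or algorithm.

```python
from itertools import product


def make_majority(tiebreak):
    """Build a majority operation; `tiebreak` decides pairwise-distinct triples."""
    def majority(a, b, c):
        if a == b or a == c:
            return a
        if b == c:
            return b
        return tiebreak(a, b, c)
    return majority


def is_closed(solutions, majority):
    """Check that applying `majority` componentwise never leaves the solution set."""
    sols = set(solutions)
    return all(
        tuple(majority(a, b, c) for a, b, c in zip(s, t, u)) in sols
        for s, t, u in product(sols, repeat=3)
    )
```

Each different `tiebreak` gives a different majority operation, which is why the result is stated for solution spaces closed under some majority operation rather than a fixed one.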
-
Eiji KAWASE, Hideaki TAMAI
Session ID: 3L4-GS-1-04
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
Simulated Annealing (SA) is a metaheuristic algorithm applicable to various combinatorial optimization problems. SA begins its search from an arbitrary initial state and gradually changes the state. If the changed state is better than the current state, it is adopted; if it is worse, the changed state is accepted probabilistically. This process is repeated for a certain period, after which the final state is determined. However, SA's performance heavily depends on its parameters, especially the temperature, which requires trial-and-error adjustment. To enhance SA's performance, numerous improvement methods have been proposed. One such method is Temperature Parallel Simulated Annealing (TPSA). TPSA runs searches from different initial states at various temperatures, exchanging states at specific intervals. This paper reports the results of evaluation experiments on the initial states given to TPSA.
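The acceptance rule described above can be sketched as follows. The abstract only says that worse states are accepted probabilistically; this sketch assumes the common Metropolis criterion, exp(-delta / T), and a geometric cooling schedule. The parameter names and defaults are hypothetical, and the temperature exchange used by TPSA is not shown.

```python
import math
import random


def accept(delta, temperature, rng):
    """Always accept improvements; accept a worse state with prob. exp(-delta/T)."""
    if delta <= 0:
        return True
    return rng.random() < math.exp(-delta / temperature)


def simulated_annealing(cost, neighbor, state, t_start=1.0, t_end=1e-3,
                        cooling=0.95, steps_per_temp=50, seed=0):
    """Minimize `cost` from `state`; returns the best state seen (an added convenience)."""
    rng = random.Random(seed)
    best = state
    t = t_start
    while t > t_end:
        for _ in range(steps_per_temp):
            candidate = neighbor(state, rng)
            if accept(cost(candidate) - cost(state), t, rng):
                state = candidate
                if cost(state) < cost(best):
                    best = state
        t *= cooling  # geometric cooling: an assumed schedule
    return best
```

A call such as `simulated_annealing(lambda x: x * x, lambda x, rng: x + rng.choice((-1, 1)), 10)` runs the search on a toy one-dimensional problem. At high temperature nearly every move is accepted, and as the temperature falls the search behaves more like greedy descent, which is why the result is sensitive to the temperature setting.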
-
Keisuke ONOUE, Ryosuke KOJIMA
Session ID: 3L4-GS-1-05
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
Black-box optimization (BBO) is a framework for searching for the optimal solution using only the input-output information of the objective function, and is applicable to various scenarios, especially those where gradient information is not available. Among BBO methods, sequential model-based optimization (SMBO) aims for high sample efficiency by combining a surrogate model that approximates the objective function with decision-making strategies that balance exploration and exploitation. Although the objective function is a black box, depending on the application, constraints such as known relationships between input variables may be available as prior knowledge. Using this information can lead to more efficient optimization. In this study, we propose an SMBO method that efficiently handles constraints on objective variables in a discrete search space. The proposed method uses the Tensor-Train (TT) decomposition as a surrogate model and incorporates constraints by adding a penalty term to the loss function of the TT decomposition. Numerical experiments show that the proposed method outperforms the conventional discrete BBO method in terms of sample efficiency, confirming the effectiveness of using prior knowledge.
-
Basic mathematics for planar perspective projection and orthogonal transformation in four dimensional space
SHOHEI HIDAKA, Takuma TORII, Kohske TAKAHASHI
Session ID: 3L5-GS-1-01
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
Can we intuitively perceive and manipulate an object in a space of more than three dimensions? The present study aims to build a theoretical framework to explore and validate the human capability to perceive an object in four-dimensional space. In machine learning, multivariate analysis, and other applications, orthogonal matrices are used to transform objects in vector space while keeping their metric relationships invariant. In four- or higher-dimensional space, any orthogonal matrix has components that are visible and components that are invisible through any two-dimensional projection. This study proposes a systematic method to naturally visualize four-dimensional objects by decomposing any orthogonal matrix into its visible and invisible components.
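The idea of visible and invisible components can be illustrated with the simplest case, which is not the decomposition proposed in the paper: a rotation confined to one coordinate plane of four-dimensional space. The sketch below assumes the projection keeps the first two coordinates; all names are illustrative.

```python
import math


def plane_rotation(dim, i, j, angle):
    """Orthogonal dim x dim matrix rotating the (i, j) coordinate plane by `angle`."""
    m = [[1.0 if r == c else 0.0 for c in range(dim)] for r in range(dim)]
    m[i][i] = math.cos(angle)
    m[i][j] = -math.sin(angle)
    m[j][i] = math.sin(angle)
    m[j][j] = math.cos(angle)
    return m


def apply(matrix, vector):
    """Matrix-vector product."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def project_xy(vector):
    """Two-dimensional projection that keeps the first two coordinates."""
    return vector[:2]


def norm(vector):
    """Euclidean length, which every orthogonal matrix preserves."""
    return math.sqrt(sum(a * a for a in vector))
```

Rotating a point with `plane_rotation(4, 2, 3, angle)` changes only its third and fourth coordinates, so its projection is unchanged (an invisible rotation), while `plane_rotation(4, 0, 1, angle)` moves the projected point (a visible rotation). In both cases `norm` is preserved.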
-
TENDA OKIMOTO, Shunsuke YAMAOKA
Session ID: 3L5-GS-1-02
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
How to schedule a limited number of firefighters at a fire department that operates 24 hours a day is a critical issue for conducting firefighting, emergency, and rescue operations safely and effectively. The Firefighter Scheduling Problem (FSP) is one of the application problems in staff scheduling, which has been extensively studied in operations research and artificial intelligence. FSP is a combinatorial constraint optimization problem where the objective is to find an assignment that satisfies all hard constraints and minimizes the sum of all violated soft constraints. The satisfaction level of firefighters is a crucial criterion for enhancing their working environment. In this paper, a formal framework for the FSP is defined. The experiments involve formulating the FSP as a 0-1 integer programming problem, which is solved using real data from a fire department in Hyogo Prefecture. Two solutions are presented: optimal and egalitarian solutions, which are then compared to an actual schedule.
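The structure of such a 0-1 formulation can be sketched on a toy instance: binary variables x[(s, t)] say whether staff member s works shift t, hard constraints filter out infeasible assignments, and the objective sums the penalties of violated soft constraints. The staff names, shifts, penalties, and constraints below are invented for illustration, and brute-force enumeration stands in for the integer programming solver a real instance would need.

```python
from itertools import product

STAFF = ["A", "B", "C"]      # hypothetical staff members
SHIFTS = ["day", "night"]    # hypothetical shifts
# Hypothetical soft-constraint penalties for assigning a person to a shift.
PENALTY = {
    ("A", "day"): 0, ("A", "night"): 3,
    ("B", "day"): 2, ("B", "night"): 0,
    ("C", "day"): 1, ("C", "night"): 1,
}


def feasible(x):
    """Hard constraints: each shift has exactly one person, each person at most one shift."""
    covered = all(sum(x[(s, t)] for s in STAFF) == 1 for t in SHIFTS)
    not_overworked = all(sum(x[(s, t)] for t in SHIFTS) <= 1 for s in STAFF)
    return covered and not_overworked


def solve():
    """Enumerate all 0-1 assignments and return (cost, assignment) with minimum penalty."""
    keys = [(s, t) for s in STAFF for t in SHIFTS]
    best = None
    for bits in product((0, 1), repeat=len(keys)):
        x = dict(zip(keys, bits))
        if not feasible(x):
            continue
        cost = sum(PENALTY[k] * v for k, v in x.items())
        if best is None or cost < best[0]:
            best = (cost, x)
    return best
```

Enumeration grows as 2 to the power of the number of variables, so it is only useful for showing how the hard constraints and the penalty objective fit together.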
-
Ikuha NISHITANI, Kotaro MINEYUKI, Tenda OKIMOTO, Hiroki SAKAI, Jun MIZ ...
Session ID: 3L5-GS-1-03
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
The Airline Crew Scheduling Problem (ACSP) is one of the widely investigated application problems in artificial intelligence and operations research. This problem consists of two sub-problems, namely the Crew Pairing Problem (CPP) and the Crew Assignment Problem (CAP). The objective of CPP is to generate feasible flight-crew pairings that cover all scheduled flights. The aim of CAP is to assign a crew to each generated pairing. The satisfaction level of crews is one of the important criteria for improving the working environment. This paper focuses on the Satisfaction Level based Crew Assignment Problem (CAP^SL). The formal framework is defined as a multi-objective constraint optimization problem. Furthermore, the aggregate objective function is defined by applying conjoint analysis. In the experiments, the CPP is solved using real data from JAL. Next, the CAP^SL is formulated as a mono-objective 0-1 integer programming problem, and the optimal solution is obtained.
-
Kotaro MINEYUKI, Ikuha NISHITANI, Tenda OKIMOTO, Hiroki SAKAI, Jun MIZ ...
Session ID: 3L5-GS-1-04
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
The Airline Crew Scheduling Problem (ACSP) is a combinatorial optimization problem that creates a work schedule satisfying given constraints for a set of crews, work days, and work contents. ACSP consists of two sub-problems, namely the Crew Pairing Problem (CPP) and the Crew Assignment Problem (CAP). The objective of the former is to generate feasible flight-crew pairings that cover all scheduled flights. The aim of the latter is to assign a crew to each generated pairing. In the aviation industry, the education and training of new pilots is essential to maintaining the provision of safe and secure flight services. Few previous works on ACSP consider newcomer education. This paper focuses on the education and training of new pilots in CAP. The crew assignment problem considering newcomer education (CAP^ED) is defined. In the experiments, the CAP^ED is formalized as a 0-1 integer programming problem using real flight data from JAL, and the optimal solution is provided.
-
Tomoki KUBO, Yusuke IIDA
Session ID: 3L5-GS-1-05
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
“Epoch-wise Double Descent” refers to the phenomenon where test loss decreases again after overfitting in training with label noise. Traditional bias-variance trade-off theory cannot explain this phenomenon. In this study, we analyzed learning curves separated into data with clean labels and data with noisy labels to understand the phenomenon further. We conducted numerical experiments with a 7-layer MLP using the CIFAR-10 dataset with 30% label noise. The training process is visualized by separating the training loss into three elements: clean label data, noisy label data, and noisy label data evaluated with original labels. Our results reveal that the training process proceeds in three phases until the double descent occurs: (1) learning only clean label data, (2) learning data with noisy labels, causing test loss to increase, and (3) fitting the noisy labels perfectly, which leads to test loss decreasing and the double descent phenomenon. These findings suggest that the double descent phenomenon arises from the model's overfitting to noisy label data, which enhances the generalization of the model prediction again.
-
Koshiro AOKI, Ryota TAKATSUKI, Gouki MINEGISHI
Session ID: 3L6-OS-32-01
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
Mechanistic Interpretability (MI) is an emerging field that aims to uncover the internal mechanisms of AI systems, particularly deep neural networks. MI seeks to identify not only input-output relationships but also the causal structures within models. With the development of large language models, MI has attracted growing attention from the perspectives of AI safety and reliability. However, this field’s rapid growth has led researchers to adopt disparate concepts and methods, resulting in a lack of a unified framework. Moreover, the precise meaning of "mechanistic" remains ambiguous, and the distinction between MI and existing interpretability methods has yet to be clearly established. In this paper, we first survey the historical and cultural background of MI, clarify its differences from traditional interpretability approaches, and propose a conceptual framework that organizes key ideas in MI. Additionally, we discuss MI methods and their limitations, ranging from observational to interventional approaches. Finally, we explore current challenges in MI research and offer directions for future work to understand increasingly complex AI systems and ensure their safety.
-
Gouki MINEGISHI, Hiroki FURUTA, Shohei TANIGUCHI, Yusuke IWASAWA, Yuta ...
Session ID: 3L6-OS-32-02
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
Transformer-based language models exhibit In-Context Learning (ICL), where predictions are made adaptively based on context. While prior work links induction heads to ICL through phase transitions, this can only account for ICL when the answer is included within the context. However, an important property of practical ICL in large language models is the ability to meta-learn how to solve tasks from context, rather than just copying answers from context; how such an ability is obtained during training is largely unexplored. In this paper, we experimentally clarify how such meta-learning ability is acquired by analyzing the dynamics of the model’s circuit during training. Specifically, we extend the copy task from previous research into an In-Context Meta Learning setting, where models must infer a task from examples to answer queries. Interestingly, in this setting, we find that there are multiple phases in the process of acquiring such abilities, and that a unique circuit emerges in each phase, contrasting with the single-phase transition in induction heads. The emergence of such circuits can be related to several phenomena known in large language models, and our analysis leads to a deeper understanding of the source of the transformer’s ICL ability.
-
Hiroto OTAKE, Hiroki OUCHI, Shintaro OZAKI, Tatsuya HIRAOKA, Taro WATA ...
Session ID: 3L6-OS-32-03
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
Large language models (LLMs) have demonstrated the ability to solve tasks in geographic domains, and it has been suggested that these capabilities rely on an internal geospatial world model. However, previous studies have mainly examined such representations using only a small number of models trained on English-centric data, leaving it unclear how geospatial representations emerge in models trained on other languages. In this study, we investigate the internal geographic representations of multiple regions in models pre-trained on data in different languages. Our experimental results indicate that the properties of these world models may strongly depend on the language used during training.
-
Ryota TAKATSUKI, Sonia JOSEPH, Ippei FUJISAWA, Ryota KANAI
Session ID: 3L6-OS-32-04
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
The formation of illusory contours has been associated with predictive processing, yet its detailed mechanism remains unclear. In this study, we show that the Kanizsa illusion can also be observed in Vision Transformers, a class of feedforward neural networks, thereby challenging the conventional understanding of how illusory contours form. To elucidate the underlying mechanism, we introduce a novel mechanistic interpretability method leveraging a diffusion model to track how predictions evolve across transformer layers. Finally, we discuss the universality of mechanisms between models and biological systems and the potential of our approach to contribute to a deeper understanding of illusory contour formation in biological systems.
-
Tota ABE, Namgi HAN, Yusuke MIYAO
Session ID: 3L6-OS-32-05
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
This research investigates how multilingual Large Language Models (LLMs) encode gender biases in English and Japanese. It is plausible that gender biases appear differently according to the language in which we train LLMs. However, it remains to be discovered how multilingual LLMs learn and encode gender biases for different languages. We extract gender bias features for multiple languages using Sparse Auto-Encoders (SAEs) and examine whether the features are identical across languages. More specifically, we give multilingual LLMs gender-stereotypical and anti-gender-stereotypical texts. We extract interpretable features from neurons in the inner layers of LLMs using SAEs and look for the features that fire differently between the two kinds of text. Then, we compare the feature activations between the English and Japanese cases. The experimental results indicate that gender bias is encoded in distinct parts of multilingual LLMs depending on the language.
-
Yutaka HOSHINA, Shigeaki UEMURA, Satoshi NAKAMURA, Kazuyuki IIO
Session ID: 3M1-GS-10-01
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
This paper focuses on wire tracking for all component wires in a highly bent electric cable, which is necessary for discussing the characteristics of cable products in actual use cases. A unique pre-processing step for U-Net segmentation has been developed for this task. The pre-processing is a coordinate transformation that converts bent cables to virtual straight cables and enables us to analyze highly bent electric cables using the conventional wire-tracking process for straight cables. By combining this pre-processing with U-Net, it becomes possible to detect the shapes of all component wires in highly bent electric cables. Information on these wire shapes greatly enhances our understanding of the phenomena related to cable bending and our design process for cable products with better bending properties.
-
Takumi ODA, Hisashi SAITO, Yoshihito YAMAMOTO, Jun SONODA
Session ID: 3M1-GS-10-02
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
-
Kengo HOI, Soichiro YOKOYAMA, Tomohisa YAMASHITA, Hidenori KAWAMURA, H ...
Session ID: 3M1-GS-10-03
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
In recent years, research on end-to-end autonomous driving systems has become active; however, high-precision inference requires high-performance machines for both training and deployment, which imposes significant burdens on research and commercialization. If stereo cameras are employed, then not only can depth estimation be omitted through algorithmic distance measurement, but also, by acquiring both image and distance information with a single sensor—potentially eliminating the need for sensor fusion—the model is expected to be simplified. In this study, as an initial investigation into an end-to-end autonomous driving model using stereo cameras, we develop a model for adaptive cruise control, a fundamental autonomous driving task. We modified the Transfuser model to predict vehicle acceleration from stereo camera data. Evaluation on roughly 160,000 frames from Japanese public roads showed high accuracy in monotonous environments with minimal acceleration or deceleration, suggesting the practical viability of this approach. Conversely, in conditions of significant acceleration/deceleration, uphill driving, or abrupt lighting changes, accuracy decreased; we analyzed the causes and proposed improvements.
-
Kazuma KOMODA, Ping JIANG, Haifeng HAN, Junichiro OOGA
Session ID: 3M1-GS-10-04
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
In logistics warehouses, deep learning is used to automate picking operations, addressing the growing e-commerce market and declining labor force. Enhancing picking capabilities requires high-performance recognition of various items. However, deep learning models often degrade in performance when recognizing unknown items, necessitating additional learning with extensive training data. Few-shot learning, which reduces the amount of training data, struggles with recognizing parts of complex-shaped items and has low performance when objects are partially occluded. This paper proposes a few-shot learning framework to solve these issues. By calculating a diversity score for each unknown image and determining the appropriate number of images per class, it becomes possible to learn from low-performance images. Combining data augmentation
-
Hikaru HANEISHI, Yoshihito YAMAMOTO
Session ID: 3M1-GS-10-05
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
Materials Informatics (MI) is attracting attention as a method to accelerate the design and discovery of new materials by utilizing various material databases and machine learning. In particular, it enables prediction of material properties and efficient search for candidate materials, and is being applied in a wide range of fields such as energy and electronic devices. On the other hand, in the fields of civil engineering, construction, and machinery, mechanical properties such as elastic modulus and corrosion resistance are required as material properties, but material search techniques targeting these properties have not been fully established. In this study, we attempted to apply several existing graph neural network (GNN) models, which were developed for predicting properties such as the band gap of crystal structures, to the prediction of mechanical properties, in particular the bulk modulus (volume elastic modulus). The validation results show that all the models have high prediction performance within the scope of this study.
-
Rika TARUMI, Asahi HENTONA, Takayuki ITOH
Session ID: 3M4-OS-7a-01
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
-
Chisa MORI, Masaki ONISHI, Takayuki ITOH
Session ID: 3M4-OS-7a-02
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
Pedestrian flow simulation is a technique to reproduce pedestrian flow and is used to develop pedestrian flow guidance plans. In pedestrian flow simulation, there are guidance parameters that determine the guidance method and evaluation indicators that evaluate the guidance results. In order to develop an effective pedestrian flow guidance plan, it is important to understand the relationship between the guidance parameters and evaluation indicators. However, multiple evaluation indicators exist, and their relationships are very complex, making interpretation difficult. In this study, we propose a method to interpret the relationship between guidance parameters and evaluation indicators while considering multiple evaluation indicators. In the proposed method, multiple evaluation indicators are drawn on a two-dimensional plane by dimensionality reduction using UMAP. The plane is divided into grids to visualize the trade-offs for each grid, and a single evaluation indicator is calculated from the multiple evaluation indicators based on the user's preferred trade-offs. The PCP is drawn using this single evaluation indicator. In this paper, we present an example of visualization of a dataset of simulated human flow data for evacuation guidance. Experiments show that it is possible to analyze the relationship between parameters and evaluation indicators for various trade-offs.
-
Kyoka OKI, Masaki ONISHI, Takayuki ITOH
Session ID: 3M4-OS-7a-03
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
-
Ryoko ODA, Eita NAKAMURA, Takayuki ITOH
Session ID: 3M4-OS-7a-04
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
In recent years, researchers have actively analyzed Western paintings using information science techniques, mostly focusing on major stylistic changes rather than influence among individual painters. In this study, building on Nakamura et al.’s method, we estimated influence relationships among Western painters using multiple image features and constructed a network. We then used a previously developed visualization system to compare how different features affect network accuracy. Specifically, we employed both color features and local features (capturing small-scale color and shading variations) to estimate each painter’s “parent node” (the artist who influenced them) and compared the results with historically recognized relationships on WikiArt.org. When estimating only one parent, color features more accurately reproduced the WikiArt data; however, when estimating up to 50 parents, local features performed better. We also found that increasing the dimensionality of local features further improved accuracy. Our findings highlight how the choice and combination of features influence painter networks and may advance future image analysis methods in Western art research.
-
Maki FURUE, Masakazu HIROKAWA, Keita SAKUMA, Ryuta MATSUNO, Takayuki I ...
Session ID: 3M4-OS-7a-05
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
-
Ken WAKITA, Masahiko ITOH, Ryosuke SAGA
Session ID: 3M5-OS-7b-01
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
-
Hinata MATSUMOTO, Ken WAKITA
Session ID: 3M5-OS-7b-02
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
-
Fuya ASAI, Ken WAKITA
Session ID: 3M5-OS-7b-03
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
-
Kohei ARIMOTO, Masahiko ITOH
Session ID: 3M5-OS-7b-04
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
-
Yuta SAKAI, Kenta MIKAWA, Masayuki GOTO
Session ID: 3M5-OS-7b-05
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
With the advancement of information technology, data utilization has expanded, and multi-class classification tasks like image classification have become crucial. While machine learning models have achieved high accuracy, their opacity poses challenges, spurring the development of explainable AI (XAI). Current XAI methods, such as heatmaps highlighting influential input features or techniques quantifying feature importance for local explanations, primarily interpret input-output relationships. However, they fail to elucidate the structural relationships between multiple classes and provide limited global interpretability, often restricted to identifying predictive features. This study proposes a novel XAI approach leveraging the ECOC method to interpret category groupings that enhance model identification in multi-class classification. By decomposing the problem into multiple classification tasks, this approach offers insights into the ease of classification and the similarities among categories, advancing the interpretability of machine learning models.
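As background on the ECOC (error-correcting output codes) method named above, the following sketch shows how a code matrix splits a multi-class problem into binary tasks and how predictions are decoded by Hamming distance. The class names and code matrix are invented for illustration; the paper's own code design and interpretation procedure are not reproduced here.

```python
# Hypothetical code matrix: one codeword per class, one column per binary task.
CODE_MATRIX = {
    "cat":   (1, 1, 0),
    "dog":   (1, 0, 1),
    "car":   (0, 1, 1),
    "truck": (0, 0, 0),
}


def groupings(task):
    """Return the classes a binary task treats as positive and as negative."""
    positive = [c for c, code in CODE_MATRIX.items() if code[task] == 1]
    negative = [c for c, code in CODE_MATRIX.items() if code[task] == 0]
    return positive, negative


def hamming(a, b):
    """Number of positions where two codewords differ."""
    return sum(x != y for x, y in zip(a, b))


def decode(bits):
    """Pick the class whose codeword is closest to the binary classifiers' outputs."""
    return min(CODE_MATRIX, key=lambda c: hamming(CODE_MATRIX[c], bits))
```

Each column defines a grouping of categories, for example task 0 separates `cat` and `dog` from `car` and `truck`, so how easily a binary classifier learns a column says something about how similar the grouped categories are.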
-
Zhicheng HUO, Takayuki ITOH
Session ID: 3M6-OS-7c-01
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
-
He HAN, Takayuki ITOH
Session ID: 3M6-OS-7c-02
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
-
Minami KOJIMA, Ito TAKAYUKI
Session ID: 3M6-OS-7c-03
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
-
Mizuki EBINA, Takayuki ITOH
Session ID: 3M6-OS-7c-04
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
-
Quantitative Text Analysis of Press Releases Using Structural Topic Modeling
Chika EZURE
Session ID: 3M6-OS-7c-05
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
This study analyzes press releases related to Femtech using structural topic modeling to examine how the market has shifted from consumer-oriented products to tools for corporate productivity enhancement. The findings indicate that Femtech has expanded beyond its initial definition as "technology addressing women's health" and has been integrated into corporate strategies such as workplace health management and employee benefits. Previous research has primarily focused on specific products, such as menstrual tracking apps, highlighting their role in promoting self-management. However, this study suggests that structural changes in the market itself have reinforced the individualization of women’s health issues. Additionally, the disappearance of narratives related to sex education as Femtech becomes increasingly marketized further illustrates the impact of this transition. Through this analysis, this study emphasizes the importance of an inductive approach to understanding market-wide transformations, given the ambiguous definition of Femtech and its evolving role in shaping the discourse on women’s health.
-
Haruki NAGAMI, Kosuke SAKURAI, Ayako YAMAGIWA, Masayuki GOTO
Session ID: 3N1-GS-7-01
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
Zero-shot segmentation is an image segmentation task that also detects unlearned objects. The Segment Anything Model (SAM), a representative zero-shot segmentation model, can output highly accurate pixel masks of unlearned objects indicated by prompts, such as points, that specify the objects of interest. RobustSAM is a method that adapts SAM to degraded images by incorporating a noise removal mechanism into SAM. However, on clear images its quality near object boundaries is lower than that of SAM, because the degradation removal mechanism removes boundary information at the same time. Therefore, this study proposes a novel segmentation model that can flexibly take into account the embeddings from before the degradation information is removed, by adding to RobustSAM a residual connection mechanism based on a weighted average. The proposed method improves the boundary quality on clear images. Furthermore, through experiments on real data, we show that the proposed method improves the accuracy for clear images while maintaining the accuracy for degraded images.
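A weighted-average residual connection of the kind described above can be sketched as a convex blend of the embedding before degradation removal and the embedding after it. The function name, the scalar weight `alpha`, and the use of plain lists are assumptions made for illustration; the abstract does not state how the weight is obtained or at which layer the blend is applied.

```python
def weighted_residual(original, cleaned, alpha):
    """Blend two embeddings of equal length.

    original: embedding before degradation removal (keeps boundary detail)
    cleaned:  embedding after degradation removal
    alpha:    hypothetical weight in [0, 1]; 0 keeps only the cleaned
              embedding, 1 keeps only the original one
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if len(original) != len(cleaned):
        raise ValueError("embeddings must have the same length")
    return [alpha * o + (1.0 - alpha) * c for o, c in zip(original, cleaned)]


if __name__ == "__main__":
    before = [1.0, 2.0]
    after = [3.0, 4.0]
    print(weighted_residual(before, after, 0.5))
```

With `alpha = 0` this toy version reduces to using only the cleaned embedding, so a larger weight corresponds to reintroducing more of the information that removal discarded.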
-
Kosuke SAKURAI, Ryotaro SHIMIZU, Masayuki GOTO
Session ID: 3N1-GS-7-02
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
-
Tomoya SUGIHARA, Shuntaro MASUDA, Ling XIAO, Toshihiko YAMASAKI
Session ID: 3N1-GS-7-03
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
Conventional supervised video summarization methods aggregate annotations from multiple annotators to create ground truth labels for model training. However, this approach often introduces noisy ground truth labels due to the association of multiple labels with a single video, potentially degrading model performance. Additionally, the small size of the available datasets further increases the risk of overfitting to specific categories. In contrast, large language models (LLMs) have recently demonstrated remarkable few-shot reasoning capabilities, which allow them to adapt to a task from only a few examples provided in the prompt. Building on this, we propose a novel few-shot video summarization method that leverages the few-shot reasoning capabilities of LLMs to learn annotator-specific summarization tendencies from limited labeled data. Specifically, we utilize a pre-trained image captioning model to transform videos into textual data. The generated captions are paired with the corresponding annotated labels to construct few-shot prompts. Using these few-shot prompts, the LLM performs frame-level scoring without requiring parameter updates. Experimental evaluations on the SumMe and TVSum datasets show that the proposed method outperforms a random scoring baseline in F-score. These results highlight the effectiveness of our method in few-shot video summarization tasks.
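The step of pairing captions with annotated labels to form a few-shot prompt could look roughly like the sketch below. The instruction wording, the 1 to 5 score scale, and the function name are hypothetical; the abstract does not give the actual prompt template, and no LLM is called here.

```python
def build_few_shot_prompt(examples, query_caption):
    """Build a text prompt from (caption, score) pairs and one unscored caption.

    examples:      list of (caption, score) pairs from one annotator
    query_caption: caption of the frame the LLM should score
    """
    # Hypothetical instruction line; the paper's wording is not known.
    lines = ["Rate how important each video frame is for a summary (1-5)."]
    for caption, score in examples:
        lines.append(f"Caption: {caption}")
        lines.append(f"Score: {score}")
    # The prompt ends with an open slot for the LLM to complete.
    lines.append(f"Caption: {query_caption}")
    lines.append("Score:")
    return "\n".join(lines)


if __name__ == "__main__":
    shots = [
        ("A man opens a car door.", 2),
        ("The car jumps over a ramp.", 5),
    ]
    print(build_few_shot_prompt(shots, "The crowd cheers."))
```

Because the examples come from a single annotator, the same function can be reused with a different list of pairs to reflect another annotator's tendencies, with no parameter updates.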
-
Ryuta FUJIMOTO, Takafumi KOSHINAKA
Session ID: 3N1-GS-7-04
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
-
Ryoichi KATSUYA, Toshihiko YAMASAKI
Session ID: 3N1-GS-7-05
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
-
Makoto YUITO, Yusuke SHINOHARA, Daiki MATSUMOTO, Naoya OTAKA
Session ID: 3N4-GS-7-01
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
Recently, 3D Inpainting methods have been proposed that leverage NeRF and 3DGS radiance fields to inpaint regions of images. Because 3D Inpainting incorporates spatial information, it offers the advantage of spatially consistent inpainting across multiple images. Inpainting requires mask images that indicate which areas of the images are to be completed. However, the mask generation methods used in existing 3D Inpainting methods cause problems such as inconsistencies due to unnecessarily large mask areas and poor completion accuracy due to incomplete mask areas. In this paper, we propose an effective and efficient method for generating mask images for 3D Inpainting. Experiments in a real-world environment show that 3D Inpainting using our method mitigates these problems.
-
Shunsuke SAKAI, Tatsuhito HASEGAWA
Session ID: 3N4-GS-7-02
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS
Anomaly detection using conventional diffusion models takes an approach where a certain intensity of noise is added to the input image, and anomalies are removed by following the reverse diffusion process learned on normal images. However, this approach has the issue that the noise intensity significantly affects detection performance. In this study, we introduce a novel anomaly detection method using a diffusion model based on image inpainting to address this issue. In anomaly detection based on image inpainting, the masked regions are restored from complete noise, enabling stable detection performance independent of noise intensity. Furthermore, by employing an iterative mask update strategy based on reconstruction error, we improved detection performance compared to a random masking strategy. The proposed method was evaluated on MVTecAD and demonstrated superior performance compared to the baseline and existing anomaly detection methods based on image inpainting.
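The idea of updating the mask from the reconstruction error can be illustrated with a toy example on a one-dimensional list of patch values. Everything here is a stand-in chosen for illustration: `toy_inpaint` fills masked patches with the median of the visible ones instead of running a diffusion model, and the two complementary initial masks, the iteration count, and the top-`k` rule are assumptions rather than the procedure from the paper.

```python
from statistics import median


def toy_inpaint(patches, mask):
    """Stand-in for a diffusion inpainting model: fill masked patches
    with the median of the unmasked ones."""
    known = [p for p, m in zip(patches, mask) if not m]
    fill = median(known)
    return [fill if m else p for p, m in zip(patches, mask)]


def anomaly_scores(patches, inpaint=toy_inpaint, n_iters=2, k=1):
    """Return one reconstruction-error score per patch."""
    n = len(patches)
    errors = [0.0] * n
    # Initial pass: two complementary masks so every patch is restored once.
    for offset in (0, 1):
        mask = [i % 2 == offset for i in range(n)]
        restored = inpaint(patches, mask)
        for i in range(n):
            if mask[i]:
                errors[i] = abs(patches[i] - restored[i])
    # Refinement: re-mask only the k patches with the largest error.
    for _ in range(n_iters):
        top = set(sorted(range(n), key=lambda i: errors[i], reverse=True)[:k])
        mask = [i in top for i in range(n)]
        restored = inpaint(patches, mask)
        for i in top:
            errors[i] = abs(patches[i] - restored[i])
    return errors


if __name__ == "__main__":
    # Patch 3 deviates from the others and plays the role of the anomaly.
    print(anomaly_scores([1.0, 1.0, 1.0, 5.0, 1.0, 1.0]))
```

In the refinement loop, suspicious patches are restored from context that excludes them, which is the motivation for error-driven masking over random masking in this toy setting.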
-
Moyu KAWABE, Ichiro KOBAYASHI
Session ID: 3N4-GS-7-03
Published: 2025
Released on J-STAGE: July 01, 2025
CONFERENCE PROCEEDINGS
FREE ACCESS