Proceedings of the Annual Conference of JSAI
Online ISSN : 2758-7347
  • Kazuhiro MUKAIDA, Seiji FUKUI, Takeshi NAGAOKA, Takayuki KITAGAWA, Shi ...
    Session ID: 1B3-GS-2-01
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    We focus on non-functional requirements, which are often overlooked in requirement definitions, and propose a method that allows even those without extensive expertise to efficiently extract and classify non-functional requirements from requirement specifications. Previously, the authors have experimented with classification using models that incorporate pre-trained Transformer models such as BERT and GPT-2. Recently, with the proliferation of tools like ChatGPT, it has become possible to perform classifications solely through prompt interactions. In this study, we explore the capabilities of ChatGPT's Function calling feature, focusing on its potential to yield superior results compared to responses generated solely from prompts and traditional methods. We leverage Function calling to obtain structured data for classification. Additionally, we assess the impact of fine-tuning on ChatGPT and its combined effect. As a result, we were able to significantly shorten the entire process of model creation and learning, achieving accuracy equal to or greater than traditional methods.

  • Mitsuki SAKAMOTO, Tetsuro MORIMURA, Yuu JINNAI, Kenshi ABE, Kaito ARIU
    Session ID: 1B3-GS-2-02
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    Reinforcement learning from human feedback (RLHF) is often used for fine-tuning large language models (LLMs). The RLHF pipeline consists of four processes: (1) Supervised Fine-Tuning (SFT) of LLMs, (2) ranking the texts generated by the SFT model according to human preference, (3) training of the reward model from preference data, and (4) reinforcement learning of the SFT model using the reward model. Due to the cost of gathering human preference data, public datasets are often used to train the reward model. Because the data generation model and the SFT model are different, there is a distribution shift between the data used for learning and the data used for evaluation in the reward model. In this study, to analyze the effect of distribution shift, we create external datasets of preferences, generated using LLMs different from the SFT model. We perform RLHF using these datasets to artificially introduce distribution shift into the RLHF process so that we can elucidate situations where distribution shift poses a problem. Our experimental results show a decrease in the quality of the RLHF model when using external preference datasets, suggesting the impact of a distribution shift.
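
    Step (3) above, training a reward model from preference data, can be illustrated with a pairwise loss. The abstract does not say which loss the authors use; the sketch below assumes the Bradley-Terry style loss that is common in RLHF work, and the function name `preference_loss` is hypothetical.

```python
import math


def preference_loss(reward_chosen, reward_rejected):
    """Pairwise loss -log(sigmoid(r_chosen - r_rejected)) for one preference pair.

    Assumption: a Bradley-Terry style objective; the abstract does not specify one.
    """
    margin = reward_chosen - reward_rejected
    # -log(sigmoid(margin)) equals softplus(-margin); this form avoids overflow.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


# The loss is small when the reward model already prefers the chosen text,
# and large when it prefers the rejected one.
agree = preference_loss(2.0, 0.0)
disagree = preference_loss(0.0, 2.0)
```

    Under this assumed loss, a reward model trained on texts from a different generator can still reach a low loss on its own training pairs while scoring the SFT model's outputs poorly, which is the kind of distribution shift the abstract studies.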

  • Toma TANAKA, Naofumi EMOTO, Yumibayashi TSUKASA
    Session ID: 1B3-GS-2-03
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    The objective of this research is to understand the Ability to Understand the Logical Structure (AULS) in Large Language Models (LLMs). In this paper, we first introduce a method inspired by In-Context Learning (ICL), named "Inductive Bias Learning (IBL): Data2Code Model." We then apply IBL to several models, including GPT-4-Turbo, GPT-3.5-Turbo, and Gemini Pro, which have not been previously addressed in research, to compare and analyze the accuracy and characteristics of the predictive models they generate. The results demonstrated that all models possess the capability for IBL, with GPT-4-Turbo, in particular, achieving a notable improvement in accuracy compared to the conventional GPT-4. Furthermore, it was revealed that there is a variance in the performance of the predictive models generated between GPT-N and Gemini Pro.

  • Tomokatsu TAKAHASHI, Yuuki YAMANAKA
    Session ID: 1B3-GS-2-04
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    Anomaly communication detection that corresponds to the various communication protocols used within Industrial Control Systems (ICS) is essential to ensure the security of ICS. For this purpose, anomaly communication detection using Bidirectional Encoder Representations from Transformers (BERT) is attracting attention, since this method automatically learns the characteristics of packet payloads and is adaptable to various protocols. However, in anomaly communication detection using BERT, it is difficult to explicitly identify the role of the detected packets in communication and the cause of the anomaly due to the lack of prior knowledge about the anomaly. As a result, users are required to have specialized knowledge in security and communication. To address this problem, this paper exploits large language models (LLMs), which have been achieving results in various fields. Specifically, to apply LLMs to the multiple tasks that users perform to infer the cause of anomalies, we design prompts and construct Retrieval-Augmented Generation (RAG). Furthermore, through evaluation experiments, we discuss the effectiveness and challenges of applying LLMs to the task of cause inference.

  • ZHENYU GAO, AYAKO YAMAGIWA, MASAYUKI GOTO
    Session ID: 1B3-GS-2-05
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    Methods for analyzing image data associated with linguistic information have garnered recent attention but encounter challenges due to varying data quantities across different image domains. In response, LADS was proposed, a model trainable without relying on image data from domains with limited samples, utilizing the embedding space between images and text in image language models. While LADS often employs simple domain description text, adequate text can improve model performance. To tackle this issue, we introduce CoOp, a method that optimizes the domain text in CLIP to enhance accuracy. CoOp achieves this by learning prompts, improving vision language models, and elevating CLIP accuracy. We expect the resulting prompts to represent diverse domains within LADS effectively. Finally, we validate the efficacy of our proposed method by applying it to actual data, demonstrating its ability to address imbalanced data quantities across various image domains.

  • Makoto KAWANO, Kazuki KAWAMURA
    Session ID: 1B4-GS-2-01
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    Real-world machine learning system operation suffers from performance degradation due to data distribution shift, which occurs during operation and leads to lower accuracy compared to model validation. Detecting this performance degradation enables appropriate measures such as model retraining or structural revision. However, continuous labeling of operational data is not realistic due to the high cost. Therefore, this study focuses on estimating the performance of a model on unlabeled test data. Since direct calculation of accuracy on test data is impossible without labels, previous studies have attempted to estimate test accuracy using distances or metrics correlated with it. One such study utilizes adversarial accuracy, but it requires simultaneous adversarial training with the model to be evaluated, rendering it inapplicable to pre-trained models. To address this, we propose CoLDS, a method that estimates the test performance of any model without labels by converting the model to be evaluated into a surrogate model using knowledge distillation and performing adversarial training on the surrogate model. This paper evaluates the effectiveness of CoLDS through experiments and reports the results.
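
    The knowledge distillation step that turns the evaluated model into a surrogate can be sketched as a soft-target loss. The abstract does not give the exact objective used in CoLDS; the sketch below assumes the usual temperature-softened KL divergence between teacher and student outputs (without the optional temperature-squared scaling), and the names `softmax` and `distillation_loss` are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the maximum for numerical stability."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened class probabilities for one input.

    Assumption: a standard soft-target distillation loss, not the paper's exact one.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


# A surrogate whose logits match the evaluated model incurs zero loss.
same = distillation_loss([2.0, 0.5, -1.0], [2.0, 0.5, -1.0])
different = distillation_loss([2.0, 0.5, -1.0], [-1.0, 0.5, 2.0])
```

    Because this loss needs only the evaluated model's outputs, it can be computed on unlabeled data, which fits the label-free setting the abstract describes.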

  • Tomohiro ISSHIKI
    Session ID: 1B4-GS-2-02
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    In recent years, AI has attracted growing attention, and much of it concerns neural networks, including generative AI. However, some say that while the results produced by neural networks are good, the basis for this is unclear. Additionally, deep learning has been successful with a number of different models. For example, CNNs can already identify objects with a precision that exceeds human recognition. In LLMs as well, models that follow the Transformer architecture have achieved remarkable results. However, as with neural networks in general, there are few theoretical research results that explain why these results are obtained. Therefore, this research focuses on the learning of neural networks, and aims to help the theoretical understanding of neural networks by explaining the learning results mathematically from the characteristics of learning and the model being learned.

  • Gouki MINEGISHI, Yusuke IWASAWA, Yutaka MATSUO
    Session ID: 1B4-GS-2-03
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    Grokking is the intriguing phenomenon of delayed generalization: initially, a network achieves a memorization solution with perfect training accuracy and limited generalization; however, through further training, it eventually attains a generalization solution. This paper counters previous notions that weight norm reduction explains grokking, by demonstrating through experiments that the identification of optimal subnetworks plays a crucial role in achieving generalization. It leverages the lottery ticket hypothesis to argue that finding these "lottery tickets" is key to transitioning from memorization to generalization. Our research presents empirical evidence, showing that (1) with the proper subnetworks, the delayed generalization does not occur, (2) with a similar weight norm, the dense networks still require substantially longer training to achieve full generalization, and (3) with only structure optimization (without updating the values of the weights), we can convert the memorization solution to the generalization solution. These results emphasize the importance of subnetwork identification over traditional weight norm reduction theories in explaining grokking's delayed generalization phenomenon.

  • Yuki NAKAGUCHI
    Session ID: 1B4-GS-2-04
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    Imitation learning solves reinforcement learning problems with reference to some teacher information. While the typical method of behavior cloning could not be applied to long-term tasks because covariate shifts accumulate over time, interactive imitation learning solves this problem by obtaining online feedback from a teacher model. Furthermore, even when the teacher is non-optimal, such as when the task is not exactly the same for teacher and student, if one can use the student's reward information, it is possible to learn faster than reinforcement learning and even surpass the teacher. However, interactive imitation learning requires a teacher who can respond online, which limits applicable teachers. In particular, efficient interactive imitation learning requires a teacher's value function, and applicable teachers are limited to reinforcement-learned models. In this study, we propose a method to extend efficient interactive imitation learning that requires a value function to be applied to teachers with only offline trajectory data.

  • Shu LIU, Fujio TORIUMI
    Session ID: 1B4-GS-2-05
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    Complex networks consist of elements (nodes) and interactions between these elements (links), forming a data structure where the strength of interactions is captured by link weights in weighted networks, enabling the modeling of real-world interaction complexities. With significant advancements in machine learning, attempts have been made to incorporate complex networks into machine learning for advanced inference. Particularly noteworthy is the task of node embedding, where similar nodes are mapped close to each other in vector space, preserving their characteristics while mapping them to vectors. This study proposes a method for learning embedding representations that preserve the structural features of nodes in weighted networks. Specifically, the approach involves comparing link weights of adjacent nodes up to a certain number of hops to calculate distances between nodes at multiple scales. Subsequently, weighted multi-layer graphs are constructed based on distances at each scale. Finally, node contexts are generated through random walks, and embedding representations are generated using Skip-gram. The superior performance of this method is demonstrated by confirming the interpretability of embedding representations in toy networks and the structural reproducibility in real networks.
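
    The final stage, generating node contexts through random walks before Skip-gram training, can be sketched as follows. This is a minimal illustration on a single weighted graph, assuming transition probabilities proportional to link weights; it does not reproduce the multi-scale distance computation or the multi-layer graph construction, and the names `weighted_random_walk` and `generate_contexts` are hypothetical.

```python
import random


def weighted_random_walk(graph, start, length, rng):
    """Walk up to `length` nodes, picking each next node with probability
    proportional to the weight of the link leading to it."""
    walk = [start]
    for _ in range(length - 1):
        neighbors = graph.get(walk[-1], {})
        if not neighbors:
            break  # dead end: stop the walk early
        nodes = list(neighbors)
        weights = [neighbors[n] for n in nodes]
        walk.append(rng.choices(nodes, weights=weights, k=1)[0])
    return walk


def generate_contexts(graph, walks_per_node, length, seed=0):
    """Return a list of walks; each walk plays the role of a 'sentence' for Skip-gram."""
    rng = random.Random(seed)
    return [
        weighted_random_walk(graph, node, length, rng)
        for node in graph
        for _ in range(walks_per_node)
    ]


# Toy weighted network: adjacency dict mapping node -> {neighbor: link weight}.
toy_graph = {
    "a": {"b": 1.0},
    "b": {"a": 1.0, "c": 3.0},
    "c": {"b": 3.0},
}
contexts = generate_contexts(toy_graph, walks_per_node=2, length=5)
```

    The resulting walks could then be passed to any Skip-gram implementation as a corpus in which nodes are the tokens.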

  • Shintaro TAMAI, Masayuki NUMAO, Ken-ichi FUKUI
    Session ID: 1B5-GS-2-01
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    In recent years, as interest in health increases, methods have been devised to enable individuals to monitor their sleep status at home. Compared to conventional methods using smartwatches, sensors, etc., methods using sleep sounds have the advantages of being inexpensive, non-contact, and capable of detecting many biological activities. In this study, we aim to construct a machine-learning sleep evaluation model using sleep sounds that can provide evidence. In this study, sleep sound events were first extracted from overnight sleep sounds. Next, latent expressions of sleep sound events were extracted using VAE, and clustered using GMM. Then, we trained an LSTM that estimates the subjective evaluation of sleep using the obtained probability of belonging to each cluster as input data and obtained a sleep evaluation model. Finally, we applied TimeSHAP, a method for interpreting time series prediction models, to the sleep evaluation model to examine the importance of each cluster in sleep evaluation. The experimental results showed that for a given subject, a 94.8% accuracy rate was achieved in determining whether the subject was sleeping well or not for a single night. TimeSHAP, a temporal extension of SHAP, revealed that the types and times of sound events that influence the determination of sleep quality varied from person to person.

  • Yuta NAMBU, Masahiro KOHJIMA, Tomoharu IWATA, Haruno KATAOKA, Rika MOC ...
    Session ID: 1B5-GS-2-02
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    In recent years, the widespread use of wearable sensors has facilitated the acquisition of biological signals, and these data have been used to learn emotion recognition models. However, due to the diversification and segmentation of emotion categories and the burden of subjective evaluation, collecting labels exhaustively is becoming difficult, and labeled instances may not be available in advance. When faced with unknown users for whom emotion label data are unavailable, conventional methods cannot effectively recognize emotions. Therefore, we propose a novel learning method for personalized emotion recognition models by introducing meta-learning using behavioral data of multiple people obtained in daily life, even if the unknown user's emotion-labeled data are not available. The results of applying the proposed method to the collected ECGs of several people during video viewing showed that the proposed method outperforms conventional supervised learning and zero-shot learning.

  • Kentaro TAKI, Ao GUO, Jianhua MA
    Session ID: 1B5-GS-2-03
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    Detecting drivers' drowsiness with high accuracy and fine granularity is essential to ensure road safety. With the growing popularity of wearable devices, various physiological signals have become accessible, enabling drowsiness detection anywhere and at any time. Recent studies have achieved multilevel drowsiness detection, identifying up to eight drowsiness states using a single ECG signal. However, the effectiveness of using multiple physiological signals remains unclear. To address this, this study conducted four types of drowsiness detection, each with varying granularity, by utilizing ECG and EMG signals from the DROZY dataset. We first built models for each single modality using CNN and LSTM, optimizing model parameters to identify the best-performing models for each modality. We then built a multimodal model by concatenating the best-performing models for the two modalities. As a result, for fine-granularity drowsiness detection, using multimodal signals outperformed detection only using a single modality of signals. In addition, the optimized model for multilevel drowsiness classification is also identified.

  • Weight estimation of cultured fish
    Junya KOBAYASHI, Masashi TSUBAKI, Hideki ASOH, Yui MINESHITA, Ichiro N ...
    Session ID: 1B5-GS-2-04
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    Fish weight monitoring in aquaculture enhances the potential for improving productivity, profitability, and management quality. Conventionally, farmed fish have been weighed directly after being taken out of the water. Such direct measurements can be time-consuming and stressful for the fish, negatively affecting their growth performance and even resulting in increased mortality. Recently, weight estimation methods using underwater camera measurements have been developed, but there is still room for improvement in their accuracy. In this study, real measurement data of the body shape and weight of farmed Japanese amberjack, which is one of the most important cultured fish species in Japan, were accumulated across the entire aquaculture stage. Various weight estimation models were constructed based on these data and the estimation accuracy was evaluated. We found that a weight estimation model with practically sufficient accuracy can be formulated using the features obtained from the current camera measurements by feature engineering. Additionally, further accuracy improvement can be achieved by adding body width as a new feature. These findings provide valuable insights for aquaculture weight estimation practices.

  • Kota FUKAMACHI, Kiri MIURA, Naoki KOBAYASHI, Kenji TANAKA, Atsuyoshi M ...
    Session ID: 1B5-GS-2-05
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    Amidst the expanding e-commerce market, shipping costs are rising due to factors like fuel price hikes and driver shortages, prompting freight rate adjustments. Since shipping fees are often based on the total dimensions of packaging boxes, using smaller boxes is key for cost reduction. This packing challenge, known as the 3D-Bin Packing Problem, is difficult to solve due to its NP-hardness. As a result, numerous heuristic solutions have been proposed to tackle this problem. However, these often overlook practical operational constraints. Our study addresses this by formalizing conditions around placing similar items together and considering their weight. We developed an algorithm to choose the smallest feasible box from multiple options for a product group. Applied to real e-commerce logistics data, it selected smaller boxes than current methods in 45% of orders, reducing shipping costs by 3.5%. This indicates that our method can effectively reduce shipping costs while adhering to practical packing rules.

  • Shintaro UESUGI, Naoko OMACHI, Katsuyoshi ASADA, Yusei ISODA, Ginji IW ...
    Session ID: 1C5-GS-11-01
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    UACJ Corporation manufactures and sells aluminum rolled products and controls the properties of the oil used for rolling to ensure product quality. In the manufacturing industry, there is a tendency for domain knowledge to become genus-specific, resulting in a cognitive bias that domain knowledge is correct. Therefore, in this study, an assist tool is proposed that can visualize domain knowledge and reflect/reframe hypotheses using pLSA and a Bayesian network. The analysis of the oiliness data allows us to gain new insights through clustering and to propose new hypotheses based on the insights. However, using this assist tool is difficult and labor-intensive, which is a hurdle. Therefore, an operational system is conceived with functions and a UI that would allow easy use of this assist tool.

  • Hiroka HOSOI, Eiichi SAKURAI
    Session ID: 1C5-GS-11-02
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    Today, more and more people are placing greater importance on improving their own well-being than on pursuing economic gain. Therefore, it has become necessary to take measures to improve the well-being of each individual. In order to achieve this improvement efficiently, it is important to clarify the relationship between the measures and their effects on well-being and the preferred relationship between individuals and their measures. In this paper, using questionnaire data, we analyzed which measures would be necessary for each individual by creating a model that predicts a future decline in the individual's vitality. Using a Bayesian network model, we showed that decreases in the frequency of going out and in walking speed had an impact on the decrease in vitality.

  • Yudai HIROSE, Satoshi ONO
    Session ID: 1C5-GS-11-03
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    Deep neural networks (DNNs) are used in a wide range of fields and are increasingly being applied to real-world problems. In recent years, there has been an increasing number of efforts to use DNNs to replace human decision-making tasks. However, in such situations, issues such as fairness of output results, ethical validity, and opaqueness of the model have arisen. To mitigate these problems, eXplainable AI (XAI), which explains the reasoning basis of DNNs, is actively studied. On the other hand, it has been revealed that DNN based models have vulnerabilities called Adversarial Examples (AEs), which cause erroneous decisions by adding special perturbations to the input data that are imperceptible to humans. Such vulnerabilities have also been confirmed to exist in image interpreters, which are explainable methods in image classification, and it is essential to investigate these vulnerabilities in terms of AI reliability. This study proposes an adversarial attack method using evolutionary computation and Discrete Wavelet Transform under black box conditions where the internal structure of the attack target model is unknown. Experimental results have shown that the proposed method improved the search efficiency compared to the conventional method.

  • An Analysis of Acceptance Corresponding to Social Norms and Personal Beliefs Through Bayesian Estimation
    Soichiro MORISHITA, Masanori TAKANO, Hideaki TAKEDA
    Session ID: 1C5-GS-11-04
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    As many aspects of people's lives are now conducted on digital platforms, personal data is being actively utilized in a variety of businesses. Knowing what users think about the acceptability of data use for each business should serve as a guideline for the socially acceptable use of data. Based on this idea, the authors have already conducted a Web questionnaire survey on the social acceptability of personal data use according to the controller that uses the data and the purpose for which the data are processed. As a result, it was found that there was a positive correlation between the acceptability of the use of personal data in terms of socially accepted ideas and the intention of the use of personal data in terms of personal ideas, but there were also cases where the use of personal data was not acceptable but the intention of the use was present. Based on these results, this paper analyzes the relationship between the acceptability of implementation and the intention of use using Bayesian inference, based on a Web questionnaire regarding the use of personal data in a personalized media service recommendation system.

  • Features to promote sharing and understanding of corporate philosophy and employee values
    Naoko OMACHI, Katsuyoshi ASADA, Yusei ISODA, Ginji IWASE, Shintaro UES ...
    Session ID: 1C5-GS-11-05
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    Against the background of recent declines in the working-age population and rapid technological evolution, innovation has become indispensable for corporate survival. The homogeneity of employees that supported the success of the Japanese manufacturing industry during the era of economic growth is insufficient for innovation. Therefore, the concept of "diversity and inclusion" is emphasized as a management strategy, involving the recruitment of diverse talents and promoting their contributions. While the differences in values among diverse employees can lead to conflicts and trade-offs, actively seeking solutions while acknowledging these differences contributes to the growth of both employees and the organization. Fulfilling individual employees' values strongly influences their well-being and engagement. In corporate organizations aiming for sustainable value creation, it is considered essential for employees to share not only the company's philosophy but also their own, others', and the organization's values, understanding differences. This paper focuses on the design and development of a values-sharing platform with features that structure and visualize abstract values into a shareable form. The aim is to facilitate self-awareness and support the recognition and understanding of value differences, with a report on its effectiveness.

  • Takahiro MAESHIMA, Hirama TAKESHI
    Session ID: 1D3-GS-7-01
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    Inspecting for foreign substances using segmentation requires collecting training data. However, collecting and annotating images of foreign substances mixed in normal products is costly, whereas images of only normal products and images of only foreign substances are easy to collect and annotate. In this study, instead of training with many images of foreign substances mixed in normal products, we propose training with images of only normal products, images of only foreign substances, and a few images of foreign substances mixed in normal products. As a result, the proposed training method improved inspection accuracy while reducing annotation cost.

  • Miki KATSURAGI, Kenji TANAKA
    Session ID: 1D3-GS-7-02
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    With the expansion of the e-commerce market and advancements in technology, a detailed analysis of consumer purchasing behavior and understanding of preferences have become crucial. This is particularly true where the visual appeal of product images plays a significant role in consumer engagement. In our study, we utilized multimodal embeddings to analyze the style and nuances of art images on e-commerce sites. Specifically, we employed CoCa (Contrastive Captioners are Image-Text Foundation Models) to extract multimodal embeddings that capture the complex patterns and stylistic elements of product images. We then clustered these images into distinct style groups. Our analysis revealed that multimodal embeddings are effective in detecting subtle stylistic changes in images. Furthermore, it suggested that the application of such generative AI could greatly enhance the understanding of image characteristics preferred by consumers.

  • Kazuki MATSUDA, Yuiga WADA, Komei SUGIURA
    Session ID: 1D3-GS-7-03
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    In the field of image captioning, constructing automatic evaluation metrics that align closely with human judgment is crucial for effective model development. A key challenge in this field is addressing hallucinations, which are instances where models generate words unrelated to the image, a frequent issue in image captioning. Existing metrics often fail to manage hallucinations, primarily due to their limited capability in contrasting candidate captions against a diverse range of reference captions. To overcome this, we propose DENEB, a novel metric for image captioning, specifically robust to hallucinations. DENEB incorporates the Sim-Vec Transformer, a mechanism capable of processing multiple references and extracting similarity vectors effectively. Additionally, to train DENEB, we have expanded the Polaris dataset to create Polaris2.0, significantly enhancing supervised automatic evaluation metrics. Our dataset comprises 32,978 images and 32,978 human judgments from 805 annotators. Our approach achieved state-of-the-art performance on Composite, Flickr8K-Expert, Flickr8K-CF, PASCAL-50S, FOIL, and the Polaris 2.0 dataset, thereby demonstrating its effectiveness and robustness to hallucinations.

  • Arata SAITO, Takuya MATUZAKI
    Session ID: 1D3-GS-7-04
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    We have developed a method for detecting reading errors in Japanese speech data. First, speech recognition is performed to transcribe the speech into a phoneme sequence, and then the sequence is checked for reading errors. In order to distinguish between errors in speech recognition and actual reading errors, we create a candidate list of reading errors for each morpheme, select the one with the smallest edit distance from the speech recognition result among the correct reading and the candidate reading errors, and detect it as a reading error if it is different from the correct reading. We conducted experiments on speech data in the LaboroTVspeech corpus and the Japanese Spoken Language Corpus, as well as synthetic speech. The results confirmed that the method is effective when the speech actually contains reading errors, although there were many cases in which reading errors were falsely detected even when the reading was correct. In particular, in experiments with synthesized speech, the method was able to accurately detect misreading in 80.0% of the cases, including how a word was mispronounced, and succeeded in detecting 98.6% of mispronounced morphemes.
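
    The selection rule described above (pick the reading closest to the recognition result by edit distance, then flag an error if it is not the correct reading) can be sketched as follows. This is a minimal illustration that treats readings as romanized character sequences rather than true phoneme sequences; the function names and the example word are hypothetical, and ties are resolved in favor of the correct reading as an assumption.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def detect_reading_error(recognized, correct, candidate_errors):
    """Return the closest candidate misreading, or None if the correct reading
    is at least as close to the recognition result as any candidate."""
    options = [correct] + list(candidate_errors)
    # min() keeps the first minimum, so ties go to the correct reading.
    best = min(options, key=lambda reading: edit_distance(recognized, reading))
    return None if best == correct else best


# Illustrative morpheme with correct reading "tsukigime" and a candidate
# misreading "gekkyoku"; "gekkyoko" simulates a slightly noisy recognition result.
result = detect_reading_error("gekkyoko", "tsukigime", ["gekkyoku"])
```

    Comparing against a fixed candidate list, rather than flagging any mismatch with the correct reading, is what lets small recognition noise be absorbed instead of being reported as a reading error.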

  • Shinnosuke HIRANO, Tsumugi IIDA, Komei SUGIURA
    Session ID: 1D3-GS-7-05
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    In the modern era where deep learning is applied across a wide range of fields, the explainability of models is of paramount importance. However, existing methods are not optimized for vision-language foundation models, leading to lower explanation quality for such models. Therefore, this study proposes the Alternative Adapter Model, an explanation generation model tailored to vision-language foundation models. By introducing a Side Branch Network connected to the vision-language foundation model, the proposed method extracts features suitable for explanation generation. Furthermore, by implementing the Alternative Epoch Architecture, which dynamically changes the outputs of modules and the layers to be frozen, we address the issue of overly narrow focus areas. To evaluate the proposed method, experiments were conducted using the CUB-200-2011 dataset. The results demonstrate that the proposed method surpasses existing methods in mean IoU, Insertion Score, Deletion Score, and Insertion-Deletion Score, which are standard metrics for visual explanation generation tasks.

    Download PDF (933K)
  • Akari KUBO, Kentaroh SANO, Masanao KOTANI
    Session ID: 1D4-GS-10-01
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    Aiming to develop technology that facilitates behavioral change for energy conservation at home, our study examined behavioral vectorization and behavioral network analysis. These technologies are designed to discern the behavioral tendencies of individuals and groups, and to examine the connections between behaviors. We installed sensors in the living spaces of households, converting behavioral data into text. We then applied Word2Vec to this text data, learning vector representations of words that depict various behaviors in these living spaces. By clustering these words, we visualized individual behavioral tendencies. The result provided insights into customizing behavioral change support according to individual characteristics. Additionally, through the analysis of a behavioral network created based on the clustering, we identified the central behavior within the group and similarities in behavioral patterns among members. These findings suggest that understanding behavioral tendencies can lead to effective interventions, thereby enhancing the impact of behavioral change support.

    Download PDF (477K)
  • Masataka YUASA, Hideaki UCHIDA, Yohei YAMAGUCHI, Yoshiyuki SHIMODA
    Session ID: 1D4-GS-10-02
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    Recently, HEMS (Home Energy Management Systems), which manage and optimize the energy consumption of a house, have been spreading toward the realization of a decarbonized society. Although the detailed electricity consumption data measured by HEMS is useful from an energy management perspective, methods for utilizing it are not yet fully established. Therefore, this study aims to clarify the characteristics of residents' lifestyles by extracting states of rooms and home appliances using a GMM (Gaussian Mixture Model) and identifying state transitions with an HMM (Hidden Markov Model) based on HEMS data. In addition, the accuracy of this method is evaluated through surveys conducted with the residents. As a result, it was possible to extract the usage status of each room and appliance, as well as living patterns such as sleeping and going out, with relatively high accuracy from the HEMS data alone.

    Download PDF (482K)
  • Keigo TSUTSUI, Phuoc Thanh TRAN-NGOC, Hirotaka SATO, Takashi MATSUBARA
    Session ID: 1D4-GS-10-03
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    A complex system is a setup where many simple parts interact, creating certain overall effects. Examples of complex systems around us include weather patterns, economic functions, group movements of insects or birds, and the Internet's structure. Modeling these systems enables us to understand realistic phenomena and build detailed simulations. It is hard to model the entire system directly, yet modeling individual components alone does not represent the behavior of the entire system. In this paper, we modeled the entire complex system by decomposing it into components and applying methods suited to the nature of each component. In the experiment, we modeled the dynamics of insect group behavior. Considering the invariances of the system, we proposed a method involving two tricks that were effective for precisely modeling the system, resulting in high accuracy.

    Download PDF (780K)
  • a machine learning approach using clustering of shooting styles based on Wasserstein distance incorporating dynamic features and clustering of offensive roles
    Kazuhiro YAMADA, Keisuke FUJII
    Session ID: 1D4-GS-10-04
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    In a basketball game, the players compete against each other in a five-on-five match. In particular, it is important for players with different playstyles to cooperate and score efficiently during possessions, which take place many times in one game. In a previous study, the compatibility of players was examined using clustering results based on each player's statistics, called stats, but the findings obtained were considered to be limited due to the method of selecting features that included both offense and defense. This study focuses only on offense and aims to examine more specifically the impact of player combinations on scoring efficiency. In this study, two different methods are used to capture the playstyles of players on offense: one is a newly proposed method that clusters the tendency of shots based on the Wasserstein distance, the distance between distributions, which considers the set of shots of each player as a probability distribution using shooting features created from tracking data. The other is a method for clustering players' roles in the offense, which is a modification of the existing method. By creating and interpreting a machine learning model that predicts stats representing scoring efficiency from information on lineups based on these two clusterings, new insights into the compatibility of players were obtained.
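
    As a rough illustration of comparing shot sets as probability distributions, the sketch below computes the 1-Wasserstein distance between two empirical one-dimensional samples by integrating the absolute difference of their empirical CDFs. The one-dimensional setting and the shot-distance example are simplifying assumptions; the paper's shooting features are built from tracking data and are not specified in this abstract.

```python
def wasserstein_1d(xs, ys):
    """1-Wasserstein distance between two empirical 1-D distributions.

    Uses W1 = integral of |F_x(t) - F_y(t)| dt, where F_x and F_y are the
    empirical CDFs of the two samples.
    """
    xs, ys = sorted(xs), sorted(ys)
    points = sorted(set(xs) | set(ys))
    total = 0.0
    i = j = 0
    for left, right in zip(points, points[1:]):
        # Count the samples at or below the left edge of this interval.
        while i < len(xs) and xs[i] <= left:
            i += 1
        while j < len(ys) and ys[j] <= left:
            j += 1
        total += abs(i / len(xs) - j / len(ys)) * (right - left)
    return total


# Hypothetical shot distances (in meters) for two players.
player_a = [0.5, 1.0, 1.5, 7.0]
player_b = [6.5, 7.0, 7.2, 7.5]
print(wasserstein_1d(player_a, player_b))
```

    A matrix of such pairwise distances could then be passed to any clustering method that accepts precomputed distances.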

    Download PDF (316K)
  • Shengzhou YI, Toshiaki YAMASAKI, Toshihiko YAMASAKI
    Session ID: 1D4-GS-10-05
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    In our earlier study, we introduced a multimodal neural network designed to assess online interviews and appraise candidates' performance. However, the previous study focused solely on a subset of evaluation criteria, named question items, that assess distinct sections within the interview process. In this study, our evaluation criteria are extended to observation items that encompass the entire interview process rather than targeting specific sections. Because some samples lack the audio modality, we use prompt learning to distinguish between samples with complete modalities and those without the audio modality. Furthermore, we apply a re-sampling method and margin ranking loss to improve the model's robustness to imbalanced distributions. In the experiments, the prompt learning and class-imbalanced learning methods improved the prediction accuracy, and the proposed model achieved an average accuracy of 67.41% in binary classification for the extended eight criteria, providing a more holistic assessment of candidate performance.

    Download PDF (184K)
  • Hiroto HORIMOTO, Ryusei KIMURA, Takahiro TANAKA, Shogo OKADA
    Session ID: 1D5-GS-10-01
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    Automobiles are essential to society, but accidents involving older drivers have risen. Driving assistance systems have been developed to address this problem. The next goal is to implement a system that provides adaptive assistance suitable for drivers with different characteristics. Thus, accurately estimating drivers' characteristics is crucial. Some studies use in-vehicle sensor data obtained through a Controller Area Network (CAN) for this estimation, but this requires additional equipment for data collection. This study focuses on developing a psychological driving style recognition model from Global Positioning System (GPS) data, which are easily accessible. The experimental results show that the model with GPS data achieved F1-macro and AUC greater than the random-assignment baseline on seven and eight items of the Driving Style Questionnaire (DSQ), respectively. Furthermore, a comparison with the model using CAN data suggests that the GPS-based model works well for DSQ estimation. This GPS-based model contributes to developing personalized systems.

    Download PDF (346K)
  • Ryoma NAKAMURA, Masaki MATSUDAIRA, Daisuke OKUYA
    Session ID: 1D5-GS-10-02
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    We are studying a method to detect traffic flow anomalies using probe data. This is expected to achieve early and automatic detection of traffic anomalies as an alternative to human monitoring. This paper proposes a method to detect anomalies using probe data and reports the results of applying the proposed method to actual probe data. The proposed method detects anomalies through three major decision stages. First, the occurrence of traffic congestion is determined from the speed in the probe data. Next, the method checks whether there has been a sudden increase in vehicles braking or turning suddenly at that location. Finally, it checks whether the congestion is due to traffic concentration based on past statistics, and if the congestion is not attributed to traffic concentration, an anomaly is reported. The evaluation of our method with actual data confirmed that it can detect anomalies with high precision, and that for most traffic anomalies, detection occurs at timings equivalent to human surveillance.
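
    The three decision stages can be sketched as a simple rule chain. All field names and threshold values below are hypothetical, and the sketch assumes that an anomaly is reported only when all three stages agree; the abstract does not state how the stages are combined or which thresholds are used.

```python
from dataclasses import dataclass


@dataclass
class SegmentObservation:
    mean_speed_kmh: float              # average probe speed on the road segment
    sudden_event_count: int            # sudden braking/turning events in the current window
    baseline_event_count: float        # typical number of such events (assumed available)
    historical_congestion_rate: float  # share of past comparable time slots that were congested


def is_anomaly(obs, speed_threshold_kmh=20.0, event_ratio=3.0, usual_congestion_rate=0.5):
    """Return True if the segment looks anomalous (all thresholds are illustrative)."""
    # Stage 1: is the segment congested, judging from probe speed?
    if obs.mean_speed_kmh >= speed_threshold_kmh:
        return False
    # Stage 2: has sudden braking or turning increased sharply at this location?
    if obs.sudden_event_count < event_ratio * max(obs.baseline_event_count, 1.0):
        return False
    # Stage 3: is congestion here usual at this time (traffic concentration)?
    if obs.historical_congestion_rate >= usual_congestion_rate:
        return False
    return True


print(is_anomaly(SegmentObservation(12.0, 9, 2.0, 0.1)))
```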

    Download PDF (617K)
  • Ryota MIMURA, Kota SHIMOMURA, Atsuya ISHIKAWA, Osamu ITO, Kazuaki OHMO ...
    Session ID: 1D5-GS-10-03
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    Consideration of traffic risk in driver assistance systems and automated driving technology is important in preventing traffic accidents. Traffic risks are considered to be contained in image information. However, it is difficult to explain traffic risk in driving scenes from image information alone, and research in this area has not yet progressed sufficiently. In this study, we propose a multimodal framework that can explain traffic risks by using GIS data and street images. This framework identifies the coordinates of high-risk areas from traffic accident risk maps created based on GIS data and trains a multimodal network using street images associated with those areas. By doing so, we construct a framework that effectively explains traffic risk in an arbitrary scene. Experimental results show that the proposed framework can generate captions that explain traffic risks for high-risk areas based on GIS data.

    Download PDF (517K)
  • Atsuya ISHIKAWA, Koki INOUE, Kota SHIMOMURA, Kazuaki OHMORI, Ryuta SHI ...
    Session ID: 1D5-GS-10-04
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    With the spread of driver assistance systems and autonomous driving technologies, their effectiveness in reducing traffic accidents has been discussed. However, for a further reduction of accidents, it is crucial to explain traffic accident risks and analyze their mechanisms. Research on explainable multimodal networks for driving scenes has attempted methods for generating captions by considering recognizable objects using metadata. Such methods typically focus on generating captions for dynamic objects, like humans. However, to explain traffic accident risks in driving scenes, static risks caused by road signs and road structures should also be considered during caption generation. Existing large-scale multimodal networks face difficulties in generating captions that address these types of road environment risks. To tackle this challenge, we propose a caption generation method that leverages prompt engineering to include both dynamic objects and static potential risks. Additionally, experiments using the generated captions confirmed the capability of producing captions that consider both dynamic objects and static potential risks.

    Download PDF (319K)
  • Takumi YAMAMOTO, Yuichi SEI, Yasuyuki TAHARA, Akihiko OHSUGA
    Session ID: 1E5-GS-5-01
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    In recent years, reinforcement learning has attracted attention not only for mastering games but also in other fields. Cooperative game AI research has mainly focused on multiplayer games. However, collaborative tasks in which a human and an AI jointly control one character have not been widely studied. This study focuses on the cooperative manipulation of a fighting game character. We have previously proposed a support AI, an AI that supports people in a fighting game. However, since only one random player was used for training the support AI, it could not cooperate well with players. Therefore, we used three different types of players in the training of the support AI. These players were not random but attack, balance, and defense AIs, each trained with different rewards. In the experiment, we asked subjects to use the support AIs. There was little difference between the previous support AI and the proposed method in the subjective evaluation. However, game scores were about 7.7% higher for the proposed method.

    Download PDF (704K)
  • Riko NAKAZATO, Katsuhide FUJITA
    Session ID: 1E5-GS-5-02
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    Supply chain ordering management (SCOM) has attracted attention due to structural changes in supply chains (SCs). SCOM can be modeled by reinforcement learning, in which the SC is regarded as an environment and the companies belonging to the SC as agents. Most previous studies assume that each agent's information is shared with all agents in the SC. In practice, however, it is difficult for companies to fully disclose their own information to other companies, and companies can only communicate with each other based on partial information. Therefore, a learning model that appropriately sets the range of information that each agent shares is necessary. This study focused on linear multi-stage SCs and proposed a deep reinforcement learning model that determines an ordering policy that maximally reduces inventory costs while restricting the range of information shared among agents. The experimental results demonstrated that the proposed model can achieve a better inventory cost than that of the previous study.

    Download PDF (499K)
  • Yoshitaka ISOBE, Koichi MORIYAMA, Atsuko MUTOH, Kosuke SHIMA, Tohgotoh ...
    Session ID: 1E5-GS-5-03
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    In a multi-agent environment where multiple agents exist, it is often impossible to maximize the rewards of all agents simultaneously due to interference among agents. Therefore, it is difficult to learn cooperative behavior with reinforcement learning, which pursues the maximization of rewards. On the other hand, under the intrinsically motivated reinforcement learning (IMRL) framework, which refers to multiple pieces of information when learning and making decisions, Sequeira et al. identified a useful evaluation function for decision making in single-agent environments with genetic programming (GP). In this study, we apply this approach to a multi-agent environment. We test whether GP can identify a useful evaluation function for learning the cooperative behavior of multiple independently learning agents that capture prey in a pursuit problem.

    Download PDF (299K)
  • Keisuke FUJII, Kazushi TSUTUSI, Atom SCOTT, Hiroshi NAKAHARA, Naoya TA ...
    Session ID: 1E5-GS-5-04
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    When modeling real-world biological multi-agents with reinforcement learning, there is a domain gap between the source real-world data and the target reinforcement learning environment. Therefore, the target dynamics are adapted to the unknown source dynamics. In this study, we propose a reinforcement learning method that uses information obtained by adapting source action to target action in a supervised manner as a method for domain adaptation in multi-agent reinforcement learning from real-world demonstrations. In limited situations such as 2vs1 chase-escape, 2vs2 and 4vs8 in soccer, we show that the agent learned to imitate the demonstrations and obtain rewards compared to the baseline.

    Download PDF (580K)
  • Tomoki JINNO, Taku ISHIGANE, Naoki INOUE, Kei WAKABAYASHI
    Session ID: 1E5-GS-5-05
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    Observational learning is a method of learning by observing the behavior of others. While the mechanisms and conditions for the emergence of observational learning have been studied using biological approaches, research methods using computational models have attracted attention in recent years. However, previous research using computational modeling has been limited to experiments on specific types of observational learning. In this study, we examined conditions for the emergence of more complex observational learning from two perspectives: the external conditions, such as the environment and reward values, and the internal conditions of the reinforcement learning algorithms. Experimental results revealed that, under high-difficulty conditions in which existing reinforcement learning algorithms struggled to achieve the task, the task could be mastered only with PT-SEAC, the method proposed in this study. These results suggest that cognitive functions may play an important role in complex observational learning in order to share the behavior of the other agent as one's own experience.

    Download PDF (647K)
  • Yuma YOSHINAGA, Shinichiro MANABE, Osamu TORII
    Session ID: 1F3-GS-1-01
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    In recent years, in manufacturing, there has been considerable movement toward analyzing data obtained from the observation of products to identify the factors that determine individual features. In actual analyses, although there are many candidate factors, the number of data points whose features are observable is small; consequently, factor inference is often difficult. Lasso is considered to be one of the effective solutions to this problem. This study focuses on objects that satisfy the following two conditions: (1) multiple different features are observed in the same individual, and (2) each feature is represented by a count value that follows a Poisson distribution. In this study, we extend Lasso to suit objects satisfying these two conditions, formulate the problem, and derive a solution algorithm. We demonstrate that the proposed method can estimate the factors more accurately for synthetic data than the existing Lasso.

    Download PDF (776K)
  • Akira KITAOKA, Riki ETO
    Session ID: 1F3-GS-1-02
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    This paper deals with an inverse optimization problem (IOP) for mixed integer linear programming (MILP), which is, for example, to find which aspects are important in shift scheduling. Known methods provide solutions to the IOP for convex programming. However, for the inverse problem of MILP, there was no efficient way to reduce the prediction loss to zero. To effectively minimize the prediction loss in MILP, this paper reduces it to the problem of minimizing the suboptimality loss, using the equivalence of the suboptimality loss and the prediction loss. In MILP, there exist γ, ε > 0 such that for almost every true weight, we estimate the error between the true and learned weights as O(exp(-γk^{1/2} + (1/ε^2) log k)), where k is the number of updates in the projected subgradient method, and we reduce the prediction loss to zero in a finite number of updates. We confirm these results in numerical experiments.

    Download PDF (310K)
  • Yuta TSUCHIYA, Masaki HAMAMOTO
    Session ID: 1F3-GS-1-03
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    Schedules obtained from optimization engines often contradict human intuition. “Why did the optimal plan include something that I would not choose?” is one of the fundamental questions in eXplainable AI Planning (XAIP). Perturbation-based explanations evaluate the effect of input factors such as constraints and variables on counterintuitive states in solutions. However, a huge computation time is required to solve the optimization problem for all cases in which each candidate factor is present or absent. In this paper, we propose an accelerated branch and bound method for repeated computations in perturbation-based explanations. This method not only reuses the optimal solution under different constraints but also employs a relaxed search criterion: exploring whether an optimal solution exists within the state of interest, instead of seeking the solution itself. Through numerical experiments on a typical personnel assignment problem, we show that our approach can reduce the calculation time under various parameter settings.

    Download PDF (557K)
  • Daiki MORINAGA, Youhei AKIMOTO
    Session ID: 1F3-GS-1-04
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    Evolution Strategy (ES) is a promising class of algorithms for Black-Box Optimization (BBO), in which an algorithm queries only the objective function value. Despite its practical success, the theoretical analysis of continuous BBO algorithms is still underdeveloped. In this study, the convergence rates of the worst case and the best case of the (1+1)-ES are derived on $L$-strongly convex and $U$-Lipschitz smooth functions and their monotone transformations. It is proved that the order of those rates is proportional to $1/d$, and in the worst case, the convergence rate is proportional to $L/U$. These results show that the convergence rate of the (1+1)-ES is competitive with those of other derivative-free optimization algorithms that exploit $U$ as a known constant.
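
    For readers unfamiliar with the algorithm analyzed here, the following is a minimal sketch of a (1+1)-ES with a success-based step-size rule. The update factors (chosen so that the step size is stationary at a success rate of 1/5), the sphere test function, and all parameter values are assumptions for illustration; the abstract does not specify the exact variant analyzed in the paper.

```python
import math
import random


def one_plus_one_es(f, x0, sigma0=1.0, iterations=2000, seed=0):
    """Minimize f with a (1+1)-ES using only comparisons of function values."""
    rng = random.Random(seed)
    x, fx, sigma = list(x0), f(x0), sigma0
    d = len(x)
    for _ in range(iterations):
        # Sample one offspring from an isotropic Gaussian around the parent.
        candidate = [xi + sigma * rng.gauss(0.0, 1.0) for xi in x]
        fc = f(candidate)
        if fc <= fx:
            # Success: keep the offspring and enlarge the step size.
            x, fx = candidate, fc
            sigma *= math.exp(1.0 / d)
        else:
            # Failure: keep the parent and shrink the step size.
            sigma *= math.exp(-0.25 / d)
    return x, fx, sigma


def sphere(x):
    return sum(v * v for v in x)


best_x, best_f, final_sigma = one_plus_one_es(sphere, [3.0] * 5)
print(best_f, final_sigma)
```

    Because the algorithm only compares function values, replacing `f` with a strictly increasing transformation of `f` leaves the search trajectory unchanged, which is why results of this kind extend to monotone transformations of the objective.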

    Download PDF (291K)
  • Akihiro TAKEMURA, Katsumi INOUE
    Session ID: 1F3-GS-1-05
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    We propose a method that integrates data-driven approaches and symbolic reasoning in Neural-Symbolic AI (NeSy). This method evaluates implication rules and constraints in a differentiable way by using the output of neural networks and logic programs embedded in matrices, enabling efficient learning under distant supervision where direct labels are not provided. When the number of training data was fixed, our method achieved accuracy comparable to or higher than the existing methods in most tasks and completed the learning process faster than the existing methods. These results demonstrate the effectiveness of our proposed method as an approach for achieving high accuracy and rapid learning in NeSy.

    Download PDF (287K)
  • Gen LI, Takeichiro NISHIKAWA, Yousuke ISOWAKI, Kazuki ISE, Naoki KUROK ...
    Session ID: 1F4-GS-10-01
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    Machine learning molecular dynamics (MLMD) has gained attention due to its ability to perform large-scale, long-time simulations of materials that were previously impossible. Despite recent progress in the force prediction accuracy of machine learning force fields, high force accuracy does not always guarantee a successful simulation. In this study, we investigate the factors contributing to simulation failure and propose a novel loss function that can lead to simulation success. Our analysis using the MD17 dataset reveals that light atoms frequently come abnormally close to other atoms, and that the acceleration error for light atoms is relatively large. Furthermore, the new loss function, which takes acceleration error into account, is shown to prevent simulation failure or extend the time until failure. Therefore, we consider that reducing the acceleration error is important for machine learning force fields.

    Download PDF (497K)
  • Yin Kan PHUA, Tsuyohiko FUJIGAYA, Koichiro KATO
    Session ID: 1F4-GS-10-02
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    Functional polymers are essential materials supporting modern society and are being actively researched in an experiment-centric manner. The use of artificial intelligence (AI) or machine learning (ML) is expected to further improve research efficiency, but the low transparency and interpretability of AI deter researchers from trusting it. This study built an explainable AI (XAI) to predict a property of anion exchange membranes, a kind of functional polymer. The study was conducted in four steps: 1. Construction of an in-house database (DB); 2. Digitization of polymer structure using existing descriptors; 3. Construction of an ML model; 4. Calculation and analysis of the Shapley (SHAP) values for each explanatory variable to evaluate explainability and transparency. No open DB is available for the target material in this study, hence an in-house DB consisting of structural property data of around 300 polymers was built. From steps 2 and 3, we built an ML model with a test data prediction accuracy of 0.7983. Carrying out step 4, we found that AMID_N, an explanatory variable derived from a descriptor that strongly correlates polymer substructure with its property, is important. The discovery of such important features strongly supports chemical interpretation, and we thereby successfully obtained an XAI that can be used in the experimental cycle.

    Download PDF (656K)
  • Satsuki NISHIMURA, Coh MIYAO, Hajime OTSUKA
    Session ID: 1F4-GS-10-03
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    In theoretical particle physics, various hypothetical models are proposed to explain unsolved problems of particle phenomena. To validate them exhaustively, theoretical predictions should be compared with experimental data. However, the parameter space is large in general, so it is difficult to analyze the models numerically at low cost. In this work, considering such a situation, we focus on matter particles called quarks and leptons, and improve a method to explore their flavor structure with reinforcement learning. We utilize Deep Q-Network for one kind of model, and train neural networks on the integer charges of quarks and leptons. The results show that there are indeed solutions that reproduce the experimental and renormalized masses of the quarks and leptons. On the other hand, the results suggest that appropriate parameters are very scarce when considering domain-wall problems, which are severely constrained by cosmological observations. Given its usefulness for such analysis, we expect that reinforcement learning can be applied to the verification of realistic particle models.

    Download PDF (304K)
  • Tsuyoshi ISHIZONE, Yasuhiro MATSUNAGA, Sotaro FUCHIGAMI, Kazuyuki NAKA ...
    Session ID: 1F4-GS-10-04
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    With the development of machine learning technology and improvements in computational power, many new protein structures have been revealed. Among these advances, AlphaFold2 has brought a breakthrough in protein structure prediction; however, most of the structures that have been elucidated so far are only the most stable structures, and there are still issues to be solved for proteins with multiple stable states or no stable states. It is said that approximately 30 percent of proteins are intrinsically disordered proteins with unstable structures, and it is essential to elucidate the dynamic mechanisms of proteins in relation to their biological functions. Generally, protein dynamics is described by molecular dynamics simulation. Still, since it is a stochastic calculation, a huge amount of computational time is required to cover the manifold regarding transitions. Enhanced sampling (ES) reduces computational time by accelerating the search by adding a bias to the potential. This study proposes a representation learning method for leading to ES potentials. The proposed method is a contrastive learning-based method, and we show that it can construct embeddings suitable for capturing conformational dynamics.

    Download PDF (295K)
  • Takuma SHIBAHARA, Yasuho YAMASHITA, Tatsuya OKUNO, Takaharu HIRAYAMA
    Session ID: 1F4-GS-10-05
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    Recent advances in machine learning have generated interest in its application to drug discovery. Several models have been developed for generating molecular structures, including character-based models that encode these structures as strings, graph-based models that capture atomic bond connectivity, and 3D-based models that depict the spatial positions and bonding of atoms. This study focuses on character-based generative models, which promise to interpret complex instructions regarding compound attributes through natural language, facilitating the targeted generation and refinement of molecular structures. The approach developed harnesses Large Language Models (LLMs) to create compound structures by conducting additional pre-training. The experiments involved adapting the LLaMA-2 7B model with a dataset of small molecules. The efficacy of the adapted model was compared against the JT-VAE, a graph-based generative model tailored for compounds, using the MOSES benchmark for evaluation. Our findings suggest that the LLaMA-2 7B model has potential in advancing the field of drug design, as it is competitive with, and shows superiority over, the JT-VAE in compound generation.

    Download PDF (327K)
  • TOMOKA AZAKAMI, HIROKI ASAI
    Session ID: 1F5-GS-10-01
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    This research introduces a unique approach to examining causal relationships amongst attributes of specific customer groups who conducted contract procedures on web pages using causal discovery. The aim here is to lessen the processing burden during causal exploration and uphold accuracy. To tackle this challenge, we structure customer groups into tiers, depending on their web page trajectories, take out correlated attributes, and suggest a process for causal exploration within each tier. The experiment's outcomes substantiate that this strategy mitigates the processing load by lessening the quantity of attributes processed concurrently, while also generating a graph signifying causal relationships among attributes. This method offers an efficient strategy for scrutinizing causal relationships in customer groups relying on their web page visit history.

    Download PDF (553K)