Proceedings of the Annual Conference of JSAI
Online ISSN : 2758-7347
38th (2024)
  • Yuka HASHIZUME, Atsushi MIYASHITA, Li LI, Tomoki TODA
    Session ID: 1O4-OS-29a-01
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    To achieve a flexible MIR system, it is desirable to calculate music similarity by focusing on multiple partial elements of musical pieces and allowing users to select the element they want to focus on. Our previous study proposed using each instrumental sound signal to calculate music similarity with a separate instrument-dependent network, but using individual instrument signals as queries in a search system is impractical. In this paper, we propose a method to compute similarities focusing on each instrument with a single network that takes mixed sounds as input. We design a single similarity embedding space with disentangled dimensions for each instrument, extracted by Conditional Similarity Networks, which are trained with a triplet loss using masks. Experimental results show that (1) each sub-embedding space can hold the characteristics of the corresponding instrument, and (2) the selection of musical pieces by the proposed method agrees with human judgment under limited conditions.

    Download PDF (862K)
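The masked triplet loss mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the 4-dimensional embeddings, the instrument names, the masks, and the margin are hypothetical values chosen for illustration, and a real system would learn the embeddings with a neural network.

```python
import math

def masked_distance(x, y, mask):
    """Euclidean distance restricted to the dimensions where mask is 1."""
    return math.sqrt(sum(m * (a - b) ** 2 for a, b, m in zip(x, y, mask)))

def masked_triplet_loss(anchor, positive, negative, mask, margin=0.2):
    """Hinge loss: the positive should be closer than the negative by `margin`,
    measured only in the sub-space selected by `mask`."""
    d_pos = masked_distance(anchor, positive, mask)
    d_neg = masked_distance(anchor, negative, mask)
    return max(0.0, d_pos - d_neg + margin)

# Hypothetical layout: dims 0-1 describe "drums", dims 2-3 describe "bass".
MASKS = {"drums": [1, 1, 0, 0], "bass": [0, 0, 1, 1]}

anchor = [0.0, 0.0, 1.0, 1.0]
positive = [0.1, 0.0, 5.0, 5.0]  # similar drums, different bass
negative = [3.0, 4.0, 1.0, 1.0]  # different drums, identical bass

# The same triplet is satisfied under the drums mask but violated under the bass mask.
drums_loss = masked_triplet_loss(anchor, positive, negative, MASKS["drums"])
bass_loss = masked_triplet_loss(anchor, positive, negative, MASKS["bass"])
print(drums_loss, bass_loss)
```

The point of the mask is that one triplet can be "correct" for one instrument and "wrong" for another, so each group of dimensions is only pushed by the triplets defined for its own instrument.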
  • Ryosei KAWAGUCHI, Haruhiro KATAYOSE
    Session ID: 1O4-OS-29a-02
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    In recent years, media generated by generative AI has received a lot of attention, and diffusion-based text-to-music automatic composition systems in particular have been attracting interest. This research focuses on the commonality that both music and language can be expressed symbolically, and explores the ability of GPT-4 to compose music by treating music as plain text written in ABC notation. As a result of the experiment, we confirmed that the text has a certain compositional ability by matching the latent space architecture and musical knowledge. Based on this, we developed the "Grazie Piano Tuner", a composition support system that can change the melody by controlling emotional parameters. We are currently working on implementing a means to control emotional parameters as time-series information. In the presentation, we will discuss the possibilities and challenges of a composition support system using an LLM while introducing actual examples produced with this system.

    Download PDF (1262K)
  • Moyu KAWABE, Ichiro KOBAYASHI
    Session ID: 1O4-OS-29a-03
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    Diffusion process-based models have been attracting attention in the field of music generation in recent years due to their high quality and scalability. Research has also been conducted on generating music on demand using diffusion models. However, it is not easy to control complex attributes in diffusion models. In addition, there have not been many studies on music generation with an emphasis on emotion, which is closely related to music. In this study, we aim to develop a method that can generate a variety of music using a diffusion model, taking emotion as an input and controlling generation according to the musical attributes corresponding to that emotion. For the diffusion model, we used the Diffusion-LM method, which can be controlled by applying a classifier at each denoising step; the classifier uses musical attribute values to identify emotions, so that music is generated based on the input emotion information.

    Download PDF (656K)
  • Hiroaki TAKAHA, Masako HIMENO, Mahiro WATANABE, Ikuko SHIMIZU
    Session ID: 1O4-OS-29a-04
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    We introduce examples of how various information processing technologies can make highly specialized music education more effective. The examples include teaching material for learning complex rhythmic deviations and teaching material that gives students feedback by analyzing images of their 3D posture during performance. Further developments will also be discussed.

    Download PDF (615K)
  • Gou KOUTAKI
    Session ID: 1O4-OS-29a-05
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    We are developing a semi-automatic musical instrument playing system for wind and string instruments in which keys and strings are automatically fingered by a machine, and blowing and string plucking are done by a human. This allows even beginners and those who cannot move their hands well to experience music. Furthermore, by using multiple semi-automatic instruments, it is possible to create an ensemble. This paper introduces the system, performance examples, and future prospects.

    Download PDF (1553K)
  • Tsutomu FUJINAMI
    Session ID: 1O5-OS-29b-01
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    The pandemic prohibited musicians from performing in public spaces. The situation forced musicians to explore the possibilities of Internet technology such as remote communication tools. Musicians played music at multiple sites and delivered it to listeners via the Internet. This experience opened our eyes to the variety of forms in which music can be played. We organized a concert on 22nd October 2023, where two groups of musicians played at different sites, one in Kanazawa and the other in New York. In the first half of the performance, they played “Ryoanji,” composed by John Cage. A program received the sounds from both venues and used them to manipulate the visuals shown to the audience at both sites. It incorporated some ideas from artificial life to make the visuals spontaneous. The animation was recorded and served as the score for the second half of the performance. Musicians at both sites saw the same visuals and improvised on them under the instruction of the director. The piece was designed by the composer, Tomomi Adachi. The performance was well received at both sites. Apart from the technical challenges, it raises the question of how we can make spontaneous music that is distributed around the world. Our performance is one answer to that question.

    Download PDF (618K)
  • Dohjin MIYAMOTO, Shuhei OGAWA, Naomi KOBAYASHI, Maiko KODAMA
    Session ID: 1T3-OS-32a-01
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    There has been a recent increase in the use of Science Fiction Prototyping (SFP) for new business development and corporate vision development within organizations. This approach involves creating, sharing, and discussing science fiction works to co-create visions of the future. In SFP, there have been experiments with the use of AI in workshops. However, the methodology and challenges of SFP itself are not yet fully organized, and there is a lack of accumulated knowledge on when AI integration is desirable. Furthermore, there is a demand for the use of AI in evaluating science fiction works. This study outlines the opportunities and challenges of using AI in the field of SFP from various perspectives, including business administration, psychology, linguistics, and behavioral science. We organize these opportunities and challenges according to the three phases of SFP and suggest that there is potential for further collaboration between SFP practitioners and AI researchers.

    Download PDF (393K)
  • Narrative Generation through a Large-Scale Language Model Facilitated by Conversational Robots Enabling Human Participation
    Mayu OMICHI, Hideyuki TAKAHASHI, Midori BAN, Takamasa IIO, Yohei YANAS ...
    Session ID: 1T3-OS-32a-02
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    This study examines whether robots equipped with backstories created using LLM can maintain human interest through dialogue. In the present system, the robot not only has its backstory and develops it using LLM, but also interacts with humans based on the content of the backstory and generates a story based on feedback obtained from the human in the interaction. In this presentation, we will report the results of our verification through evaluations by dialogue participants and third parties to see if developing such a system that allows users to be involved in the generation of the robot's backstory can contribute to sustaining interest in the robot.

    Download PDF (396K)
  • Keisuke SATO, Kunhao YANG, Kazuhiro UEDA
    Session ID: 1T3-OS-32a-03
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    In order to realize an idea support system, it is important to identify the characteristics of information that individuals should be exposed to in order to achieve a high level of creativity. We collected and statistically analyzed datasets related to three large online communities (Cities: Skylines, SCP-wiki, and Archive of Our Own) engaged in mod development and novel writing to examine whether the quality and diversity of other people’s ideas referred to have a positive effect on idea generation. Our analysis revealed the following three findings: (1) the quality diversity of reference ideas has the most positive impact on the quality of generated ideas when it is neither high nor low, (2) the content diversity of reference ideas has a negative impact on the quality of generated ideas, and (3) the quality of reference ideas has a negative impact on the quality of generated ideas when it is extremely high.

    Download PDF (1279K)
  • Hajime MURAI, Mizuki AOYAMA, Shoki OHTA, Takaki FUKUMOTO, Arisa OHBA, ...
    Session ID: 1T3-OS-32a-04
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    In order to realize cross-genre automatic story generation, a hybrid automatic story generation model was adopted. The hybrid model combines automatic story structure generation, based on structural analysis of existing works, with text generation by a large language model. First, about 1,500 highly rated Japanese entertainment stories were selected from five genres ("Adventure", "Battle", "Horror", "Love", and "Detective") and analyzed. Seventeen story factors corresponding to frequently appearing story plots were statistically extracted. A story structure generation system was then developed using these 17 story factors. The resulting structures were converted to prompts, and final plots were generated by a large language model. The system is able to generate stories that are understandable and that reflect the 17 extracted story factors.

    Download PDF (717K)
  • Kengo WATANABE, Takashi KAWAMURA, Reo KOBAYASHI, Kzuma ARI, Akifumi IT ...
    Session ID: 1T3-OS-32a-05
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    In recent years, the fluidity of media contents has rapidly expanded, and with it, the burden on story creation has increased. AI technology, especially large language models (LLMs), has shown remarkable development, and there are high expectations for its utilization, though methods for its use have not yet been established. There is a demand for systems that provide beneficial support to story creators unfamiliar with AI. Therefore, we have developed an interactive story generation system as a web application that realizes the use of LLMs based on narrative structure analysis through intuitive operations. This system functions as a go-between for story creators and LLMs, supporting story creation. When used by several groups of professional story creators, the system received high satisfaction for its ease of use and the production flow that entails repeatedly applying LLMs to refine the narrative. Through this research, the usefulness of our system in the methodology of utilizing LLMs in story creation was suggested.

    Download PDF (636K)
  • Yuki ITOH, Kosuke SASAKI, Masakazu MORIGUCHI, Hisashi NODA
    Session ID: 1T4-OS-32b-01
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    We attempt to apply Science Fiction Prototyping (SFP) to the creation of new business. SFP is a methodology for discussing and sharing visions of the future with others by creating a vision of a not-yet-existing future through science fiction. Here, "creation of new business" means planning new business based on one's ideal vision of the future. In this study, we construct a new SFP framework for the creation of new business. The framework has six phases. In the Prepare phase, the way of thinking in SFP is presented. In the Design phase, the theme and outline of the creation are decided based on one's ideal vision of the future. In the Reveal phase, the worldview of the creation is refined by writing a novel. In the Sympathy phase, the products or services that exist in the worldview of the creation are summarized. In the Backcast phase, backcasting is performed from the worldview of the creation. In the How Now phase, what to do now is organized based on the backcasting. By thinking from one's ideal vision of the future with this framework, we can create a worldview that is helpful for creating new business. We therefore held a workshop using this framework with 4 participants. In this paper, we discuss the effects and issues of this framework.

    Download PDF (487K)
  • Tomoya MINEGISHI, Hirotaka OSAWA
    Session ID: 1T4-OS-32b-02
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    In this study, the authors propose a virtual agent system to support idea creation in SF prototyping. Among workshop methods for creating the future, SF prototyping is useful for exploring potential possibilities by removing constraints on participants. However, participants are required to actively create ideas during the discussions, which places stress on them. The authors consider that participants are under stress during the workshop when everyone is silent, so at such times the agent creates ideas and presents them to the participants. A preliminary experiment revealed that the group in which the agent supported the participants' idea generation had significantly less silence time during discussions than the group in which the agent provided no support.

    Download PDF (576K)
  • Nakamura nac KENJI
    Session ID: 1T4-OS-32b-03
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    The development of generative AI has been remarkable in recent years, and the sentences generated by ChatGPT4 are indistinguishable from those written by humans. As of August 2023, the accuracy of image generation AI is also high, and it is now possible for generative AI to replace some of the actions that have been performed by humans. While there are concerns about its negative impact on children's learning in schools, it is also expected to be a tool that encourages imagination. However, generative AI requires the use of prompts to give instructions, which can be difficult for children in the early stages of development. We conducted a basic study to examine whether prompts differ between juveniles and adolescents, who have different vocabularies. Ten juveniles (average age 10.8 years, 3 males and 7 females) and 10 adolescents (average age 18.1 years, 5 males and 5 females) were recruited from the public and asked to generate an image they had in mind. We then investigated how their prompts changed on the way to the image as imagined. The results showed that the number of prompts was by far the largest among adolescents. The results of the interview survey and the prompt changes in juveniles and adolescents are reported.

    Download PDF (313K)
  • Shinzan KOMATA, Yuki ZENIMOTO, Takehito UTSURO
    Session ID: 1T4-OS-32b-04
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    The speaker to dialogue attribution task, which identifies the speaker of an utterance in a novel, is an essential task for the analysis of novels and their characters. In order to perform this task, it is necessary to attribute character mentions to utterances. This paper applies large language models to the task of determining whether the speaker of the utterance exists in sentences immediately preceding or subsequent to the utterance, and then divides the entire set of utterances into two groups. Among these, for the group of utterances whose speakers are judged to exist in sentences immediately preceding or subsequent to the utterances, it was shown that the large language models can perform the task of attributing character mentions to utterances with higher accuracy compared to the entire set of utterances.

    Download PDF (2219K)
  • Working Paper
    MIWA NISHINAKA
    Session ID: 1T4-OS-32b-05
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    Immersion is said to correlate with “speculative thinking” in science fiction readers. This phenomenon facilitates acceptance of the idea that there may be a future previously unimagined. For this reason, it is extremely important for future envisioning methods that use stories to incorporate an empathic factor that generates immersion in the participants. Based on prior research, the current article explained the importance of incorporating empathic factors into the future envisioning method and examined the factors necessary for empathy toward future generations. Suggestions for future research to improve the method for generating better and more innovative future ideas were presented.

    Download PDF (360K)
  • Reo FUJIKI, Tomoya MINEGISI, Hirotaka OSAWA
    Session ID: 1T5-OS-32c-01
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    In this study, we analyzed the relationship between the number and amount of statements made during discussions in SF prototyping and the magnitude of actions in the skit part. SF prototyping is a backcasting method in which ideas about the future are created through the discussion process of creating science fiction. It is believed that this method can generate original ideas and activate participants. However, there are still few indicators for measuring the results of SF prototyping and participants' immersion in it. Therefore, this study analyzed the discussions during SF prototyping and the participants' behavior during skits based on the plots created afterwards. The results of the analysis showed that the more participants spoke during the discussion, the smaller their movements during the skit.

    Download PDF (499K)
  • Kenichi INOUE, Kei SUZUKI, Midori SUGAYA
    Session ID: 2A1-GS-10-01
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    In recent years, many EEG-based machine learning methods have been studied to provide objective and accurate assistance in the diagnosis of depression. Many studies have proposed machine learning methods for binary classification between depression and healthy controls. However, there are issues that binary classification does not take into account. For example, the risk of suicide differs between mild and severe depression. In this study, we construct a classification model of depression severity by machine learning using an EEG dataset and discuss the results. An open dataset consisting of EEG data from 116 individuals was used as the analysis data. Depression severity in the dataset was classified into four categories: normal, mild, moderate, and severe. An index extracted from the alpha wave of electrode Fp2 was selected as a feature with reference to related studies. A four-class classification of depression severity was performed using a random forest. The classification accuracy was 62.01%.

    Download PDF (322K)
  • -Modeling Efficient Question and Intervention Selection through Multitask Reinforcement Learning-
    Taiga SANO, Kaori FUJIMURA, Tae SATO, Masami TAKAHASHI, Masahiro KOHJI ...
    Session ID: 2A1-GS-10-02
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    To reduce the risk of lifestyle-related diseases among the Japanese population, there is a health guidance program to motivate people to improve their lifestyles through interviews conducted by public health nurses, dietitians, and other healthcare professionals. However, differences in the biological, psychological, and social characteristics of the interviewees make motivation difficult within a limited interview time, even for skilled interviewers. We are constructing an agent that can assist the interviewers by providing them with appropriate topics (questions) and suggesting lifestyle modifications that will best motivate the interviewee. Multitask reinforcement learning is used to avoid questions that would not suggest lifestyle improvement measures and to select only questions that are necessary for understanding the interviewee's characteristics and for suggesting improvement measures. To validate our initial version of the agent, we tested it in a simple simulation environment and confirmed that the proposed method is more effective than comparable methods.

    Download PDF (399K)
  • Naotoshi NAKAMURA, Hyeongki PARK, Masaharu TSUBOKURA, Shingo IWAMI
    Session ID: 2A1-GS-10-03
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    In the COVID-19 pandemic, vaccination was an important countermeasure. It is now known that after a person is vaccinated, antibody titers decline over time and efficacy declines, which can lead to breakthrough infections. Thus, it is important to identify immunocompromised populations with persistently low antibody titers in order to develop an effective vaccination strategy. In this study, we used a cohort in Fukushima Prefecture that received COVID-19 vaccines and followed their antibody titers over time. We reconstructed the individual-level dynamics of antibody titers using a mathematical model and identified individual variability. We also developed an antibody score that can predict individual antibody titers to some degree through simple calculations based on information such as underlying diseases and adverse reactions.

    Download PDF (243K)
  • Fusion of XAI and UQ with Surrogate model
    Yasuhiko MIYACHI, Osamu ISHII, Keijiro TORIGOE
    Session ID: 2A1-GS-10-04
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    Objectives: We propose XAI and UQ (Uncertainty Quantification) methods for a Clinical Decision Support System (CDSS). The underlying CDSS was presented at JSAI2022. Method: The XAI and UQ methods use the "same" surrogate model (k-NN Surrogate model), which is based on k-Nearest Neighbors. The XAI method is an Example-based Explanation. This model outputs information about the medical literature and diseases from instances of training data. The UQ method is Conformal Prediction. The Difficulty Estimator of this model outputs Difficulty scores. Through the "Processing to closest" step of the surrogate model, the predictions of the surrogate model are close to those of the main model. Conclusions: Our proposed XAI and UQ methods could be adapted for other CDSSs. Unlike current commercial LLMs, the prediction, XAI, and UQ of our CDSS can provide evidence and uncertainty information to medical professionals.

    Download PDF (407K)
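Conformal Prediction, named in the abstract above as the UQ method, can be sketched in its simplest split-conformal form for classification. This is a generic textbook variant and not the authors' k-NN surrogate or Difficulty Estimator: the calibration probabilities, the disease labels, and the miscoverage level `alpha` are hypothetical values used only for illustration.

```python
import math

def conformal_threshold(cal_scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of the calibration scores."""
    n = len(cal_scores)
    rank = math.ceil((n + 1) * (1 - alpha))  # 1-indexed rank of the quantile
    if rank > n:
        return float("inf")  # too few calibration points: keep every label
    return sorted(cal_scores)[rank - 1]

def prediction_set(label_probs, threshold):
    """All labels whose nonconformity score (1 - probability) is within the threshold."""
    return {label for label, p in label_probs.items() if 1.0 - p <= threshold}

# Hypothetical calibration data: probability the model assigned to the TRUE label.
cal_true_probs = [0.9, 0.8, 0.75, 0.3, 0.95, 0.7, 0.85, 0.2, 0.65]
cal_scores = [1.0 - p for p in cal_true_probs]
threshold = conformal_threshold(cal_scores, alpha=0.25)

# A confident prediction yields a small set; an ambiguous one yields a larger set.
confident_case = prediction_set(
    {"disease_A": 0.90, "disease_B": 0.07, "disease_C": 0.03}, threshold)
ambiguous_case = prediction_set(
    {"disease_A": 0.45, "disease_B": 0.40, "disease_C": 0.15}, threshold)
print(threshold, confident_case, ambiguous_case)
```

The size of the returned set is itself the uncertainty signal: a single label suggests the model is confident, while several labels tell the reader that more evidence is needed.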
  • Ryoko FUKUDA, Fumio HARADA, Takeshi IWAIDA, Chie MATSUMOTO
    Session ID: 2A1-GS-10-05
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    The “Majikami AI” system was developed to assist inexperienced caregivers in assessing and caring for care recipients based on the skills of expert caregivers called “Majikami”. A questionnaire survey was conducted to analyse the impact of Majikami AI on caregivers' attitudes and behaviors. As a result, caregivers who used Majikami AI frequently felt that they could use Majikami AI and daily care record data to formulate care plans and provide care. They were able to better understand the care recipients' condition from many aspects, so that they could be more aware of changes in the care recipients' condition and provide more appropriate care. In addition, some of the caregivers who frequently used Majikami AI felt that the care recipients were getting better.

    Download PDF (788K)
  • Shunki TOMIZAWA, Hidenori KAWAMURA, Tomohisa YAMASHITA, Soichiro YOKOY ...
    Session ID: 2A4-GS-10-01
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    Detailed information for comparisons between products is necessary in consumers’ product purchasing process, especially during the information search and choice evaluation phases. However, conventional product descriptions, which are the main source of information, tend to focus only on the product in question, and thus do not adequately express the differences between products. To solve this problem, garments are treated as target products, and a caption-generation method that emphasizes the differences between pairs of garment images using a deep-learning model for image caption-generation is proposed and its effectiveness verified. The proposed method selects and outputs captions that express differences in features from a set of captions generated for input-garment image pairs. Subject experiments confirmed that the proposed method accurately represented the feature differences between garments and provided useful information for consumers to compare garments. In particular, the proposed method is highly effective for garment pairs with similar features.

    Download PDF (290K)
  • Daiki INOUE, Tomonobu OZAKI
    Session ID: 2A4-GS-10-02
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    In this study, we report the initial experimental results on information extraction targeted at light novels. Specifically, annotations were performed on nine light novels to construct a foundational dataset. Subsequently, several types of data augmentation were applied, and named entity extraction was conducted using deep learning models. While the extraction accuracy achieved is not always sufficient, the experiments have revealed several insights into the handling of the unique writing style and vocabulary specific to light novels, which are pertinent to information extraction.

    Download PDF (256K)
  • Soichiro YOKOYAMA, Tomohisa YAMASHITA, Hidenori KAWAMURA
    Session ID: 2A4-GS-10-03
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    A selection mechanism for Japanese haiku, the world's shortest fixed form of poetry, is developed to select haiku of interest to the user from a set of haiku generated by a deep autoregressive model. This is achieved by training a deep model that estimates the probability of occurrence or similarity of the haiku to be selected by learning the user's previous haiku works. A total of 100 million haiku are generated and selected using a large-scale language model that has additionally learned 400,000 haiku. We additionally train a deep language model using several thousand haiku created by users in the past as training data, and decide which haiku to select based on the acquired model. With the cooperation of haiku poets, we evaluated the effectiveness of the autoregressive model and the masked language model by presenting the selection results obtained with different numbers of parameters. The experimental results revealed the high performance of the autoregressive model and the importance of using the ratio between the estimates of the model trained only on the case data and those of the model trained on haiku in general, rather than simply selecting the haiku with the highest estimated probability of occurrence.

    Download PDF (289K)
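The ratio-based selection described in the abstract above can be illustrated with a minimal sketch. It assumes that "case data" refers to the user's past haiku and that both models return a log-probability for each candidate; the candidate names and log-probability values are hypothetical, and no real language model is called.

```python
def rank_by_raw_probability(candidates):
    """Baseline: order candidates by the user-specific model's log-probability."""
    return sorted(candidates, key=lambda c: c["logp_user"], reverse=True)

def rank_by_log_ratio(candidates):
    """Order candidates by log p_user - log p_general, i.e. the probability ratio.
    This favors haiku that are typical of the user rather than typical of haiku overall."""
    return sorted(candidates,
                  key=lambda c: c["logp_user"] - c["logp_general"],
                  reverse=True)

# Hypothetical scores from a user-specific model and a general haiku model.
candidates = [
    {"name": "haiku_A", "logp_user": -10.0, "logp_general": -9.0},   # generic, likely everywhere
    {"name": "haiku_B", "logp_user": -12.0, "logp_general": -20.0},  # distinctive for this user
    {"name": "haiku_C", "logp_user": -15.0, "logp_general": -16.0},
]

best_raw = rank_by_raw_probability(candidates)[0]["name"]
best_ratio = rank_by_log_ratio(candidates)[0]["name"]
print(best_raw, best_ratio)
```

In this toy data the raw probability picks the generic candidate, while the ratio picks the one the user-specific model prefers much more strongly than the general model does.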
  • QIN YANG, Hidenori WATANABE
    Session ID: 2A4-GS-10-04
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    This study proposes a methodology for conducting science fiction (SF) prototyping workshops effectively without the need for professional storytellers, such as novelists. Our approach harnesses generative AI for collaborative storytelling support. This approach encompasses keyword-driven idea collection, AI-mediated storytelling, expert insight-based guiding frameworks, and evaluation processes for AI-generated outputs. Implemented in various workshops without professional storytellers, this approach markedly shortened the time to create a worldview that fairly reflects the participants' perspectives and efficiently enriched the details of narratives, enhancing the immersion into a fictional future world. By utilizing a critical attitude toward the output of the generative AI, the workshop maintained a fluid dialogue and provided a rich experience consistent with SF prototyping features such as free thinking. Results indicate that the proposed method has the potential to facilitate the spread of short, intensive workshops and general use for SF prototyping.

    Download PDF (755K)
  • Kazuhiro KUDO, Hirotaka OSAWA, Tomoya MINEGISHI
    Session ID: 2A4-GS-10-05
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    In this study, we conducted a Sci-Fi prototyping workshop and analyzed the participants' speech. Sci-Fi prototyping is a workshop method for creating new ideas as visions of the future, in which the ideas participants come up with while creating SF are used to encourage discussion among them. In order to make the discussion effective, it is important to analyze the discussion process and the interaction of the participants. Therefore, we conducted a workshop using a web application developed to record the contents and time of speech, and analyzed the data. As a result, it became clear that the difference in the amount of speech between different grades became smaller as the discussion progressed.

    Download PDF (407K)
  • Masashi YANAMAWAKI, Yutaka UTSUNOMIYA, Yuichi HIRAMATSU, Hironori KIMU ...
    Session ID: 2A5-GS-10-01
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    In recent years, efforts have been underway in cities around the world to create walkable street spaces that are comfortable and inviting to walk around. In these efforts, it is important to investigate user actions in order to understand how a comfortable street space is formed. However, performing these investigations manually is burdensome in terms of both time and cost, and is likely to result in a decline in quality. In this research, we are developing a human action detection technology using deep learning, with the aim of making this work more efficient and sophisticated. There are already many research developments and operational examples of action detection AI models. However, we could not find any that is specialized for this purpose and can be applied immediately. In this paper, we developed a model that detects the various actions listed in the guidelines prescribed by MLIT from camera images of street spaces. We then attempted to apply this model to understanding how actual street spaces are used. As a result, we were able to quantitatively understand the usage of actual street spaces on holidays and weekdays, demonstrating the potential for effective use of this model. Based on the results, we also considered how to combine AI and digital twins for building 3D models of real street spaces and solving problems.

    Download PDF (2226K)
  • Takahiro SUZUKI, Kengo OKANO, Hideki FUJII, Daisuke OKUYA
    Session ID: 2A5-GS-10-02
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    We aim to realize smooth traffic flow to solve social issues in transportation, such as traffic congestion and accidents. Traffic flow simulation lets us verify traffic measures virtually, which allows decisions on those measures to be made efficiently and cost-effectively. For the simulation to be useful, however, its precision must be high, so it is important to set appropriate parameters for the simulation model. Appropriate parameters are rarely known in advance, and they have often been determined by trial and error. In this study, we statistically estimate the parameters from traffic probe data using a data assimilation method called the ensemble Kalman filter. We determined the parameters for testing from the parameters estimated for each date. Traffic flow simulations run with the determined parameters reproduced traffic flow close to the actual one.
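
    As a rough illustration of the ensemble Kalman filter mentioned above, the following sketch shows one analysis step of the stochastic (perturbed-observation) variant on a toy problem. It is not the authors' traffic model: the function name `enkf_update`, the linear observation operator `H`, the noise covariance `R`, and the one-dimensional "parameter" are all assumptions made for this example.

```python
import numpy as np

def enkf_update(ensemble, obs, H, R, rng):
    """One stochastic EnKF analysis step (illustrative sketch).

    ensemble: (n_members, n_state) array of state/parameter samples
    obs:      (n_obs,) observation vector
    H:        (n_obs, n_state) linear observation operator (assumed)
    R:        (n_obs, n_obs) observation noise covariance (assumed)
    """
    n_members = ensemble.shape[0]
    # Sample covariance of the ensemble
    anomalies = ensemble - ensemble.mean(axis=0)
    P = anomalies.T @ anomalies / (n_members - 1)
    # Kalman gain K = P H^T (H P H^T + R)^{-1}, computed via a linear solve
    S = H @ P @ H.T + R
    K = np.linalg.solve(S, H @ P).T
    # Each member assimilates its own noisy copy of the observation
    perturbed_obs = obs + rng.multivariate_normal(
        np.zeros(len(obs)), R, size=n_members
    )
    innovations = perturbed_obs - ensemble @ H.T
    return ensemble + innovations @ K.T

# Toy example: a single unknown parameter observed directly with small noise
rng = np.random.default_rng(0)
prior = rng.normal(0.0, 1.0, size=(200, 1))
H = np.array([[1.0]])
R = np.array([[0.01]])
obs = np.array([5.0])
posterior = enkf_update(prior, obs, H, R, rng)
```

    In this toy setting the posterior ensemble concentrates near the observed value, because the observation noise is much smaller than the prior spread.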

    Download PDF (1344K)
  • Yanbo PANG, Kunyi ZHANG, Yurong ZHANG, Yoshihide SEKIMOTO
    Session ID: 2A5-GS-10-03
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    In recent years, with the advancement of smart cities and smart mobility, there has been an increasing need to efficiently and cost-effectively understand people's behaviors in detail, despite the constraints imposed by the protection of personal information. This study aims to effectively learn and predict the spatial patterns of people's daily activities by utilizing trip survey data collected from approximately 6 million individuals across 20 urban areas in Japan, employing large-scale language models. We adopt the Transformer architecture, which has demonstrated high performance in various tasks, and leverage unsupervised learning methods for robust transferability. This approach aims to integrate demographic information with trip data to generate more comprehensive and accurate synthetic human flow data. Furthermore, we explore how to utilize census-based behavioral data for the development of human flow models in developing countries, thereby opening new possibilities in this field.

    Download PDF (857K)
  • Naoya NOMOTO, Takahiro SUZUKI, Daisuke OKUYA
    Session ID: 2A5-GS-10-04
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    We are researching traffic flow simulation using probe and traffic counter data to mitigate congestion and reduce accidents. Forecasting traffic volume is essential for this. Traditional methods like the seasonal adjustment model can predict hourly traffic volumes. However, for finer simulations, forecasting is required at shorter intervals. This becomes computationally challenging due to increased complexity in parameter adjustment. Our solution involves dividing the traffic volume data for training and forecasting, enabling predictions at shorter intervals while reducing computational complexity. The traffic flow simulations done using the predicted volumes were efficient in reproducing actual traffic flows.

    Download PDF (836K)
  • Yuta TAKAHASHI
    Session ID: 2A5-GS-10-05
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    Data in the civil engineering field is often language-based, and many drawings and other materials exist only as image data and have not been digitized. There is a lack of digital human resources to process these data, and the amount of data is enormous. This study verified the feasibility of using chatbots based on large-scale language models to process these data. With some processing, ChatGPT was able to analyze data of different formats in xROAD, which suggests that a processing tool robust to data format is feasible.

    Download PDF (1766K)
  • Syota YANACHI, Hayato OKADA, Keisuke YONEDA, Daisaburo YOSHIOKA
    Session ID: 2A6-GS-10-02
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    In this report, we evaluate ChatGPT on the task of classifying inappropriate sentences on SNS. We show that prompting alone, including zero-shot CoT and few-shot learning, does not achieve high performance compared with a fine-tuned BERT model. Our results also show that a fine-tuned ChatGPT achieves an accuracy of over 90%, which is comparable to BERT.

    Download PDF (461K)
  • Takumi IRIE, Kouki NOGUCHI, Tatsuki KAWAMOTO, Taiga MATSUI, Naoto MORI ...
    Session ID: 2A6-GS-10-03
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    In online personal styling, stylists provide fashion styling based on the client's characteristics, purpose of use, season, and various other factors. The known information about a client may be insufficient or open to multiple interpretations, so stylists have to infer the client's essential needs. To address this, we construct an interactive fashion styling system using ChatGPT. The objective of our system is to resolve these gaps and ambiguities in the client's information, obtain the client's agreement on the styling direction, and enable stylists to provide more appropriate styling for clients. We use two ChatGPT instances for two types of tasks: one for dialogue management and the other for response generation. In addition, we use Retrieval-Augmented Generation (RAG) to introduce fashion knowledge into ChatGPT. In this way, the system achieves a high degree of flexibility and adaptability. We confirmed that the system using ChatGPT is superior to the conventional method in quantitative evaluations.

    Download PDF (552K)
  • Shojiro TSUTSUI, Michihiro KARINO, Kenichi KUROKI, Aya FUKUMOTO, Yusuk ...
    Session ID: 2A6-GS-10-04
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    In Japan, non-life insurance companies deliver products through agencies. Major insurance companies provide support through phone calls, emails, etc., at locations nationwide to ensure that their tens of thousands of agents can accurately handle customers, taking into account the characteristics and underwriting rules of a wide variety of insurance products. The documents to be consulted cover a vast number of complex rules, and because these are financial products, precise and courteous responses tailored to individual cases are always required. In this study, we developed and operated an inquiry response support system using the RAG architecture of LLMs with the aim of improving the inquiry response operations of casualty insurance companies. In addition, we conducted evaluation experiments on the optimal combinations of conditions related to response performance, such as the chunk division units of the target manuals for searching and the number of tokens input into the LLM.

    Download PDF (938K)
  • Kohei ABE, Soichiro YOKOYAMA, Tomohisa YAMASHITA, Hidenori KAWAMURA
    Session ID: 2A6-GS-10-05
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    Detailed information for comparisons between products is necessary in consumers’ product purchasing process, especially during the information search and choice evaluation phases. However, conventional product descriptions, which are the main source of information, tend to focus only on the product in question, and thus do not adequately express the differences between products. To solve this problem, garments are treated as target products, and a caption-generation method that emphasizes the differences between pairs of garment images using a deep-learning model for image caption-generation is proposed and its effectiveness verified. The proposed method selects and outputs captions that express differences in features from a set of captions generated for input-garment image pairs. Subject experiments confirmed that the proposed method accurately represented the feature differences between garments and provided useful information for consumers to compare garments. In particular, the proposed method is highly effective for garment pairs with similar features.

    Download PDF (897K)
  • Noriko OGASAWARA, Ayuno FUCHI, Ayako YAMAGIWA, Masayuki GOTO
    Session ID: 2B1-GS-2-01
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    In recent years, many methods have been studied to analyze large amounts of review data posted by users in order to improve products and services. Most of these studies attempt to extract some kind of information from user reviews using various machine learning techniques, and most do not provide an analysis method that clearly prioritizes quality improvement by considering the characteristics of each factor and the customer's viewpoint. In contrast, studies focusing on quality improvement have proposed review analyses that introduce the concept of the Kano model, which classifies quality factors into "attractive quality" and other categories based on changes in customer satisfaction depending on their presence or absence. However, these studies either cannot analyze at the word level or focus on only some words. In this study, we propose a method to analyze all words in a review sentence and classify quality factors. Specifically, by combining LDA trained on review sentences with logistic regression, we quantify the influence of each word that appears in the review sentences on the evaluation value. Through analysis of real data, we show that the proposed method can be used for detailed word-level analysis and heuristic quality factor extraction.

    Download PDF (710K)
  • Kengo MIYAJIMA, Yuto NUNOME, Yuta SAKAI, Masayuki GOTO
    Session ID: 2B1-GS-2-02
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    Multi-label classification in document data is the task of correctly assigning multiple class labels to each document. However, there is often a semantic hierarchical structure among the assigned labels, and considering the hierarchical structure can improve the accuracy of label prediction. The Multi-label Box Model (MBM) has been proposed as a multi-label classification model that takes into account the semantic hierarchical structure among labels, and its effectiveness has been demonstrated when class labels of all layers are assigned to training data. However, real-world document data posted on user-contributed websites often lack class labels for all layers of the hierarchy. If such data is used to train MBM, the accuracy of label prediction is reduced. In this study, we propose a framework for learning MBM after complementing labels of missing hierarchies by introducing Bidirectional Encoder Representations from Transformers (BERT). The effectiveness of the proposed method is also demonstrated through evaluation experiments, which compare the accuracy of the conventional method and the proposed method when applied to data with missing labels of some hierarchies.

    Download PDF (475K)
  • Daiki FUJIWARA, Takuya MORIKAWA, Ayako YAMAGIWA, Masayuki GOTO
    Session ID: 2B1-GS-2-03
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    In recent years, e-commerce sites have been used to conduct customer analysis, with the aim of building good relationships with customers and improving long-term sales. Evaluating the diversity of customers' purchasing behavior is particularly important as a marketing angle. The scalar values assigned to each customer in studies analyzing purchasing behavior diversity do not consider the varying impact of individual purchase items on the indices. Therefore, customers who should be treated with different business measures may be treated as the same. This study proposes a method for customer analysis that considers the impact of each purchase item on the index. The method calculates features that represent the diversity of each customer's purchase behavior by utilizing the distribution of weights assigned to each purchase in the Knowledge Graph Attention Network, a type of recommendation model. The effectiveness of the proposed method is demonstrated by applying it to real data.

    Download PDF (649K)
  • Ayako YAMAGIWA, Masayuki GOTO
    Session ID: 2B1-GS-2-04
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    In creating product maps, which are effective as a method of product analysis, the authors have shown that using a pairwise comparison DNN to estimate human evaluations significantly improves the efficiency of map creation relative to the number of data acquisitions compared to conventional creation methods. Specifically, assuming that the ratio of the evaluation values of two product image data to be compared can be obtained in a one-to-one comparison, a deep learning model can be trained using a realistic number of comparison results, and the estimated values can be used to estimate the evaluation values for an arbitrary evaluation axis for all product images. Here, subjective axes such as "cute" and "gorgeous" can be considered appropriate for the product map to be analyzed. When such subjective axes are used, differences due to the subjectivity of the evaluators are expected to occur in the product image evaluation values. Therefore, in this study, we analyze how the subjectivity of the evaluators affects the estimation results of the DNN model by using pairwise comparisons from multiple evaluators. Furthermore, we show that it is possible to analyze the similarity of evaluators' preferences by using the parameters of the deep learning model for pairwise comparison.

    Download PDF (666K)
  • Yusuke NISHIYAMA, Le Binh THANH, Hiroyuki DAN, Yutaro HAYASHI, Akira M ...
    Session ID: 2B1-GS-2-05
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    AUCNET INC. provides a valuation support service for used car dealers across Japan, leveraging machine learning to predict future auction sale prices. While our previous research significantly enhanced overall prediction accuracy, we found that the prediction accuracy remains low for specific car models. To address this issue, this study introduces a novel approach that segments data into three distinct vehicle categories and builds an individual model for each group. Compared with our previous model, the new models improve the prediction accuracy when the partitioned dataset is relatively large.

    Download PDF (754K)
  • Akane TSUBOYA, Tatsuji TAKAHASHI, Yu KONO
    Session ID: 2B5-GS-2-01
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    Deep Reinforcement Learning (DRL) has shown performance that equals or even surpasses humans in games such as Go and video games. However, the learning process for DRL agents requires a large amount of data, which means there is room for improvement in their exploration efficiency. Quick achievement of an aspiration level of performance is also an important goal, especially in industrial applications. Focusing on a human cognitive characteristic, the tendency to prioritize goal achievement, we have incorporated a method called Regional Stochastic Risk-sensitive Satisficing (RS2) into DRL. RS2 can calculate the agent's future exploration distribution by drawing on reliability, a value that denotes the number of times that an action has been selected. However, in complex environments, counting the number of selections accurately is hard. This necessitates approximation of reliability through multiclass classification. In this paper, we applied a method called Random Network Distillation (RND) to reliability. RND utilizes the prediction error of state transitions as a reward bonus for the agent's intrinsic motivation. This method has the problem that the agent's aspiration level of expected return changes. In this study, we overcame this problem by using RND indirectly to estimate reliability and combining it with RS2, and improved performance without changing the expected return.

    Download PDF (813K)
  • Kazuki Takahashi TAKAHASHI, Tomoki FUKAI, Yutaka SAKAI, Takashi Takaha ...
    Session ID: 2B5-GS-2-02
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    In incomplete information games, it is difficult to predict the opponent's strategy, so there has been a lot of research on finding a Nash equilibrium, a strategy that tends to win regardless of the opponent's strategy. For poker, which has a huge observable value space of 10^16, Deep Neural Networks (DNNs) have been used to find Nash equilibrium strategies and have achieved performance superior to that of humans. On the other hand, it is difficult to explain the appropriateness of the selected action in terms of the complex state space. In this study, we propose a Bayesian model that reduces a huge observation space to a concise state space and evaluate its performance using the incomplete information game "Vulture Culture" as a subject. As a result, the proposed method reduces an observation space of about 10^4 to a near-optimal state space. It is also shown that the appropriate state space reduction facilitates the prediction of the opponent's strategy and improves the learning speed of the optimal strategy.

    Download PDF (1314K)
  • Yosuke NISHIMOTO, Takashi MATSUBARA
    Session ID: 2B5-GS-2-03
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    World models mimic observed dynamics to aid learning complex behaviors. However, in situations such as playing games, where different dynamics with distinct characteristics coexist within the same screen, effective learning of world models becomes challenging. This challenge has been identified in tasks like video prediction, and recent efforts have explored solutions using object-centric representations. In this paper, we present transformer-based world models with object-centric representations combining world models with a method for video prediction using object-centric representations. This approach uses object features to model spatiotemporal relationships and predict future states accurately based on actions. The transformer receives multiple latent states from object-centric representations, rewards, and actions, flexibly adapting to all modalities across different time steps. It is expected to distinguish dynamics with distinct characteristics for each object, predicting accurate future states in response to actions. We validated the effectiveness of our method using Atari 100k benchmark's Boxing, demonstrating its utility.

    Download PDF (579K)
  • Kota MINOSHIMA, Sachiyo ARAI
    Session ID: 2B5-GS-2-04
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    In order to acquire an appropriate control law through reinforcement learning, it is necessary to design an appropriate reward function. However, this reward design becomes complicated for large-scale problems, increasing the design burden and inducing unintended behavior. Therefore, in real-world applications of reinforcement learning, when unintended behavior is identified, a method to improve the reward design based on this behavior may be required. In order to identify the cause of unintended behavior, it is necessary to know what kind of reward the agent is getting under the current reward function. One approach to this is inverse reinforcement learning, which estimates the expert's reward given the expert's trajectory. By applying inverse reinforcement learning to the trajectory of a reinforcement learning agent, it is possible to know what kind of reward the agent is getting according to the current reward function. In this study, we propose a method to improve the performance of reinforcement learning by estimating the reward of the reinforcement learning agent by inverse reinforcement learning and improving the reward design based on the estimated reward.

    Download PDF (299K)
  • Takumi SAIKI, Sachiyo ARAI
    Session ID: 2B5-GS-2-05
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    Traffic signals are the primary method for facilitating vehicular traffic, but advances in automotive technology have increased the need for new methods. Deep reinforcement learning has recently attracted attention as an optimization method for traffic signals, and various methods have been proposed. Most conventional methods focus only on traffic signals and assume that traffic flow is stationary regardless of the strategy. However, in an environment where congestion changes due to policy changes, vehicles may change course to less congested roads, resulting in changes in traffic flow beyond those predicted during learning. Technological developments related to connected cars will enable optimal routing according to various traffic conditions, and changes in signal programs are expected to affect multiple traffic flows. Therefore, it is necessary to develop signal control methods that consider changes in traffic flow due to signal optimization and signal programs. However, considering route optimization for each vehicle would increase the number of agents and complicate the computation. In this study, we first formulate vehicle agents by a mean-field approximation. Then, we propose a method for stepwise optimization of traffic flow in an environment where vehicles and traffic signals learn in both directions.

    Download PDF (311K)
  • Shuichi MIYAZAWA, Daichi MOCHIHASHI
    Session ID: 2B6-GS-2-01
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    Ordinary differential equations (ODEs) help the interpretation of phenomena in various scientific fields. ODEs are often applied to numerical data, but we proposed a modeling method using ODEs for sequences of events occurring in continuous time (temporal point processes) [Miyazawa 23]. Here, event series with labels indicating the type of components of the nonlinear dynamical system described by the ODEs are required, but in real settings, there are many event series that do not have such labels explicitly. Real event data is often accompanied by covariates, e.g., abstracts of inventions in patent applications. Such additional information, called marks, is useful for identifying latent event types. Therefore, we propose a method for modeling the generating process of event series by ODEs, using marks to estimate latent event types for event series without explicit labels indicating the components of the ODEs. The proposed method can be considered as an extension of latent Poisson process allocation [Lloyd 16], where each event is assigned to one of a set of latent Poisson processes, using ODEs. We demonstrated that the proposed method can estimate and recover latent event types and parameters of ODE using simulated data, and showed the applicability of the proposed method to a real problem using the USPTO patent dataset.

    Download PDF (674K)
  • Yuta AISHIMA, Kazushi IKEDA
    Session ID: 2B6-GS-2-02
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    Denoising autoencoders (DAEs) learn the score of the data-generating distribution, i.e., ∇log p(x). However, theoretical studies have discussed only the case where the corruption process is Gaussian. To generalize the previous results, in this study we extend the class of corruption distributions to an exponential family. Using Tweedie's formula, we show that the generalized DAE learns the score of the marginal distribution q(x̃) = ∫ q(x̃|x)p(x)dx, while the Gaussian DAE learns the score of the data-generating distribution p(x). We then focus on the encoder of denoising autoencoders and investigate what this encoder learns. A numerical experiment shows that the encoder extracts features related to the shape of the data-generating distribution.
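
    For orientation, the Gaussian special case of Tweedie's formula referred to above can be written out as follows. This is the standard textbook form, not a reproduction of the paper's exponential-family result; the noise scale σ and the notation r*(x̃) for the optimal denoiser are assumptions introduced here.

```latex
% Gaussian corruption: \tilde{x} = x + \varepsilon, \quad \varepsilon \sim \mathcal{N}(0, \sigma^2 I)
% Marginal of the corrupted data: q(\tilde{x}) = \int q(\tilde{x} \mid x)\, p(x)\, dx
\begin{align}
  \mathbb{E}[x \mid \tilde{x}]
    &= \tilde{x} + \sigma^2 \nabla_{\tilde{x}} \log q(\tilde{x}), \\
  r^*(\tilde{x}) = \mathbb{E}[x \mid \tilde{x}]
    \;&\Longrightarrow\;
  \frac{r^*(\tilde{x}) - \tilde{x}}{\sigma^2} = \nabla_{\tilde{x}} \log q(\tilde{x}).
\end{align}
```

    Here r*(x̃) is the minimizer of the squared reconstruction error, so the residual of an optimally trained denoiser, rescaled by σ², gives the score of the corrupted marginal q(x̃), which approaches p(x) as σ becomes small.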

    Download PDF (306K)
  • Kazuma ONISHI, Katsuhiko HAYASHI
    Session ID: 2B6-GS-2-03
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    Extreme multi-label learning (XML) is the task of assigning multiple labels from an extremely large label set to each data instance. Many of the current high-performance models for XML involve a large number of hyperparameters, which causes issues with reproducibility. Additionally, the models themselves are adapted specifically to XML, which complicates their reimplementation. To remedy these problems, we propose a simple method for XML based on ridge regression. The proposed method not only has a closed-form solution but also has only a single hyperparameter. Since there is no precedent for applying ridge regression to XML, this paper verifies the performance of the method on various XML benchmark datasets. Experimental results revealed that it can achieve levels of performance comparable to, or even exceeding, those of models with numerous hyperparameters.
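
    To make the closed-form solution concrete, here is a minimal sketch of multi-output ridge regression used as a label scorer. It only illustrates the generic ridge formula W = (XᵀX + λI)⁻¹XᵀY on small dense toy data; the function names, the top-k prediction rule, and the data are assumptions for this example and are not taken from the paper.

```python
import numpy as np

def fit_ridge(X, Y, lam):
    """Closed-form ridge regression with a single hyperparameter lam.

    X: (n_samples, n_features) feature matrix
    Y: (n_samples, n_labels) binary label-indicator matrix
    Returns W of shape (n_features, n_labels).
    """
    n_features = X.shape[1]
    gram = X.T @ X + lam * np.eye(n_features)
    # Solve (X^T X + lam I) W = X^T Y instead of forming an explicit inverse
    return np.linalg.solve(gram, X.T @ Y)

def top_k_labels(X, W, k):
    """Return the indices of the k highest-scoring labels for each row of X."""
    scores = X @ W
    return np.argsort(-scores, axis=1)[:, :k]

# Toy data (hypothetical): 50 instances, 8 features, 5 labels
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 8))
W_true = rng.normal(size=(8, 5))
Y = (X @ W_true > 1.0).astype(float)

lam = 1.0
W = fit_ridge(X, Y, lam)
pred = top_k_labels(X, W, k=2)
```

    Real XML benchmarks have sparse, very high-dimensional features and labels, so a practical implementation would need sparse linear algebra rather than the dense solve used in this sketch.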

    Download PDF (383K)
  • Yudai YAMAMOTO, Satoshi HARA
    Session ID: 2B6-GS-2-04
    Published: 2024
    Released on J-STAGE: June 11, 2024
    CONFERENCE PROCEEDINGS FREE ACCESS

    Ensuring fairness is essential when putting machine learning models into practical use. However, recent research has revealed that one can craft a benchmark dataset that serves as fake evidence of fairness for an unfair model. The existing method, Stealthily Biased Sampling, solves a Wasserstein distance minimization problem, which is computationally challenging for large datasets. In this study, we formulate Stealthily Biased Sampling as the minimization of the Sliced Wasserstein distance and demonstrate that it can be computed efficiently.
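
    The computational appeal of the Sliced Wasserstein distance is that each one-dimensional projection only requires a sort. The sketch below is a generic Monte Carlo estimator for two equal-size point sets, not the Stealthily Biased Sampling procedure itself; the function name, the number of projections, and the toy data are assumptions for this example.

```python
import numpy as np

def sliced_wasserstein(a, b, n_projections=200, p=2, seed=0):
    """Monte Carlo estimate of the sliced Wasserstein-p distance.

    a, b: (n_samples, n_dims) arrays with the same shape (equal-size samples).
    """
    if a.shape != b.shape:
        raise ValueError("this sketch assumes equal-size samples")
    rng = np.random.default_rng(seed)
    # Random directions on the unit sphere
    dirs = rng.normal(size=(n_projections, a.shape[1]))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    # In 1-D, optimal transport between equal-size samples matches sorted values
    proj_a = np.sort(a @ dirs.T, axis=0)
    proj_b = np.sort(b @ dirs.T, axis=0)
    # Average |difference|^p over samples and projections, then take the p-th root
    return float(np.mean(np.abs(proj_a - proj_b) ** p) ** (1.0 / p))

# Toy data (hypothetical): a point cloud and two shifted copies
rng = np.random.default_rng(1)
x = rng.normal(size=(100, 3))
d_same = sliced_wasserstein(x, x)
d_near = sliced_wasserstein(x, x + 1.0)
d_far = sliced_wasserstein(x, x + 3.0)
```

    Each projection costs a sort, O(n log n), whereas exact Wasserstein distance in several dimensions requires solving an optimal transport problem over all pairs of points.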

    Download PDF (906K)