-
Rikisuke ICHII, Mitsunori MATSUSHITA, Hirofumi HORI
Session ID: 1M5-GS-10-03
Published: 2023
Released on J-STAGE: July 10, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
Movement analysis is an essential examination in planning physiotherapy. During this examination, the physiotherapist evaluates the patient's condition based on their experience and describes the test results as text. The text describing the examination outcome contains practical knowledge based on experience. Extracting such knowledge and relating the pieces to each other will help ensure objectivity and share best practices, thereby assisting physiotherapists in conducting movement analysis. However, these texts often contain linguistic and semantic ambiguities, making it difficult to extract knowledge uniformly by computer. A previous study defined the smallest unit of knowledge that constitutes physiotherapy, the Problem-Based Physiotherapy Unit (PBPU). The PBPU helps organize movement analysis texts logically; however, extracting PBPUs from the text requires an enormous amount of work. In this study, we attempted to extract PBPUs using a rule-based method to address this issue. As a result, we demonstrated that about half of the PBPUs can be extracted automatically.
-
Takayuki OGASAWARA, Yoshitaka WADA, Masahiko MUKAINO, Eiich SAITOH, Sh ...
Session ID: 1M5-GS-10-04
Published: 2023
Released on J-STAGE: July 10, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
This study aimed to clarify whether the future body function of stroke inpatients can be predicted from activity data collected with a consumer-grade wearable device. Experiments began within one week of admission, and heart rate and acceleration were measured continuously at the chest for 48 hours. We predicted the Functional Independence Measure (FIM) as the rehabilitation outcome, using a random forest as the predictor. Results with 5-fold cross-validation showed that the coefficients of determination between the predicted and actual values of FIM based on activity data were 0.74 (n=1196) in the week in which the measurement was conducted, 0.81 (n=850) in two weeks, and 0.79 (n=394) in nine weeks, which corresponds to the typical period of discharge. All results were statistically significant (p < 0.001). These results suggest that clinical indicators can be predicted from activity data from the early stages of hospitalization to discharge.
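To make the reported metric concrete, the following minimal sketch computes a coefficient of determination between actual and predicted scores. It assumes the common definition 1 - SS_res / SS_tot; the abstract does not state which definition was used. The function name `r_squared` and the sample values are hypothetical and are not data from the study.

```python
def r_squared(actual, predicted):
    """Coefficient of determination, assuming the form 1 - SS_res / SS_tot."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    # Residual sum of squares: squared errors of the predictions.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Total sum of squares: squared deviations of the actual values from their mean.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("actual values have zero variance")
    return 1.0 - ss_res / ss_tot


# Hypothetical FIM scores, for illustration only.
actual_fim = [60.0, 85.0, 100.0, 110.0]
predicted_fim = [65.0, 80.0, 95.0, 115.0]
score = r_squared(actual_fim, predicted_fim)
```

Under this definition a value of 1.0 means the predictions match the actual values exactly, and a value near 0 means the predictor does no better than always predicting the mean.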
-
Taro TOKUI, Yuya OSAKI, Yukie NAGANO, Daiki TAKAMURA, Yuichi NAGAOKA, ...
Session ID: 1M5-GS-10-05
Published: 2023
Released on J-STAGE: July 10, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
Eating is not only important for maintaining health, but also a source of pleasure and social interaction in daily life. However, visually impaired people may experience anxiety and fear because they cannot identify the location and shape of food, resulting in decreased enjoyment of meals or even avoidance of dining occasions. While aids such as white canes and braille, as well as smartphones, have been developed to assist visually impaired people with mobility, reading, and writing, the aids available for mealtime remain limited to mere adjustments in dish placement, color, and shape. Furthermore, many visually impaired people prefer to eat independently without assistance from others, highlighting the need for systems that enable independent dining experiences. This study aims to identify the necessary functions for a system that can assist visually impaired people in eating independently, using a prototype system to evaluate these functions. The prototype system utilizes a depth camera to measure the quantity and positioning of food, and was evaluated by sighted participants who were blindfolded. The results of this study aim to contribute to the development of a system that can support visually impaired people in eating independently.
-
Kenji HORIKOSHI, Yuji AYATSUKA, Tsutomu YASUKAWA
Session ID: 1N3-GS-10-01
Published: 2023
Released on J-STAGE: July 10, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
Machine learning has been shown to accurately estimate true age from fundus images. However, it is currently unclear which features the machine learning model uses to make these determinations and which parts of the image are most clinically relevant for age estimation. While methods such as Grad-CAM and DiDA can be used to interpret which regions a machine learning model relies on for its inferences, most studies have focused on object detection and classification, with few investigating regression problems such as age estimation. In this paper, we applied Grad-CAM and DiDA to age estimation from fundus images and investigated which regions the machine learning model relied on for its age estimates. We found a common response of Grad-CAM and DiDA in approximately 80% of the images, with areas that were masked by DiDA showing lower estimated ages. This suggests that the machine learning model treats these areas as important factors in age estimation, and that they contribute to higher estimates of true age.
-
Takeo SHIBANO
Session ID: 1N3-GS-10-02
Published: 2023
Released on J-STAGE: July 10, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
Since automotive plastic parts have a wide variety of evaluation items and their evaluation requires a large amount of time, alternative materials cannot be adopted quickly enough in emergencies. Among the various test items, fatigue testing is particularly time-consuming and frequently required. This paper proposes a method for predicting the fatigue limits of polymer composites using machine learning. In this study, we employed ensemble methods based on decision trees, such as random forest, XGBoost, and LightGBM regression. Based on domain knowledge of polymer science, we suggest that XGBoost is the most appropriate method for this dataset. As a result, we established a versatile model for predicting the fatigue limits of polymer composites, whose coefficient of determination is 0.803 even for polymer composites from material manufacturers that were not used in constructing the prediction model.
-
Satoki FUJITA, Yuki YOSHIDA, Yoshitake KITANISHI
Session ID: 1N3-GS-10-03
Published: 2023
Released on J-STAGE: July 10, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
In terms of health management for employees, it is necessary to implement appropriate preventive measures against disorders that significantly affect presenteeism such as depression. However, although these disorders are known to be closely related to daily lifestyle, the causal structure itself is still unclear. In order to solve this problem, health checkup data can be used to clarify the dependence of each lifestyle factor in a data-driven manner, which may lead to the examination of effective measures. As a preliminary study for the above purpose, we focus on causal search methods (Bayesian Network, LiNGAM). Through the results of application to the health checkup database, we compared and examined the characteristics and usefulness of these methods from a practical point of view. It was confirmed that these methods can help achieve the above purposes if we recognize the shortcomings of the methods and do not misuse them.
-
Issey SUKEDA
Session ID: 1N3-GS-10-04
Published: 2023
Released on J-STAGE: July 10, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
Automatic diagnosis has been extensively studied in the medical field. In particular, the classification problem is one of the typical approaches. The application of supervised machine learning to heart disease detection using electrocardiogram (ECG) data has also been studied extensively in recent years, with neural-network-based models recording high accuracy. However, annotation cost and label imbalance are often challenges in these studies. Especially in the medical domain, labeling requires expertise, and positive examples are often extremely rare compared to negative examples, making it difficult to prepare high-quality data on a large scale. Data augmentation methods can be effective in addressing these issues. Data augmentation is the process of creating artificial data by performing certain operations, such as perturbation, on the original data/label pairs. In this study, data augmentation is used to improve the performance in detecting low left ventricular ejection fraction (LVEF). Although data augmentation has been popular in the field of image processing, it is still at a developing stage for time series data. In this paper, we report on the effectiveness of a method that combines multiple samples using optimal transport in the frequency domain of the ECG to obtain augmented samples.
-
Idehara NAOYA, Kimura GENKI, Nakajima TSUBASA, Sakai YASUHIRO, Sasajim ...
Session ID: 1N3-GS-10-05
Published: 2023
Released on J-STAGE: July 10, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
In 2022, 28.9% of the Japanese population was aged 65 or older. Meanwhile, smartphone ownership among the elderly has been increasing in recent years, with 74.2% of those aged 65 and over owning smartphones. Against this backdrop, digitalization of various events using smartphones is progressing, but there are few examples of digitalization in arts events of high interest to the elderly. In this study, with the cooperation of the Hyogo Performing Arts Center, we will digitize events in the arts and examine what kind of behavioral changes can be observed among participants. If the effectiveness of digitization is confirmed, we intend to increase the number of members by encouraging participants to register as members, and to change their behavior in order to expand arts and culture among the citizens of Hyogo Prefecture. This paper introduces a case study of digitization of an art event conducted in November 2022. In cooperation with the Hyogo Performing Arts Center, we conducted a digital stamp rally that could be completed only with a smartphone on Premium Arts Day, an event to introduce arts and culture to prefectural residents, and collected data through a questionnaire. The results of the analysis showed that elderly people participated in the digital stamp rally, confirming the possibility of attracting interest in digital events.
-
Takashi HATTORI, Hiroshi SAWADA, Sanae FUJITA, Tessei KOBAYASHI, Koji ...
Session ID: 1N4-GS-10-01
Published: 2023
Released on J-STAGE: July 10, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
We propose a novel problem recommender system that can suggest moderately challenging problems to learners. By training a Variational AutoEncoder to reconstruct problem-answer data with a small number of latent variables, we can predict the likelihood of a learner's ability to correctly solve unanswered problems. Experimental results showed that the system's predictions were accurate for learners who had solved a sufficient number of problems, even for a wide variety of problems, and that the system was able to recommend problems of moderate difficulty for individual learners.
-
Nobuyuki HIROSE, Shun SHIRAMATSU, Shun OKUHARA
Session ID: 1N4-GS-10-02
Published: 2023
Released on J-STAGE: July 10, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
We propose a method for automatically generating advice to support the writing of learning plans and reflections. When the quality of the training data is matched, the automatically generated advice tends to be uniform, but at the same time we want responses that are in line with the descriptions of individual learners. Therefore, GPT-3 was retrained using the rubric and the teacher's composition training data. The automatically generated advice had a ROUGE-1 of 0.405 and a cosine similarity of 0.906 with the learner's description, and a cosine similarity of 0.872 with the rubric-aligned canned text. These results approximated those of the teacher's compositions, suggesting that the advice was generated with a balance similar to that of the teacher's compositions. GPT-3 embeddings were used to compute the cosine similarity. However, some failures were observed. Qualitative evaluation remains as future work.
-
A Case Study of PBL Exercise for First-Year Students in Academic Year 2022
Munehiko SASAJIMA, Ken ISHIBASHI, Takehiro YAMAMOTO, Naoki KATOH
Session ID: 1N4-GS-10-03
Published: 2023
Released on J-STAGE: July 10, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
This is the fourth year that the Faculty of Social Information Science at the University of Hyogo has conducted Problem-Based Learning (PBL) for first- and second-year undergraduate students, using real data from partner companies, with the goal of fostering data scientists with practical skills. The data scientists that our department aims to nurture are not only those who have the ability to analyze data using IT skills and knowledge of statistics, but also those who have the ability to formulate real-world problems and collect necessary data, and those who have the ability to make proposals for improving society by using the results of analysis. The goal of this program is to develop students' interest in improving management and to teach them the importance of thinking not only about data, but also about the actual situation. Since the establishment of the department in 2019, we have conducted PBL for first-year students four times, and this year, as a new trial, we limited the number of stores that would be the target of practical training to one. The exercise was evaluated by taking a student questionnaire afterward. This paper outlines the PBL exercises conducted in FY2022 and describes the advantages and challenges of PBL using real data, which were obtained through the PBL exercises conducted so far.
-
Ryo HAGA, Hidekatsu ITO, Masaki ISHII, Kohji DOHSAKA
Session ID: 1N4-GS-10-04
Published: 2023
Released on J-STAGE: July 10, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
Automated short answer scoring is the task of estimating the score of a short text answer written in response to a given prompt, based on pre-established rubrics. This research aims to automatically score written Japanese answers, utilizing a limited amount of training data while maintaining the scoring accuracy of previous research. We consider a scenario in which a mere 50 human-scored answers and 200 unscored answers are available as a limited amount of training data. Our proposed method fine-tunes the GPT-2 model with 200 unscored answers, generates answers using the fine-tuned model, and scores them using an analytic score prediction model trained on human-scored answers. A scoring model is then constructed using the scored generated answers. Our experimental results demonstrated that the scoring model could not attain the same level of scoring accuracy as previous research using 50 human-scored answers and 200 unscored answers. Nevertheless, we discovered that the model's performance improved as the number of scored generated answers increased. Unlike previous research that trained the scoring model using holistic scores and annotations in addition to analytic scores, we used only analytic scores. This indicates that the scoring accuracy of previous research can be achieved using a small amount of training data, by employing annotations and holistic scores.
-
Genta ITO, Kyohei NISHIMURA, Yuki NISHIMURA, Yoshitake KITANISHI
Session ID: 1N4-GS-10-05
Published: 2023
Released on J-STAGE: July 10, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
Speech-language-hearing therapists sometimes provide therapy for children with speech and hearing difficulties to improve their communication skills. However, there can be problems such as the need to attend therapy sessions at an institution. The aim of this study is to explore objective metrics that reflect the degree of concentration and interest during online communication, using videos. A total of 57 videos of online communication obtained from 19 children were input to a facial expression recognition AI model, and the proportion of emotions in each video was output. Additionally, we conducted a survey to assess the children's concentration and interest levels during communication. We compared the results of the AI model analysis with the results of the survey. The results showed weak to moderate correlations of the children's degree of concentration and interest with both the proportion of time recognized as "Happiness" by the AI model and the proportion of time when faces were not detected. It is suggested that these outputs from the AI model can serve as objective metrics for measuring the degree of concentration and interest during online communication for children.
-
Kazuki YAMAJI, Masane FUCHI, Tomohiro TAKAGI, So TAKAHASHI, Yukihiko H ...
Session ID: 1N5-GS-10-01
Published: 2023
Released on J-STAGE: July 10, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
In determining facade design, designers spend a great deal of effort considering multiple ideas. Meanwhile, image generation technologies have made significant progress in recent years, and SDEdit in particular can generate high-quality and creative images from illustrations. However, this method struggles to associate colors with text within an illustration, making it difficult to generate and edit facade design images composed of various materials as intended. Therefore, we propose a facade design generation and editing method that links color and text. Specifically, the attention mechanism used in the diffusion model changes the importance of words depending on the colors in the illustration and switches the reference text depending on the editing point. Verification indicated that the proposed method reflects the pre-specified color and text information and can generate and edit images more in line with the user’s intentions.
-
Masane FUCHI, Kazuki YAMAJI, Tasuku OGASAWARA, So TAKAHASHI, Yukihiko ...
Session ID: 1N5-GS-10-02
Published: 2023
Released on J-STAGE: July 10, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
In facade design, designers first create a base plan for their own design and then refine it by incorporating the client’s requests and other factors. Current image editing methods are mostly GAN-based, in which latent representations are acquired and edited using an inversion model. Recently, however, research using Diffusion Models has come to the fore. In particular, the Latent Diffusion Model (and Stable Diffusion, which is based on it), which operates in latent space, is easy to handle and accurate. In this paper, we propose a system that uses Stable Diffusion together with Attention-based Layout conditioning and a CLIP output conversion model to achieve detailed and targeted image editing with only simple prompts. Through evaluation by actual designers, we confirmed the usefulness of the proposed system and identified future work.
-
Noriko OTANI, Sou HIRAWATA, Daisuke OKABE
Session ID: 1N5-GS-10-03
Published: 2023
Released on J-STAGE: July 10, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
The taste of coffee is greatly influenced by the type and quality of the beans, the grind, the water temperature, and the extraction method. The drip method, which is the most common, can be challenging for novice coffee brewers, who must manage the steaming time and the pouring of hot water. Music has various effects such as evoking emotions, creating atmosphere, and stimulating the brain. There are many pieces of music designed for specific effects, but none designed for coffee brewing. The impressions and feelings that arise from listening to music vary from person to person. Music that is distributed to the general public may not be effective for each individual. In this study, we aim to enable even novice coffee brewers to enjoy brewing delicious coffee, and propose a method for generating music to be played during coffee brewing based on symbiotic evolution. Experimental results show that the music generated by the proposed method may be able to induce the ideal coffee brewing process.
-
Takahiro HIGASA, Takashi KAWAMURA, Satoshi KURIHARA
Session ID: 1N5-GS-10-04
Published: 2023
Released on J-STAGE: July 10, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
With the rapid development of infrastructure that enables the generation and distribution of various content, consumer demand for new content is increasing, and the burden of creative work such as supplying new stories and scenarios is growing rapidly. Therefore, AI support for human creative work is expected. Machine learning-based story generation, which is currently widely used, can generate fluent sentences, but it tends to generate commonplace sentences, risks repetition as the text becomes longer, and makes it difficult to explicitly model and control the story structure. Therefore, in this research, we focused on the plot, which is a story with a consistent structure while having creativity, and tried to generate it automatically. Furthermore, an evaluation of the plots using crowdsourcing found that consistency was maintained more strictly depending on the structure, and that the plots were rated higher than plots generated by a large language model in terms of the ups and downs of the story and interestingness. In addition, we studied which kinds of plots are perceived as creative by which kinds of people.
-
Itsuki EBI, Shun SHIRAMATSU
Session ID: 1N5-GS-10-05
Published: 2023
Released on J-STAGE: July 10, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
We aim to develop a GPT-3-based method for generating songs with the atmosphere and concept intended by the user. Although it has been possible to generate song lyrics by giving a fictitious artist name and song title to GPT-3, a method to control the mood and concept of the generated lyrics according to the user's intention has not been established. In this study, we fine-tune the pre-trained model of GPT-3 to generate lyrics and chord progression with the mood and concept of the song as intended by the user by inputting keywords and explanatory text that represent the mood and concept. Furthermore, as a concrete case study, we verify whether the proposed method can generate songs that reflect the characteristics of local regions in order to apply it to regional development.
-
Yuki SHIMANO, Yuya KUWANO, Masaki TAKAHASHI, Masaru MIYAZAKI, Masanori ...
Session ID: 1O3-GS-7-01
Published: 2023
Released on J-STAGE: July 10, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
Audio descriptions enable a visually impaired audience to enjoy broadcast programs by providing supplementary information, such as a person’s actions and facial expressions, that is difficult for such audiences to understand from the main audio content. Although such descriptions would be ideal for live sports broadcasts, producing audio descriptions for such events requires high production costs and expert commentary skills. We thus developed a system that creates audio descriptions of live baseball broadcasts and distributes them to users' smartphones in real time. These audio descriptions are created automatically from the superimposed captions of baseball broadcasts by using image recognition. The experimental results indicate that the proposed method recognizes the information in superimposed captions and robustly produces audio descriptions in real time.
-
Siyun HUANG, Ken TSUTSUGUCHI
Session ID: 1O3-GS-7-02
Published: 2023
Released on J-STAGE: July 10, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
The purpose of this study is to develop an educational application for children to identify recyclable resources from waste images taken by mobile terminals. CNN-based transfer learning was used to classify images taken of actual household waste. We developed an original dataset, "STE", and compared it with the TrashNet dataset. The classification accuracy varied depending on the type of waste; "paper" and "plastic" waste were found to have relatively high accuracy in our STE dataset, particularly after background removal.
-
Yuta TAKAHASHI, Junichiro FUJII, Masazumi AMAKATA
Session ID: 1O3-GS-7-03
Published: 2023
Released on J-STAGE: July 10, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
Data in the civil engineering field are scarce yet highly varied. Drone river patrols fly over vast river areas and detect illegal dumping, including general garbage, using AI. Because the patrol drones do not fly constantly, illegal dumping is rarely captured in aerial images, and temporary illegal occupation is even more difficult to detect. Previous studies have confirmed that adding images taken on the ground, with different angles of view, to the training data improves learning, but a large number of images is still required for training even when the images are taken on the ground. To improve the training of the detection model, this study verified whether images for data augmentation can be generated by image generation AI such as Stable Diffusion and used for training.
-
Shunsuke NAKAMATSU, Chiaki SAKAMA
Session ID: 1O3-GS-7-04
Published: 2023
Released on J-STAGE: July 10, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
The Escherization problem seeks a tiling for an input plane figure S with a new figure T such that T is as close as possible to S. In this study, we automatically generate Escher-like "Metamorphose" patterns. More precisely, after segmenting an object in an input image, we produce tile images using an affine transformation and introduce conditions for connecting tile images. In experiments, we succeeded in partly simulating the altered patterns in Metamorphose and also produced new patterns using color images.
-
Koki MORI, Kazuya MERA, Yoshiaki KUROSAWA, Toshiyuki TAKEZAWA
Session ID: 1O3-GS-7-05
Published: 2023
Released on J-STAGE: July 10, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
In this paper, we propose a method to estimate a speaker's depression score using acoustic features of their speech. A total of 150 speech utterances, in which 15 subjects each read 10 types of sentences, were recorded as training data, and the depression scores of the subjects were calculated using the Beck Depression Inventory (BDI) just after the recording. Acoustic features are calculated using openSMILE or Surfboard, and Support Vector Regression or LightGBM is used for the machine learning procedure. The experimental results showed that the estimated depression scores achieved a correlation coefficient of 0.932 with the correct answers.
-
Kota SUEYOSHI, Takashi MATSUBARA
Session ID: 1O4-GS-7-01
Published: 2023
Released on J-STAGE: July 10, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
An energy-based model (EBM) is a deep generative model that learns a data distribution by computing energy functions using neural networks. For concept composition using EBMs, it is known that multiple concepts can be composed by summing energy functions under the assumption of concept independence. However, assuming independence of concepts may lead to a lack of diversity in the generated data. Therefore, we propose a method to embed concept information into the latent space by order embedding. By performing a max operation among the coordinates of the embedded concepts, we can expect to generate images with the specified combination of concepts. Experimental results show that the proposed method can generate images with multiple concepts without assuming the independence of concepts.
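The max operation described above can be illustrated with a small sketch. It assumes that each concept is a non-negative vector and that a composite concept is formed by taking the coordinate-wise maximum, so that the composite dominates every input concept in the coordinate-wise order. The function names, the concept names, and the vectors are hypothetical; the abstract does not specify the embedding dimension or how the embeddings are learned.

```python
def compose_concepts(*embeddings):
    """Coordinate-wise maximum across concept embeddings of equal dimension."""
    if not embeddings:
        raise ValueError("at least one embedding is required")
    dim = len(embeddings[0])
    if any(len(e) != dim for e in embeddings):
        raise ValueError("all embeddings must share the same dimension")
    return [max(coords) for coords in zip(*embeddings)]


def dominates(a, b):
    """True if a is coordinate-wise greater than or equal to b."""
    return all(x >= y for x, y in zip(a, b))


# Hypothetical 4-dimensional concept embeddings, for illustration only.
smiling = [0.9, 0.0, 0.2, 0.0]
glasses = [0.0, 0.7, 0.1, 0.0]
combined = compose_concepts(smiling, glasses)
```

In this sketch `combined` dominates both `smiling` and `glasses`, which is the property that lets a single latent point stand for the conjunction of the two concepts without summing independent energy functions.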
-
Takehiro AOSHIMA, Takashi MATSUBARA
Session ID: 1O4-GS-7-02
Published: 2023
Released on J-STAGE: July 10, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
Deep generative models, such as generative adversarial networks (GANs), can generate high-quality images. However, these models often do not have an inherent way to edit generated images semantically. In order to edit generated images semantically, recent studies have proposed methods to determine linear or nonlinear semantic paths on the latent space and edit images by manipulating latent codes along these paths. However, the quality of the image editing along linear paths is inferior, and the image editing along nonlinear paths is non-commutative. In this study, we propose to discover semantic curvilinear coordinates on the latent space. We experimentally show that the quality of our method's image editing is better than comparison methods, and our method provides commutative image editing.
-
Sorachi KURITA, Satoshi OYAMA, Itsuki NODA
Session ID: 1O4-GS-7-03
Published: 2023
Released on J-STAGE: July 10, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
We propose a method to generate scene graphs using optimal transport loss as a measure to compare two probability distributions. In scene graph generation, learning with cross-entropy loss leads to biased predictions because the distribution of predicate labels in the dataset has severe imbalance. We apply learning with the optimal transport loss, which easily reflects similarity between labels as transportation cost, to the predicate classification in scene graph generation. In the proposed method, the transportation cost of the optimal transport is defined using the similarity of words obtained from the pre-trained model. The experimental evaluation of the effectiveness shows that the method achieves better performance than existing models in terms of mean Recall@50 and mean Recall@100. Furthermore, it can improve recall of predicate labels that are scarce in the dataset.
-
Shun KITAGAWA, Taro HATAKEYAMA, Komei HIRUTA, Atsushi HASHIMOTO, Satos ...
Session ID: 1O4-GS-7-04
Published: 2023
Released on J-STAGE: July 10, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
In this paper, we edit real images to out-of-distribution (OOD) images by combining NaviGAN and GAN Inversion. NaviGAN is a GAN technique for modifying generated results, which generally lie within the distribution of the training data, into OOD ones. When NaviGAN is applied to real images, GAN Inversion is required to embed real images into the GAN latent space. We aim to use this combination to exaggerate specific parts of real face images while maintaining their identity. Among various GAN Inversion methods, we found that fine-tuning-type methods are the best suited to combination with NaviGAN. Thus, we propose combining a fine-tuning-type method with NaviGAN to achieve our goal. In experiments, we confirmed that the proposed combination achieves the best exaggeration of specific parts while maintaining identities compared with other methods. Additionally, we show that our method can be applied not only to real pictures but also to Manga characters.
-
Yasuhito FUJISAWA, Haruka YAMASHITA
Session ID: 1O4-GS-7-05
Published: 2023
Released on J-STAGE: July 10, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
In question-and-answer sessions on Q&A services, it is sometimes difficult to read a long or poorly-written question and answer. In such cases, if the system can recommend appropriate images for the questions, it can assist reading comprehension based on the information in the images. In this study, we propose a machine learning model for recommending appropriate images for questions in a Q&A service using Sentence-BERT (SBERT). Specifically, the model achieves this by converting question sentences and image captions into a vector using SBERT, measuring the cosine similarity between them, and recommending the image with the caption that has the maximum value. From a practical point of view, it is also necessary to minimize inappropriate recommendation results when SBERT malfunctions. Therefore, in order to ensure that the recommended images are at least correctly recommended from the categorical point of view, a categorization model based on BERT's transfer learning is applied as an auxiliary. This is achieved by classifying the recommended images into categories that exist in each Q&A service and performing SBERT and cosine similarity measures within each category.
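The recommendation step described above (embed the question and the image captions, restrict the candidates to the predicted category, then pick the caption with the highest cosine similarity) can be sketched as follows. This is a minimal illustration that uses hand-written toy vectors in place of real SBERT embeddings and takes the category as given rather than predicting it with a BERT classifier. The function names, the dictionary keys, and all data are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend_image(question_vec, question_category, images):
    """Return the id of the most similar image in the same category, or None."""
    candidates = [img for img in images if img["category"] == question_category]
    if not candidates:
        return None
    best = max(
        candidates,
        key=lambda img: cosine_similarity(question_vec, img["caption_vec"]),
    )
    return best["id"]


# Toy stand-ins for SBERT caption embeddings, for illustration only.
images = [
    {"id": "img_cat", "category": "pets", "caption_vec": [0.9, 0.1, 0.0]},
    {"id": "img_dog", "category": "pets", "caption_vec": [0.2, 0.9, 0.1]},
    {"id": "img_car", "category": "vehicles", "caption_vec": [1.0, 0.0, 0.0]},
]
question_vec = [1.0, 0.0, 0.0]
choice = recommend_image(question_vec, "pets", images)
```

Note that `img_car` has the highest raw similarity to the question vector but is never considered, because filtering by category comes first. That ordering is what keeps a recommendation categorically plausible even when the similarity ranking alone would go wrong.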
-
Jumpei NAKAO, Masaru ISONUMA, Junichiro MORI, Ichiro SAKATA
Session ID: 1O5-GS-7-01
Published: 2023
Released on J-STAGE: July 10, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
Text-to-Image models require datasets consisting of a huge number of image-caption pairs for training. Since the captions in such datasets are manually annotated, they are not necessarily optimal for training text-to-image models. In this study, we propose a learning framework that trains Text-to-Image models while optimizing the captions used for training. Specifically, we introduce a model that outputs pseudo captions from images and alternately update the parameters of the model and the Text-to-Image model through bilevel optimization. In the experiment, we evaluate the effectiveness of bilevel optimization for learning Text-to-Image models as a preliminary effort.
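One generic way to write a bilevel problem of this kind is shown below. The symbols are assumptions introduced for illustration: $\theta$ denotes the Text-to-Image model parameters, $\phi$ the parameters of the pseudo-caption model, $c_\phi(x)$ the pseudo caption generated for image $x$, and $\mathcal{L}_{\text{inner}}$, $\mathcal{L}_{\text{outer}}$ the two objectives. The abstract does not specify the actual losses.

```latex
\begin{aligned}
\min_{\phi}\quad & \mathcal{L}_{\text{outer}}\bigl(\theta^{*}(\phi)\bigr) \\
\text{s.t.}\quad & \theta^{*}(\phi) = \arg\min_{\theta}\;
  \mathbb{E}_{x}\Bigl[\mathcal{L}_{\text{inner}}\bigl(\theta;\, x,\, c_{\phi}(x)\bigr)\Bigr]
\end{aligned}
```

In practice, alternating updates approximate this by taking a few gradient steps on $\theta$ with $\phi$ fixed, then a step on $\phi$ with $\theta$ fixed.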
-
Kazuhiro ONISHI, Taro WATANABE
Session ID: 1O5-GS-7-02
Published: 2023
Released on J-STAGE: July 10, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
A saliency map prediction system using ViNet is proposed to mitigate the loss of accuracy that depends on the speed of moving objects in saliency map prediction, taking into account the characteristics of the spatiotemporally interwoven video and still-image domains. By solving the problem of partial accuracy reduction in advertising videos with intense motion, and by outputting saliency maps that are closer to the human gaze, the system improves the accuracy of video advertising production and further enhances brand lift and recognition effects. Stable output is confirmed in a qualitative evaluation using test-produced video advertisements, and improved accuracy is obtained in a quantitative evaluation using multiple indicators. This study will further accelerate a new production flow for advertising videos that takes the viewer's perspective into account in advance.
-
Keisuke MAESAKO, Liang ZHANG
Session ID: 1O5-GS-7-03
Published: 2023
Released on J-STAGE: July 10, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
Recently, aerial communication platforms that can provide communication services from aircraft have been attracting attention. We are considering a new service that applies object detection technology to aerial images taken with an additional onboard camera. However, object detection in aerial images has the problem that an object's appearance changes depending on the positional relationship between the camera and the object. To solve this issue, we propose creating specialized object detection models by classifying the captured images into patterns, using automobiles as an example, and training a model for each pattern. Performance evaluation confirmed that the specialized object detection model can achieve more than 30% higher Average Precision than the general-purpose model. Therefore, we can expect to improve the accuracy of object detection in aerial images by using an appropriate specialized object detection model according to the positional relationship between the camera and the object.
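For readers unfamiliar with the metric, Average Precision can be sketched as below. This is a simple non-interpolated variant; detection benchmarks often use interpolated versions, and the abstract does not say which one was used. The detection lists are made-up toy values, not results from the paper.

```python
def average_precision(detections, num_ground_truth):
    """Non-interpolated Average Precision.

    `detections` is a list of (confidence, is_true_positive) pairs for one
    class; `num_ground_truth` is the number of annotated objects of that class.
    """
    if num_ground_truth == 0:
        return 0.0
    # Rank detections from most to least confident.
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    true_positives = 0
    precision_sum = 0.0
    for rank, (_, is_tp) in enumerate(ranked, start=1):
        if is_tp:
            true_positives += 1
            # Precision at the rank of each correctly detected object.
            precision_sum += true_positives / rank
    return precision_sum / num_ground_truth


# Hypothetical outputs of a general-purpose and a specialized detector.
general = [(0.9, True), (0.8, False), (0.7, True), (0.6, False)]
specialized = [(0.9, True), (0.8, True), (0.7, False), (0.6, True)]
ap_general = average_precision(general, 3)
ap_specialized = average_precision(specialized, 3)
```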
-
Taiho TAKEUCHI, Yoshifumi SEKI, Yoshinao SATO
Session ID: 1O5-GS-7-04
Published: 2023
Released on J-STAGE: July 10, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
In this study, we aim to apply machine learning techniques to first-person videos and perform a detailed analysis of the experimental results using the existing method, Ego-Exo. In recent years, machine learning research on first-person videos has become popular. However, detailed analysis of the output of prediction models has not been published much, and knowledge for practical application is lacking. The results of the analysis suggest two findings. Firstly, the performance of label prediction depends on the number of samples of each label. We found that labels with a large number of samples have high prediction performance. Secondly, label prediction performance is high for obvious actions and objects, and low for other labels. These findings are important for building datasets for domain-specific tasks.
-
Naoto TANJI, Toshihiko YAMASAKI
Session ID: 1O5-GS-7-05
Published: 2023
Released on J-STAGE: July 10, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
To produce effective online advertising, predicting its effectiveness in advance is of prime importance. Since the images of display advertisements distributed on the internet have various aspect ratios, the effectiveness of advertising images can be predicted more accurately by taking the aspect ratio of the images into account. We propose a Vision Transformer model that can handle images of arbitrary aspect ratios using relative position bias. We apply it to the task of click-through rate prediction using real advertising delivery data, and confirm its superiority over baseline models that resize images to a fixed aspect ratio.
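The reason a relative position bias suits arbitrary aspect ratios can be sketched as follows: the bias depends only on the offset between two patches, so one shared table can serve any patch grid that fits within its extent. The function below is an illustrative assumption in the style of window-based relative position indexing, not the model from the paper; `max_height` and `max_width` are hypothetical table extents.

```python
def relative_position_index(height, width, max_height, max_width):
    """Build an N x N table (N = height * width) of indices into a shared
    bias table of size (2 * max_height - 1) * (2 * max_width - 1).

    Entry [q][k] depends only on the (dy, dx) offset between patch q and
    patch k, so grids with different aspect ratios reuse the same biases.
    """
    if height > max_height or width > max_width:
        raise ValueError("patch grid exceeds the extent of the bias table")
    coords = [(y, x) for y in range(height) for x in range(width)]
    table_width = 2 * max_width - 1
    index = []
    for qy, qx in coords:
        row = []
        for ky, kx in coords:
            dy = qy - ky + (max_height - 1)  # shifted into 0 .. 2*max_height-2
            dx = qx - kx + (max_width - 1)   # shifted into 0 .. 2*max_width-2
            row.append(dy * table_width + dx)
        index.append(row)
    return index


# A wide (2 x 4) and a tall (4 x 2) patch grid sharing one 15 x 15 bias table.
wide = relative_position_index(2, 4, 8, 8)
tall = relative_position_index(4, 2, 8, 8)
```

In an attention layer, the looked-up bias would be added to the query-key logits before the softmax; that step is omitted here.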
-
Takuro FUKUDA, Shun SAWADA, Hidefumi OHMURA, Kouichi Fukuda KATSURADA, ...
Session ID: 1O5-GS-7-06
Published: 2023
Released on J-STAGE: July 10, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
Analysis of speech-imagery recognition using electroencephalogram (EEG) has been actively conducted, since it could become a strong direct-communication tool in brain-computer interfaces (BCI). In this report, we propose complex cepstrum-based accent discrimination from speech-imagery EEG signals. We first create a database of words with different accentuation, containing the intervals of imagined spoken syllables that are visually labeled from the line spectral patterns of EEG signals obtained after the pooling process of electrodes. Then, we construct an accent discriminator using the complex cepstrum calculated from the amplitude spectrum and phase spectrum of the EEG signals during speech imagery. An experiment on high and low pitch accents is conducted using three imagined words with the same syllables but different accentuation. In the recognition stage, two approaches, SM and CNN, are compared, and the classifier based on CNN shows high performance.
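For reference, the textbook definition of the complex cepstrum, which uses both the amplitude and the phase of the spectrum, is given below. Here $x[n]$ denotes a signal segment and $X(\omega)$ its discrete-time Fourier transform; how the study applies this to pooled EEG channels is not stated in the abstract.

```latex
\begin{aligned}
X(\omega) &= \sum_{n} x[n]\, e^{-j\omega n}, \\
\hat{X}(\omega) &= \ln\lvert X(\omega)\rvert + j\,\arg X(\omega)
  \quad \text{(with unwrapped phase)}, \\
\hat{x}[n] &= \frac{1}{2\pi}\int_{-\pi}^{\pi} \hat{X}(\omega)\, e^{j\omega n}\, d\omega .
\end{aligned}
```

Unlike the real cepstrum, which keeps only $\ln\lvert X(\omega)\rvert$, the complex cepstrum retains phase information.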
-
Yoshiko ARIMA
Session ID: 1P4-OS-16a-01
Published: 2023
Released on J-STAGE: July 10, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
Group decision prediction using MATLAB's LSTM network model was performed on the chat logs collected in mock jury experiments. LSTM is a deep-learning classification method for long-term dependency learning on sequence data. Participants in the experiment were given information about a fictitious murder case and made group decisions in 4-person discussion groups, choosing between "guilty," "not guilty," and "presumed innocent." The group decisions were predicted from the chats of each discussion group with 82.3% accuracy. However, a sufficient number of judgments of guilty, not guilty, and presumed innocent had to be included in the test data used for the training session. In this experiment, we set the conditions by the information distributed to each group member. An analysis using the LDA topic model was conducted to explore changes in the chat logs due to the information allocated to each member. An ANOVA revealed that the experimental conditions, "sharing (high/low) × information (alibi/dummy)," had effects on the topic proportions. Semantic memory (false memory), measured before and after the group decisions, was also correlated with topic proportions. Based on these findings, we will discuss the reliability and usefulness of using machine learning in experimental psychological research.
-
Case of Afghanistan’s first ever Internet-based Idea Competition on Solid Waste Management
Sofia SAHAB, Jawad HAQBEEN, Takayuki ITO
Session ID: 1P4-OS-16a-02
Published: 2023
Released on J-STAGE: July 10, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
This paper focuses on the problem of managing crowdsourced deliberation, with the aim of partitioning deliberation among a set of moderated democratic techniques over multiple sessions of deliberation in an idea contest. Specifically, we propose that large-scale deliberation should be performed separately over multiple sessions such that (1) large-scale deliberation is handled by machine agency first, (2) human experts then manage the medium-scale deliberation, and finally (3) the crowd is allowed to perform the small-scale deliberation. We first provide an overview of the hybrid deliberative initiative that is being used to facilitate deliberation in a real-world idea contest. Next, we conduct an actual idea contest in collaboration with a governing agency to compare the efficacy of each deliberation technique in managing the contest. The contest generated 14,587 opinions from 3,892 registered participants, collected in 2020 via an internet forum over three sessions, which were managed by machine ranking, human expert rating, and crowd voting tools, respectively. Our results demonstrate that our initiative manages meaningful deliberation in the civic process. Finally, we discuss the advantages of our initiative over previous approaches and advocate for governing agencies of idea contests to adopt our initiative for future large-scale idea contests.
-
Kazuhito MORI, Takayuki ITO
Session ID: 1P4-OS-16a-03
Published: 2023
Released on J-STAGE: July 10, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
The study focuses on the challenge of stagnant online discussions, especially with few participants, and presents a prototype discussion participation agent based on GPT to stimulate discussions. The agent’s functions include automatic topic generation and interaction with users in discussions. The effectiveness of the agent was examined in an online discussion experiment using D-Agree, a discussion platform developed in our laboratory. The results showed that the scores of the questionnaires on discussion satisfaction and ease of posting improved, suggesting the usefulness of this discussion participation agent. Future work includes adjusting the number of comments so that users are not discouraged from speaking, and verifying the effectiveness of introducing agents through large-scale experiments.
-
Wen GU, Shinobu HASEGAWA, Takayuki ITO
Session ID: 1P4-OS-16a-04
Published: 2023
Released on J-STAGE: July 10, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
In the divergence phase of a discussion, the accumulation of opinions from various perspectives becomes the foundation for leading the discussion to consensus. To promote opinion generation in online forums, facilitation is introduced with the aim of encouraging participants to generate as many posts as they can. Most online forum facilitation is conducted by a few independent human facilitators. Therefore, problems such as human bias and limited facilitation scalability become inevitable. To tackle these problems, automated facilitation support approaches such as experience-based approaches and argumentation-based approaches have been proposed. However, most of this research models the facilitation issues by focusing only on the discussion contents, while little research analyzes the facilitation paradigm from the perspective of what human facilitators consider. To fill this research gap, we propose modeling the human facilitator-based facilitation paradigm that aims to promote post generation in the divergence phase of online forums. The proposed models are validated using the facilitation generated in real-world online forum discussions involving human facilitators.
-
Based on a simulated citizen participation workshop concerning the treatment of removed soil outside Fukushima prefecture
Yukihide SHIBATA, Yume SOUMA, Mie TSUJIMOTO, Honami UE, Nana KIHARA, T ...
Session ID: 1P4-OS-16a-05
Published: 2023
Released on J-STAGE: July 10, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
In public deliberation, citizens are expected to discuss issues with reference to the common good, drawing on principles such as utilitarianism, the equality principle, and the maximin principle. The present study conducted a simulated citizen participation workshop concerning the treatment of removed soil outside Fukushima. The purpose of this study was to visualize the quality of the discussion using the Discourse Quality Index (DQI) and text mining. Results from the DQI showed that, among references to the common good, utilitarian perspectives such as minimizing risks and costs were mentioned most frequently, while few statements were made from the perspective of the maximin principle. In addition, many concerns about reputational damage and public understanding were raised. Text-mining analysis revealed that "Fukushima" did not appear many times, and that "safety" was strongly associated with recycling. These results indicate that the discussion in the workshop referred more to the risks and people's reactions to them than to the burdens on Fukushima residents.
-
Sora MATSUMOTO, Shun SHIRAMATSU, Takashi IWATA, Hidekazu AOSHIMA, Ekai ...
Session ID: 1P5-OS-16b-01
Published: 2023
Released on J-STAGE: July 10, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
Especially in Japan, most people tend to avoid participating in discussions and consensus building in society and organizations, which leads to inappropriate consensus building. In this study, we focus on self-efficacy as a means of guiding behavioral modification to solve this problem. We aim to improve efficacy by using a chatbot that automatically evaluates the user's opinion and provides feedback. In a preliminary discussion experiment, although the effect of the proposed method on efficacy was not statistically demonstrated, the results suggest that efficacy correlates with participation in discussions.
-
Takuya YOKOTA, Yuri NAKAO
Session ID: 1P5-OS-16b-02
Published: 2023
Released on J-STAGE: July 10, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
Because of the opacity involved in machine learning techniques, it is necessary to ensure accountability and fairness when machine learning is used in decision-support systems in government and business. The requirements for accountability and fairness depend on the values of the stakeholders affected by the system's decision-making. However, there is a lack of discussion on the appropriate outputs for each stakeholder. This paper proposes a framework for "Stakeholder-in-the-Loop Fair Decisions" to determine accountability and fairness requirements and discusses how to consider appropriate outputs for the four stakeholders. In addition, as an example of our efforts to extract the diverse values of stakeholders and integrate them into an output that all stakeholders agree on, we introduce our empirical study of stakeholders in job-matching AI through a crowdsourcing experiment. To ensure the accountability and fairness of job-matching AI, we explore the possibility of a system that numerically extracts stakeholders' values through questionnaires and explains who benefits/loses in the integrated output in the experiment.
-
Yihan DONG, Shiyao DING, Takayuki ITO
Session ID: 1P5-OS-16b-03
Published: 2023
Released on J-STAGE: July 10, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
The application of automated facilitators in the issue-based information system (IBIS) has proven sufficiently practical in the past few years, especially for guiding users to solve particular problems. However, since the original automated facilitator was designed only to respond to users’ existing posts in order to elicit other users’ opinions, two classic situations often occurred in past scenarios: discussions were insufficient because the participants were not familiar with each other, and some dominant participants led the discussion into a meta-discussion. Discussions that exhibit these phenomena are defined as non-inclusive discussions because not all participants can express themselves sufficiently. Therefore, functions for ice-breaking and for promoting the discussion according to its phase are necessary to create and reinforce an open, positive, and participative environment for inclusive discussion, to elicit more opinions, and to make discussions satisfying. This paper demonstrates the design, implementation, and results of automated facilitators that promote discussions more appropriately with the functions mentioned above. Building on the current design, a further combination with IBIS label classification and an algorithm for measuring discussion progress will be implemented.
-
Jawad HAQBEEN, Sofia SAHAB, Takayuki ITO
Session ID: 1P5-OS-16b-04
Published: 2023
Released on J-STAGE: July 10, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
Textual data has emerged as one of the fastest-growing data types on the internet. This development has led to significant advancements in the field of Natural Language Processing (NLP) in recent years, primarily driven by the utilization of Deep Learning (DL) and Machine Learning (ML) techniques. These methods are known to require copious amounts of labeled text data in a specific format and structure for model training purposes using some sort of dialogue mapping. For instance, node and link extractor models have been trained in D-Agree using text-based training data while adopting Issue-based Information System (IBIS) notation. However, training such models in English has been challenging due to the arduousness of preparing labeled IBIS English datasets. In this study, we present a process for annotating and releasing large quantities of training data for machine learning based on IBIS, providing researchers with a free environment to train their opinion extractor models in English.
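To make the idea of IBIS-labeled training data concrete, the sketch below shows one possible in-memory representation of an annotated discussion thread. The node types (`issue`, `idea`, `pro`, `con`), the class names, and the example texts are assumptions chosen for illustration; the actual annotation schema and file format of the released dataset are not described in the abstract.

```python
from dataclasses import dataclass, field
from typing import List

# Assumed IBIS-style node labels; the dataset's real label set may differ.
NODE_TYPES = ("issue", "idea", "pro", "con")


@dataclass
class IbisNode:
    node_id: int
    node_type: str
    text: str

    def __post_init__(self) -> None:
        if self.node_type not in NODE_TYPES:
            raise ValueError(f"unknown IBIS node type: {self.node_type}")


@dataclass
class IbisLink:
    child_id: int   # the responding node
    parent_id: int  # the node being responded to


@dataclass
class AnnotatedThread:
    nodes: List[IbisNode] = field(default_factory=list)
    links: List[IbisLink] = field(default_factory=list)

    def children_of(self, parent_id: int) -> List[IbisNode]:
        """Return the nodes that link directly to `parent_id`, in node order."""
        child_ids = {link.child_id for link in self.links if link.parent_id == parent_id}
        return [node for node in self.nodes if node.node_id in child_ids]


# Invented example thread: one issue, one idea, and two arguments about the idea.
thread = AnnotatedThread(
    nodes=[
        IbisNode(1, "issue", "How should the city reduce household waste?"),
        IbisNode(2, "idea", "Introduce paid garbage bags."),
        IbisNode(3, "pro", "Fees give households a reason to sort waste."),
        IbisNode(4, "con", "Fees may lead to illegal dumping."),
    ],
    links=[IbisLink(2, 1), IbisLink(3, 2), IbisLink(4, 2)],
)
```

A node extractor would be trained to predict `node_type` from `text`, and a link extractor to predict the `(child_id, parent_id)` pairs.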
-
Differences in DQI Scores by Socio-demographic Characteristics
Tomoyuki TATSUMI, Takashi NAKAZAWA
Session ID: 1P5-OS-16b-05
Published: 2023
Released on J-STAGE: July 10, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
This paper examines fairness in mini-public deliberations, where randomly selected citizens discuss public issues. While they have gained popularity as a practice of deliberative democracy, it's unclear if participants can discuss equally. In this study, discussion data from a deliberative event modeled after a deliberative poll in 2019 were coded using an evaluation index based on the Discourse Quality Index (DQI). In addition to the number of statements, individual DQI scores were used as an index of deliberative ability, and differences in mean values by social attribute (gender, generation, and educational background) were examined. Results showed no generation differences, but higher educational backgrounds led to higher DQI scores for 'Opinions.' Females had higher DQI scores for 'Opinions,' 'Reasons,' and 'Emotion,' despite making fewer statements than males.
-
Masaru HIRAKATA, Chong MA, Kiwamu KASE
Session ID: 1Q3-OS-7a-01
Published: 2023
Released on J-STAGE: July 10, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
In order to realize DX in the manufacturing industry, in addition to improving operational efficiency through the introduction of third-generation AI (Narrow AI), collaborative AI (Artificial General Intelligence) is expected that enables dialogue with humans and dialogue with literature and data (subjective reading by AI). Although current AI has become able to respond habitually (based on statistical relationships), it is not yet at the stage where it can proactively learn and hold a dialogue based on meaning. The abilities and functions required of fourth-generation AI, which goes beyond the problem-solving of third-generation AI (Narrow AI), are the proactive actions performed by humans, i.e., predictive actions, planning actions, and interpretation (reading while supplementing with images and knowledge). The aim is to equip AI with so-called intelligence (mainly non-cognitive skills). In this paper, we systematically organize behavior and learning (neural network architecture) from the perspective of intelligence development. We report on the prospects of fourth-generation AI (neural network system) models.
-
General information processing x Entification-related processing
Hiroshi YAMAKAWA, Yutaka MATSUO
Session ID: 1Q3-OS-7a-02
Published: 2023
Released on J-STAGE: July 10, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
At the beginning of the 21st century, the research goal of general-purpose artificial intelligence (AGI), the ability to solve various tasks in a single system, is widely accepted. However, the weakness of this goal is that it is difficult to define the full range of tasks to be performed, so it is impossible to set sufficient conditions for the completion of AGI. Therefore, this study proposes a framework of a comprehensive technology map as a feasible goal for advanced AI, with the completion of AGI as a sufficient condition for realization. Next, Entification-related processing and general information processing are positioned as essential components of the world model. Then, by mapping McKinsey's business-related capabilities to the processing in the above world model, we propose a comprehensive technology map based on the processing in the world model. By realizing all the various Entification-related processing and general-purpose information processing on this map, it will be possible to develop human-level AI.
-
Kyohei KAWAGUCHI, Daiki SHIMOKAWA, Satoshi KURIHARA
Session ID: 1Q3-OS-7a-03
Published: 2023
Released on J-STAGE: July 10, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
Artifacts around us are made top-down, constructed gradually from small parts based on a blueprint. On the other hand, in the hierarchical structure of size scales, the human body is made bottom-up: large individuals gradually emerge by chance from interactions of small individuals, as in "cell → tissue → organ → human". In a 3D virtual environment, we propose a model of a virtual creature composed of cubes which extends its body and emerges in the upper layers. A virtual creature joins, rotates, and copies bodies that have already emerged in order to emerge in the upper layers efficiently. Tasks of exponentially increasing difficulty are set, and virtual creatures emerge in the upper layers when they complete them. As a result, a virtual creature's body grew exponentially according to the hierarchical structure of the environment, and we created a virtual world with a hierarchical structure of size scales like the real world. In the future, we need to establish and investigate methods to optimize body functions and behaviors.
-
Extraction of graphical structures from brain schematics in the literature
Iriya HORIGUCHI, Yuta ASHIHARA, Hiroshi YAMAKAWA
Session ID: 1Q3-OS-7a-04
Published: 2023
Released on J-STAGE: July 10, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
Brain Reference Architecture (BRA), an approach to developing brain-based software, uses Brain Information Flow (BIF), which extracts mesoscopic-level information about brain anatomy, to design a functional component diagram (HCD). While vast and varied anatomically consistent findings need to be extracted comprehensively, BIF is currently extracted manually from the neuroscience literature. Therefore, in this paper, we develop an object detection model that extracts connections between brain regions from the neuroscience literature. In our experiments, we achieved an mAP of 0.899 for the position of text representing the names of brain regions, and an mAP of 0.856 for the arrows representing the connections between brain regions.
-
Shunsuke OTAKE, Katsuyoshi MAEYAMA, Shoichi HASEGAWA, Takeshi NAKASHIM ...
Session ID: 1Q3-OS-7a-05
Published: 2023
Released on J-STAGE: July 10, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
We construct and implement a concrete computational model based on a hippocampal formation-inspired probabilistic generative model (HF-PGM) and evaluate the effectiveness of the proposed model. HF-PGM does not specify the architecture or probability distribution of the model. In this study, we propose a probabilistic generative model consistent with HF-PGM by integrating the Recurrent State-Space Model (RSSM), one of the world models, with a Simultaneous Localization and Mapping (SLAM) model based on an occupancy grid map. Global localization was performed in a simulated environment to evaluate its performance in experiments. We showed that the proposed model improves performance over conventional self-localization methods. We also evaluated the performance of the integrated world model on location categorization using a latent space representation.
-
Naoto YOSHIDA, Hosninori KANAZAWA, Yasuo KUNIYOSHI
Session ID: 1Q4-OS-7b-01
Published: 2023
Released on J-STAGE: July 10, 2023
CONFERENCE PROCEEDINGS
FREE ACCESS
Homeostasis is a fundamental property of animals that maintains the body's internal state. Homeostatic reinforcement learning (homeostatic RL) has been used to study how behaviors emerge from homeostasis, but previous studies have been limited to small-scale problems. This study focuses on scaling up homeostatic RL to enable the emergence of behaviors in high-dimensional input and continuous motor control. Deep RL is applied to the homeostatic RL domain, and the most effective reward setting is identified. An attention mechanism is also incorporated into the policy network structure to facilitate learning of appropriate behavior based on the body's internal state. This work provides insights into how homeostasis can be used to explain animal behavior and how homeostatic RL can be applied to more complex problems.
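As background, one commonly used reward in homeostatic RL is drive reduction: the agent is rewarded when its internal state moves closer to a setpoint. The sketch below illustrates that formulation only. The abstract does not say which reward setting the study found most effective, and the function names, the squared-distance drive, and the two-variable internal state are assumptions made for this example.

```python
def drive(internal_state, setpoint):
    """Squared distance of the internal state from its homeostatic setpoint."""
    return sum((h - s) ** 2 for h, s in zip(internal_state, setpoint))


def drive_reduction_reward(state_before, state_after, setpoint):
    """Reward is positive when the action moved the body closer to the setpoint."""
    return drive(state_before, setpoint) - drive(state_after, setpoint)


# Hypothetical two-variable internal state, e.g. (energy, temperature) deviations.
setpoint = [0.0, 0.0]
hungry = [-0.5, 0.0]
after_eating = [-0.25, 0.0]

reward = drive_reduction_reward(hungry, after_eating, setpoint)
penalty = drive_reduction_reward(after_eating, hungry, setpoint)
```

Here eating yields a positive reward and the reverse transition an equal negative one, which is what lets ordinary RL algorithms learn behavior that maintains the internal state.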