Proceedings of the Annual Conference of JSAI

High-speed multivariate time series prediction using Echo State Network

Kazuki OTAKE, Jun ROKUI

Session ID: 3E4-GS-2-02
Published: 2022
Released on J-STAGE: July 11, 2022

DOI: https://doi.org/10.11517/pjsai.JSAI2022.0_3E4GS202

CONFERENCE PROCEEDINGS FREE ACCESS

Recently, time series analysis using machine learning has been actively pursued and applied in various fields. Real-time prediction is important for data such as stock prices and traffic conditions. However, many time-series prediction models perform large-scale learning on large amounts of data, so their computational costs are high and they are impractical for this purpose. In this research, we propose a time-series prediction method using an Echo State Network, which is capable of rapid learning. Experiments confirmed that a fast, high-performance learning model can be constructed by applying an Echo State Network to multivariate time series.
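As a rough illustration of why an Echo State Network trains quickly, the sketch below fixes a random reservoir and fits only a linear readout by ridge regression. It is not the authors' implementation: the reservoir size, spectral radius, input scaling, ridge strength, and the toy sine/cosine series are all assumptions chosen for the example.

```python
import numpy as np

def make_reservoir(n_inputs, n_units, spectral_radius=0.9, input_scale=0.5, seed=0):
    """Draw fixed random input and recurrent weights (never trained)."""
    rng = np.random.default_rng(seed)
    w_in = rng.uniform(-input_scale, input_scale, size=(n_units, n_inputs))
    w = rng.uniform(-1.0, 1.0, size=(n_units, n_units))
    # Rescale so the largest eigenvalue magnitude equals spectral_radius.
    w *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w)))
    return w_in, w

def run_reservoir(w_in, w, inputs):
    """Drive the reservoir with the input sequence and collect its states."""
    states = np.zeros((len(inputs), w.shape[0]))
    x = np.zeros(w.shape[0])
    for t, u in enumerate(inputs):
        x = np.tanh(w_in @ u + w @ x)
        states[t] = x
    return states

def fit_readout(features, targets, ridge=1e-6):
    """Ridge regression: the only trained part of the network."""
    gram = features.T @ features + ridge * np.eye(features.shape[1])
    return np.linalg.solve(gram, features.T @ targets)

# Toy multivariate series (assumption): predict the next (sin, cos) pair.
t = np.arange(1500) * 0.1
series = np.column_stack([np.sin(t), np.cos(t)])
inputs, targets = series[:-1], series[1:]

w_in, w = make_reservoir(n_inputs=2, n_units=100)
states = run_reservoir(w_in, w, inputs)
# Readout features: bias term, raw input, and reservoir state.
feats = np.hstack([np.ones((len(inputs), 1)), inputs, states])

washout, split = 100, 1200  # discard initial transient; hold out the tail
w_out = fit_readout(feats[washout:split], targets[washout:split])
pred = feats[split:] @ w_out
test_mse = float(np.mean((pred - targets[split:]) ** 2))
```

Training here is a single linear solve over the collected states, with no backpropagation through time, which is the property that makes this family of models attractive when learning speed matters.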

Comparing Accuracy of Time Series Forecasting Methods

Junichi SEKITANI, Harumi MURAKAMI

Session ID: 3E4-GS-2-03
Published: 2022
Released on J-STAGE: July 11, 2022

DOI: https://doi.org/10.11517/pjsai.JSAI2022.0_3E4GS203

CONFERENCE PROCEEDINGS FREE ACCESS

It is difficult to decide which model or method should be chosen to accomplish the task of time series forecasting. The purpose of this research is to create a simple experimental framework for selecting time series forecasting methods by employing an optimal balance of statistical and machine learning models as representative methods. We adopted benchmarks from the M4 Competition and added gradient boosting and other methods commonly used in machine learning competitions. Accordingly, experiments were conducted to compare the accuracy of time series forecasting methods using data from the M4 Competition.

Evaluating Out-of-Distribution Detection Using Deep-Learning Based Methods on Time-Series Data

Daichi KIMURA, Tomonori IZUMITANI, Kenichiro SHIMADA, Kenji KASHIMA

Session ID: 3E4-GS-2-04
Published: 2022
Released on J-STAGE: July 11, 2022

DOI: https://doi.org/10.11517/pjsai.JSAI2022.0_3E4GS204

CONFERENCE PROCEEDINGS FREE ACCESS

It is necessary to detect out-of-distribution time-series data because a difference in data distribution between training and operation may affect the estimation results. The autoencoder is one of the best-known methods for out-of-distribution detection. However, in recent years, experiments using images have shown that autoencoder-based methods often fail because of undesirable reconstruction of the out-of-distribution data. To deal with this problem, many approaches using adversarial generative models have been proposed. Most of these methods have been evaluated on image data, and their out-of-distribution detection performance on time-series sensor data has not been fully explored. In this study, we evaluate and discuss the performance of the methods on artificially generated data and real time-series data.

Causal Discovery for Nonstationary Nonlinear Time Series Data Using Just-In-Time Modeling

Daigo FUJIWARA, Kazuki KOYAMA, Keisuke KIRITOSHI, Tomomi OKAWACHI, Tom ...

Session ID: 3E4-GS-2-05
Published: 2022
Released on J-STAGE: July 11, 2022

DOI: https://doi.org/10.11517/pjsai.JSAI2022.0_3E4GS205

CONFERENCE PROCEEDINGS FREE ACCESS

Causal discovery from multivariate time series data is becoming important as the analysis of IoT data increases. However, it is not easy to identify the causal structure of such data with conventional linear causal discovery methods because of non-stationarity and distribution shifts. The application of non-linear methods is also limited by their computational complexity. To address these problems, we propose a causal discovery method based on the Linear Non-Gaussian Acyclic Model (LiNGAM) and the Just-In-Time (JIT) framework. The method estimates a local linear structural causal model from neighboring samples of the past data every time a new input sample is given. We show the effectiveness of the method by numerical experiments using artificial data with non-stationarity and non-linearity.
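The Just-In-Time idea of fitting a fresh local linear model around each new sample can be sketched as follows. This is only an illustration of the neighbor-selection step: it substitutes ordinary least squares for LiNGAM, works on a single variable, and the function name `local_linear_fit` and the toy quadratic data are hypothetical.

```python
import numpy as np

def local_linear_fit(past_x, past_y, query, k):
    """Fit y = a + b*x using only the k past samples nearest to the query."""
    past_x = np.asarray(past_x, dtype=float)
    past_y = np.asarray(past_y, dtype=float)
    # Just-In-Time step: pick neighbors of the new sample from the stored data.
    nearest = np.argsort(np.abs(past_x - query))[:k]
    design = np.column_stack([np.ones(k), past_x[nearest]])
    coef, *_ = np.linalg.lstsq(design, past_y[nearest], rcond=None)
    return coef  # (intercept, slope) valid only near the query

# Toy nonlinear relationship (assumption): y = x^2, stored as "past data".
x = np.linspace(-2.0, 2.0, 401)
y = x ** 2

intercept, slope = local_linear_fit(x, y, query=1.0, k=21)
```

Around the query point 1.0 the local slope comes out close to 2, the derivative of the quadratic there, even though no single global linear model fits the data; this is how a linear model can remain usable on non-linear, non-stationary data.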

Development of Non-Contact User Interface by Hand Gesture Recognition Using Deep Learning

Daiki ISHIGURO, Tomoko OZEKI

Session ID: 3F3-GS-9-01
Published: 2022
Released on J-STAGE: July 11, 2022

DOI: https://doi.org/10.11517/pjsai.JSAI2022.0_3F3GS901

CONFERENCE PROCEEDINGS FREE ACCESS

In this study, we develop a non-contact user interface that recognizes hand gestures from images captured by a monocular RGB camera and uses them to operate a web application. Prior UIs use infrared sensors and motion sensors to operate objects without touching the device; we instead aim to realize a gesture manipulation system that can be easily operated on general-purpose mobile devices using only a monocular RGB camera. First, we collect the coordinates of each hand joint detected by MediaPipe, a machine learning library, as training data, and classify them into several gestures by deep learning. The recognized hand gestures are then used to manipulate the map displayed in the application. We also trained the system using different networks, such as MLP, CNN, and LSTM, to verify the accuracy of each network and to select the most suitable one. We achieved 94% gesture recognition accuracy with LSTM and built an NUI system that is available on mobile devices.

Motivational effect of traditional culture experience using AR Tosenkyo application

Kenta MIZOBUCHI, Hung-ya SAI, Yuya IEIRI, Reiko HISHIYAMA

Session ID: 3F3-GS-9-02
Published: 2022
Released on J-STAGE: July 11, 2022

DOI: https://doi.org/10.11517/pjsai.JSAI2022.0_3F3GS902

CONFERENCE PROCEEDINGS FREE ACCESS

In this paper, we show that augmented reality (AR) technology increases the motivation to experience traditional culture. We developed an AR Tosenkyo application that lets users experience the traditional Japanese game of "Tosenkyo" and conducted experiments with 47 participants, including foreigners. Since the components of the AR Tosenkyo application can be broken down into play equipment, backgrounds, etc., we prepared three applications with different degrees of AR environmental conditions to investigate whether the differences affect motivation. The results of the experiment showed that, before the AR experience, Japanese participants were relatively less motivated to experience traditional culture than Chinese participants, but after the AR experience their motivation increased significantly. This tendency was more pronounced in a harmonic AR environment, which provided a moderate fusion of reality and AR. Furthermore, path analysis of the questionnaire results showed that the higher the quality of the AR application, the more effectively it increased motivation.

Uniform Test Assembly using Zero-suppressed Binary Decision Diagrams

Kazuma FUCHIMOTO, Shin-ichi MINATO, Maomi UENO

Session ID: 3F3-GS-9-03
Published: 2022
Released on J-STAGE: July 11, 2022

DOI: https://doi.org/10.11517/pjsai.JSAI2022.0_3F3GS903

CONFERENCE PROCEEDINGS FREE ACCESS

Recently, the necessity of “uniform test forms” for which each form comprises a different set of items but still has equivalent measurement accuracy has been emerging. An important issue for uniform test assembly is to assemble as many uniform tests as possible. Although many automatic uniform test assembly methods exist, the maximum clique using the integer programming method is known to assemble the greatest number of uniform tests with the highest measurement accuracy. However, the method requires one month or more to assemble 450,000 tests due to the high time complexity of integer programming. This study proposes a new uniform assembly using zero-suppressed binary decision diagrams (ZDD). A ZDD is a graphical representation for a set of item combinations. This is derived by reducing a binary decision tree. In the proposed method, each node in the binary decision tree corresponds to an element of an item bank and has two edges if the item (node) is contained in a uniform test. Furthermore, all equivalent nodes (having the same measurement accuracy and the same test length) are shared. Finally, the proposed method can assemble 450,000 tests within 24 hours.

Recognition of Tapping Motion Based on Wrist Electromyography

Yuki SAITO, Momoyo ITO, Shin-ichi ITO, Minoru FUKUMI

Session ID: 3F3-GS-9-04
Published: 2022
Released on J-STAGE: July 11, 2022

DOI: https://doi.org/10.11517/pjsai.JSAI2022.0_3F3GS904

CONFERENCE PROCEEDINGS FREE ACCESS

In recent years, with the growth of the information society, research on human interfaces has been actively conducted. Surface electromyography (EMG) is a field in which various applied studies are being carried out for many kinds of human interfaces. In this paper, with the aim of developing an object manipulation system based on 5-finger tapping, we classified and identified finger movements from EMG signals measured at the wrist using sensors. A sensor must be attached to measure surface electromyography. With a view to introducing the system into daily life, we considered a dry-type sensor the best choice, because it yields a device that is easy to put on and take off and has a low running cost. In the preprocessing stage, outlier processing, FFT, noise removal, and smoothing were performed. The accuracy was then calculated by machine learning using an SVM. As a future task, it is necessary to improve the identification accuracy.

Negative bias in human cognitive judgment about AI-generated information and its underlying neural mechanisms

Satoshi NISHIDA

Session ID: 3F4-OS-23-01
Published: 2022
Released on J-STAGE: July 11, 2022

DOI: https://doi.org/10.11517/pjsai.JSAI2022.0_3F4OS2301

CONFERENCE PROCEEDINGS FREE ACCESS

Despite recent remarkable advances in artificial intelligence (AI), people's negative image of AI has not been dispelled. This study investigates whether such a negative image degrades human preference for AI-synthesized visual information independently of its appearance. To this end, experimental participants rated the attractiveness of various faces, synthesized by a generative adversarial network, under a false instruction that half of the faces were synthetic and the other half were real. This design makes it possible to evaluate the effect of the participants’ belief itself on their attractiveness ratings. The results show that instructing participants that faces were synthetic not only reduced attractiveness ratings but also changed neural activation patterns in widespread cortical regions. This finding provides behavioral and neural evidence to support the notion that human preference for visual information is negatively biased depending solely on the belief that the information is AI-synthesized.

How can we understand AI as a model of human cognition?

Shohei HIDAKA

Session ID: 3F4-OS-23-02
Published: 2022
Released on J-STAGE: July 11, 2022

DOI: https://doi.org/10.11517/pjsai.JSAI2022.0_3F4OS2302

CONFERENCE PROCEEDINGS FREE ACCESS

Artificial intelligence is a foundational technology applied throughout modern society. However, deep learning models, the core technique of modern AI, lack human readability and comprehensibility of their internal structure. I consider representational arbitrariness in the AI model as a universal function approximator, and point out how human cognition and its biases play a crucial role in choosing a natural representation for a given task. As many AI models are inspired by and mimic human cognition, I hypothesize that understanding cognitive computation would lead to a principled design of a new AI architecture that learns both the data and how the data should be preprocessed.

Analysis of human behavioral strategies for agents with active and passive strategies

Kensuke MIYAMOTO, Norifumi WATANABE, Yoshiyasu TAKEFUJI, Osamu NAKAMUR ...

Session ID: 3F4-OS-23-03
Published: 2022
Released on J-STAGE: July 11, 2022

DOI: https://doi.org/10.11517/pjsai.JSAI2022.0_3F4OS2303

CONFERENCE PROCEEDINGS FREE ACCESS

In humans' cooperative behavior, there are two types of behavioral strategies: passive behavioral strategies based on the others, and active behavioral strategies based on the objective-first. In order to realize a robot that can use different strategies and communicate like a person, we created an agent that can switch between active and passive strategies. However, it is not clear whether people change their own behavioral strategies according to each strategy. In this study, we conducted an experiment in which agents with multiple strategies of actively giving way and passively giving way passed each other in a grid-like space, and analyzed whether people's behavior changed when the agents' strategies changed. The results show that, in addition to subjects who change their own behavior in response to changes in the agent's strategy, there are also subjects who behave in a certain way regardless of the agent's strategy and subjects whose behavior is not clearly divided.

Possibilities and ethics of artificial intelligent agents for facilitating social relationships

Hiro Taiyo HAMADA, Ryota KANAI

Session ID: 3F4-OS-23-04
Published: 2022
Released on J-STAGE: July 11, 2022

DOI: https://doi.org/10.11517/pjsai.JSAI2022.0_3F4OS2304

CONFERENCE PROCEEDINGS FREE ACCESS

Well-being is a critical factor for our lives and societies to flourish. There is a new movement, so-called well-being AI, in which AI intervenes in well-being. Previous applications of well-being AI have targeted individual well-being. However, there has been little research and development of well-being AI that intervenes in groups, even though human relationships within groups such as organizations and social clubs are also known to contribute to well-being. In this presentation, we will summarize the literature on well-being AI and discuss the potential and ethical issues of AI agents that intervene in group relationships through a mediative approach. Through these discussions, we will revisit the issue of the symbiosis between AI agents and humans.

Consciousness and AI

Takuya NIIKAWA

Session ID: 3F4-OS-23-05
Published: 2022
Released on J-STAGE: July 11, 2022

DOI: https://doi.org/10.11517/pjsai.JSAI2022.0_3F4OS2305

CONFERENCE PROCEEDINGS FREE ACCESS

This paper addresses the issues over the possibility that AI and robots have consciousness. In particular, I discuss how we can examine whether AI and robots have consciousness, what kinds of consciousness they may have, and how much moral consideration should be given to them.

Exploring a virtue-ethics approach to AI ethics

Katsunori MIYAHARA

Session ID: 3F4-OS-23-06
Published: 2022
Released on J-STAGE: July 11, 2022

DOI: https://doi.org/10.11517/pjsai.JSAI2022.0_3F4OS2306

CONFERENCE PROCEEDINGS FREE ACCESS

This paper aims to introduce the basic ideas of “virtue ethics” and explore how they might contribute to AI ethics. It presents philosopher Shannon Vallor’s virtue-ethical study into the ethical implications of care robots as a case study and holds that the virtue ethics perspective can contribute to AI ethics in two respects: (1) it can identify ethical problems related to the development and implementation of AI systems that are easily overlooked from other ethical standpoints; (2) it can provide a principled guide to the development and implementation of AI systems. It concludes by considering this perspective’s implications for future investigations into the ideal form of co-existence between humans and AI.

Estimation of Cognitive Function from Driving Data

Ryusei KIMURA, Takahiro TANAKA, Yuki YOSHIHARA, Kazuhiro FUJIKAKE, Hit ...

Session ID: 3G3-OS-15a-02
Published: 2022
Released on J-STAGE: July 11, 2022

DOI: https://doi.org/10.11517/pjsai.JSAI2022.0_3G3OS15a02

CONFERENCE PROCEEDINGS FREE ACCESS

Traffic accidents by older drivers due to cognitive decline have become a serious problem. Driving assistance systems that support the driver by adapting to individual cognitive functions can provide appropriate feedback and prevent traffic accidents. To realize such systems, we developed a regression model to estimate a driver's cognitive function from on-road driving data. First, we segment driving time-series data by two road types, namely arterial roads and intersections, to take driving situations into account. Second, we further segment the data into many sequences of various durations. Finally, statistics are calculated from each sequence and used as input features of machine learning models. Our method can capture important driving behaviors of various durations. The experimental results show that our model can predict scores of Trail Making Test B and Useful Field of View test with $r$ of 0.747 and 0.634, respectively. Additionally, we reveal which sensors and road types are important for estimation.
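The second and third steps, cutting a signal into sequences of several durations and summarizing each with statistics, can be sketched as below. The window lengths, the chosen statistics (mean, standard deviation, minimum, maximum), the non-overlapping default, and the toy speed signal are assumptions for illustration; the abstract does not specify them.

```python
import numpy as np

def window_statistics(signal, window_lengths, step=None):
    """Summarize a 1-D signal over windows of several lengths.

    Returns one tuple per window:
    (length, start index, mean, std, min, max).
    """
    signal = np.asarray(signal, dtype=float)
    rows = []
    for length in window_lengths:
        hop = step or length  # default: non-overlapping windows
        for start in range(0, len(signal) - length + 1, hop):
            seg = signal[start:start + length]
            rows.append((length, start, seg.mean(), seg.std(), seg.min(), seg.max()))
    return rows

# Toy vehicle-speed samples (hypothetical values).
speed = np.array([0, 2, 4, 6, 8, 10, 12, 14], dtype=float)
features = window_statistics(speed, window_lengths=[4, 8])
```

With these inputs the function yields two windows of length 4 and one of length 8, so short and long behaviors are described by separate feature rows that a regression model can consume.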

Preliminary Investigation of Using Crowd-sourced Photos with Wi-Fi Signals for Predicting Indoor Location Class

Teerawat KUMRAI, Takuya MAEKAWA, Kazuya OHARA, Yizhe ZHANG, Joseph KOR ...

Session ID: 3G3-OS-15a-03
Published: 2022
Released on J-STAGE: July 11, 2022

DOI: https://doi.org/10.11517/pjsai.JSAI2022.0_3G3OS15a03

CONFERENCE PROCEEDINGS FREE ACCESS

Due to the recent evolution and proliferation of smartphones and social network services (SNS), a huge number of images taken by smartphones at various places have been uploaded to SNS. Furthermore, various sensors in smartphones, such as camera and Wi-Fi modules, enable us to easily generate a camera image associated with the sensory information that represents the context in which the image was taken. Therefore, this work investigates a method that uses camera images associated with Wi-Fi signal strength information to predict the indoor location class in shopping complexes. Our method first estimates the store at which a camera image was taken by analyzing the image and web images of branch stores of store chains. Then, the floor plan is used to determine the 2D coordinates of the images taken at branch stores. A transformation function that maps Wi-Fi signals onto the 2D coordinates is then constructed using the Wi-Fi signals of the branch store images and their estimated 2D coordinates. The function is used to predict the indoor location class of images associated with Wi-Fi signals. Moreover, our transformation function has novel features for addressing the non-linearity of the Wi-Fi space, generating virtual Wi-Fi scans on the floor, and training on unlabeled Wi-Fi signals.

Anisotropic estimation of station market areas by supervised learning using geospatial information and IC commuter pass data

Yohei KODAMA, Yuki AKEYAMA, Yusuke MIYAZAKI, Koh TAKEUCHI

Session ID: 3G3-OS-15a-04
Published: 2022
Released on J-STAGE: July 11, 2022

DOI: https://doi.org/10.11517/pjsai.JSAI2022.0_3G3OS15a04

CONFERENCE PROCEEDINGS FREE ACCESS

When estimating demand for new stations, railroad companies estimate the station coverage area, a type of trade area in which a station is used by the people around it. Conventionally, the station coverage area is estimated using statistical data, but this method does not consider spatial anisotropy and results in significant errors. Recently, with the advent of IC commuter pass services, large-scale spatial data has become available, and we expect that station coverage can be estimated accurately with finer spatial granularity. In this study, we define station coverage based on the number of IC commuter pass holders per zip code, and formulate station coverage estimation as a prediction problem. We propose a method for estimating the station coverage area by supervised learning using geospatial information such as the time required to get from a zip code to a nearby station and the geographic relationship. Experimental results show that our proposed method reduces the estimation errors.

Multi-objective Deep Reinforcement Learning for Crowd Guidance Policy Optimization

Ryo NISHIDA, Yuki TANIGAKI, Masaki ONISHI, Koichi HASHIMOTO

Session ID: 3G4-OS-15b-01
Published: 2022
Released on J-STAGE: July 11, 2022

DOI: https://doi.org/10.11517/pjsai.JSAI2022.0_3G4OS15b01

CONFERENCE PROCEEDINGS FREE ACCESS

The objective of this study is to improve Multi-Objective Deep Reinforcement Learning (MODRL) for optimizing crowd guidance strategies. In general, MODRL methods are classified into Outer-loop methods and Inner-loop methods. In the former, multiple objective functions are transformed into a single objective using a scalarization function, and the Pareto front, which is the optimal solution set, is obtained by repeatedly updating the weights of the scalarization function and performing single-objective optimization. However, in this method, if the computational cost of single-objective optimization is high, the overall computational cost increases in proportion to the number of weight updates. The latter, the Inner-loop method, is instead designed to learn the Pareto front in a single learning process. In this study, we examine the approximation of the Pareto solution by different action selection methods of Pareto-DQN, a typical Inner-loop method. In the experiments, we evaluate the proposed method using a benchmark problem and finally discuss its application to the optimization of crowd guidance strategies.
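The Outer-loop idea can be sketched on a toy problem: sweep the weights of a scalarization function, solve one single-objective problem per weight vector, and collect the winners as an approximation of the Pareto front. In this sketch the expensive reinforcement learning run is replaced by a simple maximum over a fixed list of candidate returns, the scalarization is assumed to be a weighted sum, and all function names and numbers are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and better in at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(points):
    """Keep the points that no other point dominates (maximization)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

def weighted_sum(point, weights):
    """Linear scalarization of a multi-objective return."""
    return sum(w * x for w, x in zip(weights, point))

def outer_loop(points, weight_grid):
    """One single-objective 'optimization' per weight vector.

    Here the optimization is just a max over candidates; in MODRL it
    would be a full training run, which is why cost grows with the
    number of weight vectors.
    """
    found = []
    for weights in weight_grid:
        best = max(points, key=lambda p: weighted_sum(p, weights))
        if best not in found:
            found.append(best)
    return found

# Candidate two-objective returns (hypothetical values).
returns = [(1.0, 5.0), (2.0, 4.0), (3.0, 1.0), (2.0, 2.0), (2.4, 2.4), (0.5, 0.5)]
front = pareto_front(returns)
weight_grid = [(w / 4, 1 - w / 4) for w in range(5)]
scalarized = outer_loop(returns, weight_grid)
```

In this example the true front contains four points, while the weighted-sum sweep recovers only three: (2.4, 2.4) is non-dominated but lies in a non-convex part of the front, so no linear weighting selects it. That limitation belongs to linear scalarization in general and is separate from the computational cost discussed in the abstract.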

A basic study on the mechanism of group behavior of wild bats using movement pattern measurement and Granger causality during nesting

Kazusa USHIO, Emyo FUJIOKA, Keisuke FUJI, Hitoshi HABE, Hiroaki KAWASH ...

Session ID: 3G4-OS-15b-02
Published: 2022
Released on J-STAGE: July 11, 2022

DOI: https://doi.org/10.11517/pjsai.JSAI2022.0_3G4OS15b02

CONFERENCE PROCEEDINGS FREE ACCESS

Bats recognize their surrounding environment by processing the echoes of ultrasonic waves that they emit. Many species of bats live in groups, and many individuals emerge together from roosts. In this study, we used high-sensitivity video cameras to measure the flight trajectories of bats emerging from a cave in three dimensions, and investigated their flight paths. As a result, we found three behavioral patterns during emergence: exiting the cave, returning to the cave, and some other action. In addition, we applied the Granger causality method (Fujii et al., NeurIPS'21) to analyze the swarm behavior mechanism of emerging bats. The results showed that forward individuals flew in such a way that they were "repulsed" from or "approached" the other individuals. This suggests that bats, which use sound to understand their environment, are also influenced by individuals behind them, which cannot be captured visually, and that bats have a unique swarming mechanism that differs from that of the model animals for group behavior, which are mainly visual animals.

Diversity of behavioral strategy in cooperative hunting using multi-agent deep reinforcement learning

Kazushi TSUTSUI, Kazuya TAKEDA, Keisuke FUJII

Session ID: 3G4-OS-15b-03
Published: 2022
Released on J-STAGE: July 11, 2022

DOI: https://doi.org/10.11517/pjsai.JSAI2022.0_3G4OS15b03

CONFERENCE PROCEEDINGS FREE ACCESS

Cooperative hunting is a widespread form of cooperation in nature, and it is known that the level of organization of this predation varies among species. However, how cooperative forms of predation have evolved and been maintained is not well understood. In this study, we addressed this issue using a multi-agent simulation based on deep reinforcement learning. We examined changes in behavioral strategies when varying factors that previous observations in nature have suggested are associated with predation forms, and found that the highest level of organization, with role division among individuals, emerged under the combined conditions of two factors: difficulty of prey capture, and food (reward) sharing. These results suggest that sophisticated predation forms, which have been thought to require high cognition, can evolve from relatively simple cognitive and learning mechanisms, and emphasize the close link between the predation form and the environment where the organism lives.

Estimating the Effect of Team Hitting Strategies Using Counterfactual Virtual Simulation in Baseball

Hiroshi NAKAHARA, Kazuya TAKEDA, Keisuke FUJII

Session ID: 3G4-OS-15b-04
Published: 2022
Released on J-STAGE: July 11, 2022

DOI: https://doi.org/10.11517/pjsai.JSAI2022.0_3G4OS15b04

CONFERENCE PROCEEDINGS FREE ACCESS

In baseball, every play on the field is quantitatively evaluated and has an effect on individual and team strategies. The weighted on-base average (wOBA) is well known as a measure of a batter's hitting contribution. However, this measure ignores the game situation, such as the runners on base, which coaches and batters are known to consider when employing multiple hitting strategies. Yet, the effectiveness of these strategies is unknown. This is probably because (1) we cannot obtain the batter's strategy and (2) it is difficult to estimate the effect of the strategies. Here, we propose a new method for estimating the effect using counterfactual batting simulation. To this end, we propose a deep learning model that transforms batting ability when the batting strategy is changed. This method can estimate the effects of various strategies, which has traditionally been difficult with actual game data. We found that, when the switching cost of batting strategies can be ignored, the use of different strategies increased runs. When the switching cost is considered, the conditions for increasing runs are limited. Our validation results suggest that our simulation could clarify the effect of using multiple batting strategies.

Evaluation of soccer players to create scoring opportunities for teammates based on their trajectory prediction

Masakiyo TERANISHI, Kazushi TSUTSUI, Kazuya TAKEDA, Keisuke FUJII

Session ID: 3G4-OS-15b-05
Published: 2022
Released on J-STAGE: July 11, 2022

DOI: https://doi.org/10.11517/pjsai.JSAI2022.0_3G4OS15b05

CONFERENCE PROCEEDINGS FREE ACCESS

Soccer is a game in which many players and the ball interact in complex ways. In the quantitative evaluation of soccer attackers, there have been many studies on players with the ball but few on players without it. However, it is still difficult to evaluate an attacking player without the ball and intention to receive it, and to reveal how movement contributes to the creation of scoring opportunities compared to typical (or predicted) movements. In this paper, we evaluate players who create off-ball scoring opportunities by comparing the reference movements generated by trajectory prediction with actual movements. In the proposed method, first, the trajectory is predicted using a graph variational recurrent neural network that can accurately model the relationship between players and predict the long-term trajectory. Next, based on the difference in the existing off-ball evaluation index between the actual data and the predicted trajectory, we evaluate how the actual movement contributes to scoring opportunities compared to the predicted movement as a reference. In the verification, we show that the evaluation by the proposed method is intuitive, using its relationship with the scores of all 18 teams in the Japanese professional soccer league and an example from one game.

Estimating the quality of group discussions based on ensemble weakly supervised learning

Gai SUZUKI, Shougo OKADA, Hironoki KOU, Yukiko I. NAKANO

Session ID: 3H3-OS-12a-01
Published: 2022
Released on J-STAGE: July 11, 2022

DOI: https://doi.org/10.11517/pjsai.JSAI2022.0_3H3OS12a01

CONFERENCE PROCEEDINGS FREE ACCESS

In this paper, we propose a method to improve the accuracy of a model for estimating the quality of group performance using multi-modal features. We use the group meeting corpus MATRICS, which contains features of prosody, facial expression, language, and speech turn observed in a total of 56 group meetings. To address the problem that not all features of all frames and modalities in the time series data are effective for estimating the labels, we propose the N-teaching model, a more robust extension of the weakly supervised co-teaching model for noisy labels. We also analyze the samples that were excluded from training as noise, and compare our results with those of previous studies. We obtained the best result, an MAE of 0.309, for the Originality (novelty) index of the discussion content.

A Multiparty Model for Estimating Persuasiveness in Group Discussions

Atsushi ITO, Tatsuya SAKATO, Yukiko NAKANO, Fumio NIHEI, Ryo ISHII, At ...

Session ID: 3H3-OS-12a-02
Published: 2022
Released on J-STAGE: July 11, 2022

DOI: https://doi.org/10.11517/pjsai.JSAI2022.0_3H3OS12a02

CONFERENCE PROCEEDINGS FREE ACCESS

Persuasiveness is an important communication skill in communicating with others. This study aims to estimate the persuasiveness of the participants in group discussions. First, human annotators rated the level of persuasiveness of each of four participants in group discussions. Next, GRU-based neural networks were used to create speech, verbal, and visual (head pose) encoders. The output from each encoder was combined to create a multimodal and multiparty model to estimate the persuasiveness of each participant. The experiment results showed that multimodal and multiparty models are better than unimodal and single-person models. The best performing multimodal multiparty model achieved 80% accuracy in predicting high/low persuasiveness, and 77% accuracy in predicting the most persuasive participant in the group.

Construction and evaluation of a human-in-the-loop video annotation system for nodding during meetings

Kosuke TOKUHARA, Ko WATANABE, Shoya ISHIMARU, Yutaka ARAKAWA

Session ID: 3H3-OS-12a-03
Published: 2022
Released on J-STAGE: July 11, 2022

DOI: https://doi.org/10.11517/pjsai.JSAI2022.0_3H3OS12a03

CONFERENCE PROCEEDINGS FREE ACCESS

Annotation is one of the most time-consuming tasks for humans. In our previous work, we built a nodding recognition model using video data annotated by a human annotator. We found that the annotation time and the workload required of the annotator were huge. In this work, we propose a human-in-the-loop video annotation system in which a machine learning model creates an annotation result before the annotator finalizes it. With this system, we evaluated how much the average time spent on annotation is reduced.

Dialogue act classification using two multi-party discussion corpora

Shunsuke YONEMITSU, Kazutaka SHIMADA

Session ID: 3H3-OS-12a-04
Published: 2022
Released on J-STAGE: July 11, 2022

DOIhttps://doi.org/10.11517/pjsai.JSAI2022.0_3H3OS12a04

CONFERENCE PROCEEDINGS FREE ACCESS

Dialogue act classification is an important task for summarizing and analyzing discussions. This paper first annotates a Japanese multi-party discussion corpus with dialogue act tags. The tag set is based on an existing multi-party conversation corpus. Then, we propose a multi-dataset learning model for dialogue act classification. In this method, the model is trained on two corpora at the same time. As another approach, we train a model on a dataset that combines the two corpora, which is possible because the two corpora use the same tag set. We compare this model with multi-dataset learning. The experimental results show the importance of corpus size for the task.

Explainable Models for Predicting Interlocuters’ Subjective Impressions based on Nonverbal Functional Features

When and What Kind of Behaviors Affected Interlocuters’ Impressions?

Shumpei OTSUCHI, Kazuki MIYOSHI, Yoko ISHII, Ryo ISHII, Shin-ichiro EI ...

Session ID: 3H3-OS-12a-05
Published: 2022
Released on J-STAGE: July 11, 2022

DOIhttps://doi.org/10.11517/pjsai.JSAI2022.0_3H3OS12a05

CONFERENCE PROCEEDINGS FREE ACCESS

An explainable framework is proposed to predict interlocutors' subjective impressions in group meetings. The goal is to explain when and what kind of nonverbal behaviors affected interlocutors' impressions during meetings. To that end, we formulate a two-stage framework consisting of regression models of interlocutors' impression scores based on functional head-movement features, followed by estimation of the temporal distribution of SHAP-based feature contributions, which is obtained by kernel density estimation of the temporal occurrence probabilities of head-movement functions. The former stage identifies the behaviors related to the impressions, and the latter stage suggests the timing of the behaviors by locating the maximum point of the temporal feature contribution curve, based on the assumption that temporary intensive behaviors lead to the formation of a strong impression. This report shows preliminary results and analyses for discussions by 17 four-party groups.

(OS invited talk) Design research of ”Shared Dining" as the convivial factoly in an aged society

Nahoko KUSAKA

Session ID: 3H4-OS-12b-01
Published: 2022
Released on J-STAGE: July 11, 2022

DOIhttps://doi.org/10.11517/pjsai.JSAI2022.0_3H4OS12b01

CONFERENCE PROCEEDINGS FREE ACCESS

In recent years, the social problems of isolated elderly people have become apparent due to growth in the elderly population. A new system is necessary to foster both symbiosis and independence for a safer society. The purpose of this study is to examine the initial development of a new communication system named “Shared Dining System” for dyadic interaction through cooking and eating activities. To solve the problem of isolation among older people, this project will develop the system through creative collaboration with all people involved.

Constructing an evaluation space for smile intensity using reference images

Kei SHIMONISHI, Kasuaki KONDO, Junyao ZHANG, Yuichi NAKAMURA

Session ID: 3H4-OS-12b-02
Published: 2022
Released on J-STAGE: July 11, 2022

DOIhttps://doi.org/10.11517/pjsai.JSAI2022.0_3H4OS12b02

CONFERENCE PROCEEDINGS FREE ACCESS

Traditional frameworks for smile detection mainly focus on identifying whether a smile is expressed or not. In contrast, the objective of our research is to determine the degree to which a smile is expressed. The previously developed method for evaluating smiles on ordinal scales is time-consuming because it requires a large number of comparisons. To overcome this limitation, we propose a method using reference images chosen from an evaluation space of smiling. Once the evaluation space is constructed from a large dataset of smiles, reference images can be chosen based on the consistency of their evaluation values with those of other images. With these reference images, we can evaluate new face images in a short time. We show an example of reference images and of face images evaluated using them.

Construction of a mountain visitor management system using Open-CV AI cameras

Yutaro TAKEUCHI, Norihisa SEGAWA

Session ID: 3H4-OS-12b-03
Published: 2022
Released on J-STAGE: July 11, 2022

DOIhttps://doi.org/10.11517/pjsai.JSAI2022.0_3H4OS12b03

CONFERENCE PROCEEDINGS FREE ACCESS

Kyoto Sangyo University

In recent years, there have been problems in Japan’s satoyama areas, such as matsutake mushroom thieves entering privately owned mountains without permission. While it is possible to build enclosures to keep intruders out of ordinary private land, it is not realistic to enclose privately owned mountains. In this study, we propose installing a very small OpenCV AI camera that can nevertheless identify individuals in a concealed location in the mountains, such as along a well-traveled animal trail, in order to detect suspicious persons. In this paper, we propose and show the implementation of a prototype mountaineering management system that uses OpenCV AI cameras to easily detect suspicious persons in the mountains. In addition, the system will be evaluated to demonstrate its effectiveness.

The change of participants’ perspectives on group discussion at an early stage

Mika NAKANO, Masaki SHUZO, Motoki SAKAI, Masahide YUASA

Session ID: 3H4-OS-12b-04
Published: 2022
Released on J-STAGE: July 11, 2022

DOIhttps://doi.org/10.11517/pjsai.JSAI2022.0_3H4OS12b04

CONFERENCE PROCEEDINGS FREE ACCESS

Although group discussion (GD) skills are essential for human interactions, few studies have investigated proper interventions for novice students. Participants’ perceptions of what happens in GD can be an important cue for grasping what they pay attention to. This paper analyzes the change in participants’ perspectives within the three domains of GD based on previous research: (1) Individual, (2) Group, and (3) Knowledge. Ten GD sessions were carried out with four or five participants from the same university. The participants filled out a questionnaire that included free-writing descriptions, and their responses were analyzed for content. A comparison between the first and last five sessions showed that “Listening skill” within the (1) Individual domain decreased while “Summary skill” in the (2) Group domain increased in the latter sessions. These results show that the participants changed their perspectives from individualistic to group-interactive as they accumulated experience, suggesting the benefits of GD.

Analysis of Communication Skills by Repeated Reflections on Group Discussions for First- and Second-Year University Students

Masahide YUASA

Session ID: 3H4-OS-12b-05
Published: 2022
Released on J-STAGE: July 11, 2022

DOIhttps://doi.org/10.11517/pjsai.JSAI2022.0_3H4OS12b05

CONFERENCE PROCEEDINGS FREE ACCESS

Reflection by participants after a discussion is expected to enhance their discussion skills. In our previous work, we proposed improving novice students' discussion skills using repeated discussion-and-reflection sequences. In this paper, we investigated how the students improved their discussion skills through long-term experiments. Participants were four novice students who were not familiar with each other at the first session. Experimental discussions using an online video meeting system were repeated ten times over three months in 2021. The discussions were conducted with the same students. Our analysis showed that students tried to increase the number of utterances and to keep the conversation going by observing others' behaviors. Our study will provide an effective method to improve student discussion skills and support the development of a system that advises students on their behaviors in discussion.

Difficulties in continuous physiological sensing in-the-wild

Yutaka ARAKAWA, Yugo NAKAMURA, Yuki MATSUDA

Session ID: 3I3-OS-5a-01
Published: 2022
Released on J-STAGE: July 11, 2022

DOIhttps://doi.org/10.11517/pjsai.JSAI2022.0_3I3OS5a01

CONFERENCE PROCEEDINGS FREE ACCESS

To realize ubiquitous services based on physiological sensing, it is important to assume that commercially available sensors will be used in daily life. Collecting physiological data from ordinary people for a long period of time in-the-wild is completely different from performing highly accurate physiological sensing using dedicated sensors in a controlled laboratory environment. In this presentation, we will share the various problems we have experienced and clarify the points that need to be considered in order to achieve stable sensing in a real environment.

A Study of Automatic Selection Algorithm for Optimal Attachment Position of Patch Type Wireless R-R Interval Telemeter

Aoi NOGUCHI, Tomoyuki TAKANO, Toshitaka YAMAKAWA

Session ID: 3I3-OS-5a-02
Published: 2022
Released on J-STAGE: July 11, 2022

DOIhttps://doi.org/10.11517/pjsai.JSAI2022.0_3I3OS5a02

CONFERENCE PROCEEDINGS FREE ACCESS

Heart rate variability (HRV) is an indicator of changes in the interval between consecutive R waves (R-R interval; RRI) on the electrocardiogram (ECG) caused by autonomic nervous system activity. Measurement of the RRI is useful for detecting diseases related to autonomic nervous system activity and for predicting seizures. This study aimed to improve a heart rate measurement system that combines a highly accurate, compact, and inexpensive patch-type R-R interval telemeter and a smartphone application that automatically selects a suitable measurement position for non-experts. To evaluate the measurement accuracy, RRIs of 10 healthy male and 10 healthy female subjects in four postures (supine, sitting, standing, and walking at 3 km/h) were measured simultaneously using the system and a reference ECG measurement system, and the results were compared. The measurement accuracy of the system was assessed using the R-wave detection rate and Bland-Altman analysis. The results showed that the measurement accuracy was sufficient for HRV analysis.
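
For readers unfamiliar with Bland-Altman analysis, the following is a minimal sketch of the standard calculation (bias and 95% limits of agreement of paired differences). It is not the authors' code: the function name `bland_altman` and the sample RRI values are hypothetical and chosen only for illustration.

```python
from statistics import mean, stdev


def bland_altman(test_ms, reference_ms):
    """Return (bias, lower_loa, upper_loa) for paired measurements.

    bias is the mean of the paired differences; the limits of agreement
    are bias +/- 1.96 times the sample standard deviation of the differences.
    """
    if len(test_ms) != len(reference_ms) or len(test_ms) < 2:
        raise ValueError("need two equally long series with at least 2 pairs")
    diffs = [t - r for t, r in zip(test_ms, reference_ms)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation (n - 1 in the denominator)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical RRI values in milliseconds (illustrative only, not study data).
telemeter = [802.0, 810.0, 795.0, 820.0, 805.0]
reference = [800.0, 812.0, 794.0, 818.0, 806.0]

bias, lower, upper = bland_altman(telemeter, reference)
print(f"bias = {bias:.2f} ms, limits of agreement = ({lower:.2f}, {upper:.2f}) ms")
```

A small bias with narrow limits of agreement indicates that the two measurement systems can be used interchangeably for the quantity in question; what counts as "narrow enough" depends on the application.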

Development of a drowsy driving detection method based on self-attention autoencoder using RR interval data

Kentaro HORI, Hiroki IWAMOTO, Koichi FUJIWARA, Manabu KANO

Session ID: 3I3-OS-5a-03
Published: 2022
Released on J-STAGE: July 11, 2022

DOIhttps://doi.org/10.11517/pjsai.JSAI2022.0_3I3OS5a03

CONFERENCE PROCEEDINGS FREE ACCESS

Drowsy driving is a problem that needs to be solved because it can lead to serious traffic accidents. Heart rate variability (HRV), which is the fluctuation of the RR interval (RRI) in the electrocardiogram, is expected to be practical input data for drowsy driving detection since it can be measured easily using wearable devices. In this study, a new driver drowsiness detection method was proposed that uses the raw RRI time series as input instead of extracted HRV features. The proposed method is an anomaly detection method based on an autoencoder and self-attention. In an experiment using a driving simulator, the proposed method achieved a true positive rate of 0.80 and a false positive rate of 0.12, which were superior to those of methods using HRV features as inputs. This result suggests that raw RRI time series may be more suitable as inputs than HRV features.
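
As a reminder of how the two reported metrics are defined, here is a small sketch that computes the true positive rate and false positive rate from binary labels. The function name `tpr_fpr` and the label vectors are hypothetical; they do not reproduce the study's data or its figures.

```python
def tpr_fpr(y_true, y_pred):
    """Compute (true positive rate, false positive rate) for binary labels.

    Labels are 1 for the positive class (here: drowsy) and 0 otherwise.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    # Guard against empty classes so the function never divides by zero.
    tpr = tp / (tp + fn) if (tp + fn) else float("nan")
    fpr = fp / (fp + tn) if (fp + tn) else float("nan")
    return tpr, fpr


# Hypothetical ground truth and detector output (illustrative only).
y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]

tpr, fpr = tpr_fpr(y_true, y_pred)
print(f"TPR = {tpr:.2f}, FPR = {fpr:.2f}")
```

For an anomaly detector, both rates move together as the detection threshold changes, so they are usually reported as a pair at a chosen operating point.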

Development of severality diagnosis model of atrial fibrillation using XGBoost from electrocardiogram

Tetsuma KAWAJI, Shoya NAGATA, Koichi FUJIWARA

Session ID: 3I3-OS-5a-04
Published: 2022
Released on J-STAGE: July 11, 2022

DOIhttps://doi.org/10.11517/pjsai.JSAI2022.0_3I3OS5a04

CONFERENCE PROCEEDINGS FREE ACCESS

Atrial fibrillation (AF) is a type of arrhythmia in which the atria fail to function adequately. Frequent AF may lead to the formation of blood clots in the atria, which may lead to cerebral or myocardial infarction. AF is usually diagnosed by a cardiologist through a visual check of electrocardiogram (ECG) data measured with a Holter electrocardiograph over a 24-hour period, which is burdensome and time-consuming. In this study, we develop a model that automatically diagnoses the severity of AF from ECG data by using machine learning technologies. The ECG data of 75 patients with suspected AF were collected from the Mitsubishi Kyoto Hospital. 30-beat RRI segments were clipped from the collected ECG data, and heart rate variability (HRV) data were extracted from the clipped RRI data. We trained an XGBoost model that can distinguish normal, mild, and severe AF using the extracted HRV data. The overall accuracy of the trained model was 86.2%; the rate at which healthy or mild cases were misdiagnosed as severe was 1.53%, and the rate at which severe cases were misdiagnosed as healthy was 11.4%. The performance was high enough for clinical practice. The severity of AF will be easily and rapidly diagnosed with the developed model in the future.

Research on Self-Evaluation Scale and Emotional Response to Visual Stimuli

Kazuki TSURUMAKI, Koki HAYAFUNE, Kazuaki OHMORI, CHIE HIEIDA, Takayuki ...

Session ID: 3I3-OS-5a-05
Published: 2022
Released on J-STAGE: July 11, 2022

DOIhttps://doi.org/10.11517/pjsai.JSAI2022.0_3I3OS5a05

CONFERENCE PROCEEDINGS FREE ACCESS

In the analysis of emotional responses, it is important to consider the characteristics and states of the subjects. We might be able to gain new results by conducting an analysis that takes into account individual states and characteristics, rather than the emotional response alone. In this study, we investigate the correspondence between emotions and individual states in order to analyze emotional responses in consideration of individual characteristics. We performed a correlation analysis between emotional responses to visual stimuli and self-assessment scales reflecting individual characteristics. As a result, a correlation was found between EDA and the self-assessment scale. The results suggest that the acuity of the emotional response to a stimulus affects interoceptive awareness and the directivity of emotional feeling.

Long-term prognostic classification of West syndrome based on scalp EEG using phase-amplitude coupling

Tatsuki SAITO, Koichi FUJIWARA, Jun NATSUME, Ryosuke SUZUI

Session ID: 3I4-OS-5b-01
Published: 2022
Released on J-STAGE: July 11, 2022

DOIhttps://doi.org/10.11517/pjsai.JSAI2022.0_3I4OS5b01

CONFERENCE PROCEEDINGS FREE ACCESS

West syndrome (WS), an infantile epileptic encephalopathy defined on the basis of epileptic spasms and hypsarrhythmia on the electroencephalogram (EEG), is recognised to have a very poor long-term prognosis in terms of spasm control, freedom from other seizure types, and developmental arrest. WS is an important clinical problem for patients and their families because of its poor developmental prognosis; however, the pathophysiology of WS has not been fully understood despite extensive work by many investigators. Accurate biomarkers of WS for evaluating the effect and prognosis of treatment are needed. To predict the long-term prognosis of WS after treatment, we used two deep learning models that take as input EEG in which high-frequency oscillations (HFO) appear. The highest micro-average accuracy was 78%, and a macro-average accuracy of 64%, computed per subject, was obtained.
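
The gap between the two averages is easier to read with the definitions in hand. The sketch below assumes the common convention that the micro-average pools all samples before computing accuracy, while the macro-average is the mean of per-subject accuracies; the abstract does not spell out its exact definitions, so this is an assumption. The function name `micro_macro_accuracy` and the data are hypothetical.

```python
def micro_macro_accuracy(results):
    """Return (micro, macro) accuracy.

    results maps a subject ID to a list of (true_label, predicted_label) pairs.
    micro: correct predictions pooled over all samples / total samples.
    macro: unweighted mean of each subject's own accuracy.
    """
    total_correct = 0
    total = 0
    per_subject = []
    for pairs in results.values():
        correct = sum(1 for true, pred in pairs if true == pred)
        total_correct += correct
        total += len(pairs)
        per_subject.append(correct / len(pairs))
    return total_correct / total, sum(per_subject) / len(per_subject)


# Hypothetical predictions (illustrative only): subject "A" contributes
# 8 samples with 7 correct, subject "B" contributes 2 samples with 1 correct.
results = {
    "A": [(1, 1)] * 7 + [(1, 0)],
    "B": [(0, 0), (0, 1)],
}

micro, macro = micro_macro_accuracy(results)
print(f"micro = {micro:.4f}, macro = {macro:.4f}")
```

When subjects contribute unequal numbers of samples, a subject with many well-classified samples raises the micro-average more than the macro-average, which is one way the two figures can diverge.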

Resting-state brain activity predicts neurofeedback training aptitude

Takashi NAKANO, Masahiro TAKAMURA, Haruki NISHIMURA, Maro MACHIZAWA, N ...

Session ID: 3I4-OS-5b-02
Published: 2022
Released on J-STAGE: July 11, 2022

DOIhttps://doi.org/10.11517/pjsai.JSAI2022.0_3I4OS5b02

CONFERENCE PROCEEDINGS FREE ACCESS

Neurofeedback (NF) training has been developed as a promising novel treatment of brain psychiatric disorders. However, NF aptitude, an individual's ability to change brain activity through NF training, has been reported to vary significantly among individuals. In the present study, we applied machine learning to resting-state functional magnetic resonance imaging (fMRI) data to predict NF aptitude. We trained multiple regression models to predict individual NF aptitude scores from resting-state functional brain connectivity (FC) data. As a result, we identified six resting-state FCs that predicted NF aptitude and succeeded in predicting NF aptitude. The identified FC model revealed that the posterior cingulate cortex and posterior insular cortex were the functional hubs and formed predictive resting-state FCs, suggesting that NF aptitude may be involved in the attentional mode-orientation modulation system's characteristics in task-free resting-state brain activity.

Phasing of epileptic seizure spreading and extending using hidden Markov model

Shuji KOMEIJI, Toshiki ORIHARA, Takumi MITSUHASHI, Hidenori SUGANO, To ...

Session ID: 3I4-OS-5b-03
Published: 2022
Released on J-STAGE: July 11, 2022

DOIhttps://doi.org/10.11517/pjsai.JSAI2022.0_3I4OS5b03

CONFERENCE PROCEEDINGS FREE ACCESS

This paper discusses the phasing of epileptic seizure spreading and extending using the hidden Markov model (HMM). The intracranial electroencephalogram during an epileptic seizure is thought to have several phases with time transitions. In this paper, we identified clinically interpretable phases in all 30 cases of epileptic seizures by unsupervised learning of an HMM. Data-driven discovery of phases may contribute to understanding the mechanisms of epileptic seizures from onset to settling and lead to new treatments.

Local Search with Multi start for Hyperparameter Optimization of Deep Learning

Shintaro TAKENAGA, Masaki ONISHI

Session ID: 3J3-OS-3a-01
Published: 2022
Released on J-STAGE: July 11, 2022

DOIhttps://doi.org/10.11517/pjsai.JSAI2022.0_3J3OS3a01

CONFERENCE PROCEEDINGS FREE ACCESS

In deep learning, hyperparameters can severely affect the performance of the learning model. Hyperparameter optimization (HPO) is one of the promising techniques for maximizing that performance. It has been reported that the Nelder-Mead method shows superior performance to other optimization methods in the HPO of deep learning. However, the Nelder-Mead method may converge to poor local minima because it is a local search heuristic using a simplex. This problem may be tackled with a multi-start strategy that restarts the search from different initial values. In this paper, we investigate the effectiveness of multi-start in several HPO problems of deep learning. The results show that the search performance of the Nelder-Mead method is improved by applying multi-start.
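
The idea of multi-start local search can be sketched as follows. This is not the authors' implementation: it is a simplified Nelder-Mead variant (reflection, expansion, inside contraction, and shrink only) wrapped in a loop over random initial points, applied to a hypothetical two-well test function rather than to a real hyperparameter objective. All names (`nelder_mead`, `multi_start`, `double_well`) and parameter values are assumptions made for illustration.

```python
import math
import random


def nelder_mead(f, x0, step=0.5, max_iter=500, tol=1e-8):
    """Simplified Nelder-Mead simplex search; returns (best_x, best_value)."""
    n = len(x0)
    # Initial simplex: x0 plus one point offset along each coordinate axis.
    simplex = [list(x0)] + [
        [x0[j] + (step if j == i else 0.0) for j in range(n)] for i in range(n)
    ]
    for _ in range(max_iter):
        simplex.sort(key=f)
        best, worst = simplex[0], simplex[-1]
        if abs(f(worst) - f(best)) < tol:
            break
        # Centroid of all vertices except the worst one.
        centroid = [sum(p[j] for p in simplex[:-1]) / n for j in range(n)]

        def along(t):
            # Point on the line through the centroid and the worst vertex.
            return [centroid[j] + t * (worst[j] - centroid[j]) for j in range(n)]

        reflected = along(-1.0)
        if f(reflected) < f(best):
            expanded = along(-2.0)
            simplex[-1] = expanded if f(expanded) < f(reflected) else reflected
        elif f(reflected) < f(simplex[-2]):
            simplex[-1] = reflected
        else:
            contracted = along(0.5)
            if f(contracted) < f(worst):
                simplex[-1] = contracted
            else:
                # Shrink every vertex halfway toward the best vertex.
                simplex = [best] + [
                    [best[j] + 0.5 * (p[j] - best[j]) for j in range(n)]
                    for p in simplex[1:]
                ]
    simplex.sort(key=f)
    return simplex[0], f(simplex[0])


def multi_start(f, bounds, n_starts=10, seed=0):
    """Run nelder_mead from random points inside bounds; keep the best result."""
    rng = random.Random(seed)
    best_x, best_val = None, math.inf
    for _ in range(n_starts):
        x0 = [rng.uniform(lo, hi) for lo, hi in bounds]
        x, val = nelder_mead(f, x0)
        if val < best_val:
            best_x, best_val = x, val
    return best_x, best_val


def double_well(x):
    # Hypothetical objective with two local minima near x[0] = -1 and x[0] = +1.
    return (x[0] ** 2 - 1.0) ** 2 + 0.3 * x[0] + x[1] ** 2


single_x, single_val = nelder_mead(double_well, [1.5, 0.5])
multi_x, multi_val = multi_start(double_well, [(-2.0, 2.0), (-2.0, 2.0)])
print("single start:", single_x, single_val)
print("multi-start: ", multi_x, multi_val)
```

A single run can only report the minimum of the basin its simplex ends up in, whereas the multi-start wrapper keeps the best value over all runs, at the cost of proportionally more function evaluations. In HPO each evaluation is a training run, so the number of restarts is itself a budget decision.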

Hyperparameter Optimization by Multi-objective Bayesian Optimization based on Inference of User Preference

Ryota OZAKI, Yusuke TAKAGI, Masayuki KARASUYAMA, Ichiro TAKEUCHI

Session ID: 3J3-OS-3a-02
Published: 2022
Released on J-STAGE: July 11, 2022

DOIhttps://doi.org/10.11517/pjsai.JSAI2022.0_3J3OS3a02

CONFERENCE PROCEEDINGS FREE ACCESS

AutoML considers hyper-parameter optimization (HPO) of machine learning models. However, there often exist multiple evaluation indices for the learned models. For example, both model accuracy and memory size can be objective functions, which are typically in a trade-off relation. In this case, the importance of each objective function depends on the user preference. To incorporate the preference adaptively into HPO, we propose a preference-learning-based multi-objective Bayesian optimization (PL-MBO) method. Since directly specifying the exact preference can be difficult for the user, PL-MBO only queries a ‘relative preference’, which the user can give much more easily. By combining a Bayesian user preference model and the standard Gaussian process model of objective functions, the expected improvement criterion of the user preference is derived. Our numerical experiments show that the optimal solution based on the user preference can be found efficiently in HPO for neural networks.

Exploring optimized semi-supervised learning using knowledge transfer graphs

Yoshitaka MURAMOTO, Tsubasa HIRAKAWA, Takayoshi YAMASHITA, Hironobu FU ...

Session ID: 3J3-OS-3a-03
Published: 2022
Released on J-STAGE: July 11, 2022

DOIhttps://doi.org/10.11517/pjsai.JSAI2022.0_3J3OS3a03

CONFERENCE PROCEEDINGS FREE ACCESS

Π-model is a consistency-based, semi-supervised learning (SSL) method that can be derived from other conventional methods by devising main components such as data augmentations and models. Also, FixMatch combines conventional data augmentation methods with pseudo-labeling to achieve higher accuracy. The structures of these SSL methods were designed by humans and may not be the best learning methods. In this paper, we aim to explore a new SSL method that encompasses the conventional methods. We introduce consistency loss, pseudo-labeling, and other main components of conventional methods into a knowledge transfer graph that contains mutual learning, and explore the graph structure to obtain a new SSL method from various SSL methods. From search and evaluation experiments using various datasets such as CIFAR-100, we confirmed that our method is more accurate than the conventional SSL methods.

An Impact of Weight Initialization on Model Evaluations in Neural Architecture Search

Nozomu YOSHINARI, Shinichi SHIRAKAWA

Session ID: 3J3-OS-3a-04
Published: 2022
Released on J-STAGE: July 11, 2022

DOIhttps://doi.org/10.11517/pjsai.JSAI2022.0_3J3OS3a04

CONFERENCE PROCEEDINGS FREE ACCESS

Architecture is one key factor determining the performance of neural networks, and neural architecture search, which aims at finding competent architectures without human effort, is one of the most intensive research areas of automated machine learning. While most papers in the area have focused only on architecture, recent research shows that the performance of an architecture depends on other hyperparameters such as the learning rate, and that simultaneous optimization of them is needed to obtain a better model. This research focuses on weight initialization methods and investigates their impact on the performance of architectures after training. Through experiments on the architectures defined in NAS-Bench-201, we found that an initialization method that takes the architecture into account significantly improved the performance of many models.

Model Size Constrained Optimization of DARTS in Neural Architecture Search

Kazuki HENMI, Masaki ONISHI

Session ID: 3J3-OS-3a-05
Published: 2022
Released on J-STAGE: July 11, 2022

DOIhttps://doi.org/10.11517/pjsai.JSAI2022.0_3J3OS3a05

CONFERENCE PROCEEDINGS FREE ACCESS

Deep learning, a machine learning method, has been applied in a variety of fields such as natural language processing and image recognition due to its high performance. AutoML, a method to automate machine learning, has been widely studied, and Neural Architecture Search (NAS), a method to automatically optimize neural network models according to data and objectives, plays a very important role. NAS can find a very accurate model using DARTS, which performs the search with a gradient method. Generally, DARTS only optimizes accuracy, which improves recognition accuracy but also increases the amount of memory needed for the model. However, the amount of memory available is limited when deep learning is used on mobile devices and embedded systems. In this paper, we propose a method that searches for a network model while considering both accuracy and model size by adding constraints to DARTS. As a result, the proposed method enables us to search for network models with high accuracy under the given constraints.

Stopping criterion for Neural Architecture Search

Kotaro Sakamoto SAKAMOTO, Hideaki ISHIBASHI, Rei SATO, Shinichi SHIRAK ...

Session ID: 3J4-OS-3b-01
Published: 2022
Released on J-STAGE: July 11, 2022

DOIhttps://doi.org/10.11517/pjsai.JSAI2022.0_3J4OS3b01

CONFERENCE PROCEEDINGS FREE ACCESS

Neural architecture search (NAS) is a framework for automating the design process of a neural network structure. While the recent one-shot approaches have reduced the search cost, there still exists an inherent trade-off between cost and performance. It is important to appropriately stop the search and further minimise the high cost of NAS. On the other hand, heuristic early-stopping strategies have been proposed to overcome the well-known performance degradation of the one-shot approach, particularly differentiable architecture search (DARTS). In this paper, we propose a more versatile and principled early-stopping criterion on the basis of the evaluation of a gap between expectation values of generalisation errors of the previous and current search steps with respect to the architecture parameters. The stopping threshold is automatically determined at each search epoch without cost. In numerical experiments, we demonstrate the effectiveness of the proposed method. We stop the one-shot NAS algorithms such as ASNG-NAS and DARTS and evaluate the acquired architectures on the benchmark datasets: NAS-Bench-201 and NATS-Bench. Our algorithm has been shown to reduce the cost of the search process while maintaining a high performance.

Efficient Search of Multiple Architectures in Structure Complexity Aware Neural Architecture Search

Yuhei NODA, Shota SAITO, Shinichi SHIRAKAWA

Session ID: 3J4-OS-3b-02
Published: 2022
Released on J-STAGE: July 11, 2022

DOIhttps://doi.org/10.11517/pjsai.JSAI2022.0_3J4OS3b02

CONFERENCE PROCEEDINGS FREE ACCESS

In neural architecture search (NAS) that searches the architectures of deep neural networks, methods for taking into account not only the prediction performance but also metrics related to the architecture complexity have been developed. This study aims to speed up the NAS method for optimizing the objective function defined as a weighted sum of two metrics such as the performance and number of parameters. The proposed method is based on one-shot NAS and optimizes the weight parameters in a super network only once. Then, we define multiple distributions for generating architectures with different complexity and update multiple distributions by utilizing samples from these mixture distributions based on importance sampling. In this way, we can obtain multiple architectures with different complexity in a single search and reduce the search cost. We apply the proposed method to the architecture search of convolutional neural networks and show that multiple architectures with different complexity can be obtained with less computational cost than the existing methods.

Model Reduction Effect of NAS during Finetuning of ViT

Xinyu ZHANG, Sora TAKASHIMA, Rio YOKOTA

Session ID: 3J4-OS-3b-03
Published: 2022
Released on J-STAGE: July 11, 2022

DOIhttps://doi.org/10.11517/pjsai.JSAI2022.0_3J4OS3b03

CONFERENCE PROCEEDINGS FREE ACCESS

In image recognition, Vision Transformers (ViT) have achieved state-of-the-art performance in image classification on ImageNet. However, the models are becoming so large that they cannot even fit on a single GPU, which limits their usefulness at inference time. In order to reduce the size of such large vision transformer models, we utilize the AutoFormer proposed by Chen et al. In the original work on AutoFormer, the supernet is trained from scratch. In this work, we propose a method that trains the supernet of AutoFormer from a pre-trained vision transformer, followed by an architecture search during fine-tuning. We find that, for the same number of parameters, the classification accuracy is superior to that of models trained from scratch.

Exploring Token-Mixing Structure for Transformer

Takuya ASAKURA, Kuniaki UTO, Koichi SHINODA

Session ID: 3J4-OS-3b-04
Published: 2022
Released on J-STAGE: July 11, 2022

DOIhttps://doi.org/10.11517/pjsai.JSAI2022.0_3J4OS3b04

CONFERENCE PROCEEDINGS FREE ACCESS

The Transformer model, which applies Channel-Mixing and Token-Mixing alternately to input data, has been developed for time-series data such as text and speech. Recent studies have shown that this model can also perform well on images. Various improved Transformer models have been proposed for image processing, many of which improve the structure of the fully connected layer, especially for Token-Mixing. However, these structures must be designed manually, which requires advanced knowledge about the characteristics of the target data. In this paper, we propose a method to automatically acquire Token-Mixing structures by learning the relationships between Tokens. In our experiments on image classification tasks, the structure obtained by the proposed method achieves higher accuracy with fewer parameters than the other Token-Mixing methods. We also visualized the Token-Mixing structures obtained by the proposed method and observed that the proposed method tends to focus on spatially close Tokens.

Neural Architecture Search for Transformers on Vision and Language Tasks

Masanori SUGANUMA

Session ID: 3J4-OS-3b-05
Published: 2022
Released on J-STAGE: July 11, 2022

DOIhttps://doi.org/10.11517/pjsai.JSAI2022.0_3J4OS3b05

CONFERENCE PROCEEDINGS FREE ACCESS

Since the Transformer was first proposed, it has shown remarkable performance in a wide range of fields such as image recognition, natural language processing, and tasks that fuse the two. In general, the structure of a deep neural network has a significant impact on its performance, and the Transformer is no exception. However, the structure of the Transformer has not been explored sufficiently due to the high training cost, and thus its potential has not been fully exploited. In this paper, we first design a search space that can represent various Transformer architectures. We then propose a search method that can efficiently search the architectures in the search space. We evaluate our method on several vision and language tasks and show experimentally that the Transformers found by the search outperform the vanilla Transformers. Moreover, by analyzing the architectures obtained by the search, we identify which architecture components are important for the Transformer's performance.
