-
Kensuke MIYAMOTO, Norifumi WATANABE, Yoshiyasu TAKEFUJI
Session ID: 2J1-GS-8a-02
Published: 2021
Released on J-STAGE: June 14, 2021
CONFERENCE PROCEEDINGS
FREE ACCESS
In human cooperative behavior, there are multiple strategies: a passive behavioral strategy based on others' behaviors and an active, objective-first behavioral strategy. However, it is not clear how a meta-strategy for switching between these strategies is acquired. In this study, we conduct a collision avoidance experiment in a grid-like corridor with agents that take multiple strategies, to see whether subjects' behavior changes when the agent's strategy changes. We compare the behavior selected by the subjects with the behavior of agents trained by reinforcement learning. The experimental results show that subjects can read a change in strategy from the behavior of the oncoming agent.
-
Arisa UEDA, Magassouba ALY, Tubasa HIRAKAWA, Takayoshi YAMASHITA, Hiro ...
Session ID: 2J1-GS-8a-03
Published: 2021
Released on J-STAGE: June 14, 2021
CONFERENCE PROCEEDINGS
FREE ACCESS
Placing everyday objects in designated areas, such as placing a glass on a table, is a crucial task for domestic service robots (DSRs). In this paper, we propose a physical reasoning method about collisions in placement tasks. The proposed method, Transformer PonNet, predicts the probability of a possible collision and visualizes the areas involved in the collision. Unlike existing methods, Transformer PonNet can be applied to objects whose models are unavailable. We propose a novel Transformer Perception Branch that handles relationships among features that are more complex than those captured by simple self-attention. We built simulation and physical datasets using a DSR and validated our method on them. We obtained an accuracy of 82.5% on the physical dataset.
-
Yuto HARADA, Takayuki NAGAI, Takato HORII, Tatsuya AOKI
Session ID: 2J1-GS-8a-04
Published: 2021
Released on J-STAGE: June 14, 2021
CONFERENCE PROCEEDINGS
FREE ACCESS
Object manipulation requires accurate recognition of the object's category, position, and pose. However, unacceptable pose estimation errors may occur when the robot observes the object from an inadequate viewpoint. To solve this problem, the robot needs to observe the object from a viewpoint at which the pose estimator can estimate its pose accurately enough. This paper proposes a viewpoint exploration method for accurate object pose estimation under the assumption that the object category and position are known. The proposed method is implemented and evaluated using the Gazebo simulator. The experimental results show that the proposed method can reduce the estimation error and outperforms a baseline exploration method in which the robot moves straight toward the object.
-
Sawako TAJIMA, Shuzo KOYAMA, Satoshi KURIHARA
Session ID: 2J1-GS-8a-05
Published: 2021
Released on J-STAGE: June 14, 2021
CONFERENCE PROCEEDINGS
FREE ACCESS
Inferring human behavior and intentions is an essential ability for autonomous interactive robots. In this paper, we propose an action-selection-network-based multi-agent planning method that flexibly determines an agent's own actions even when others intervene. The proposed method is an extension of conventional multi-agent planning that enables both guessing the internal states of others and planning one's own actions. We conducted an experiment using a simple block task to examine how one's own action sequence changes depending on the intervention of others. The results verify that the proposed method adapts well to unexpected reactions by users.
-
Masatoshi NAGANO, Tomoaki NAKAMURA, Takayuki NAGAI, Daichi MOCHIHASHI, ...
Session ID: 2J3-GS-8b-01
Published: 2021
Released on J-STAGE: June 14, 2021
CONFERENCE PROCEEDINGS
FREE ACCESS
Humans recognize perceived continuous high-dimensional information by dividing it into meaningful segments such as words and unit motions. We believe that such unsupervised segmentation is also an important ability for robots to learn, for example, language and motions. To this end, we have proposed the Hierarchical Dirichlet Processes-Variational Autoencoder-Gaussian Process-Hidden Semi-Markov Model (HVGH), which is composed of a deep generative model and a statistical model. HVGH extracts features from high-dimensional time-series data with a VAE while simultaneously dividing the data into segments with a Gaussian process. In this paper, we propose a method that can segment not only high-dimensional time-series data but also videos in an unsupervised manner by replacing the VAE in HVGH with a convolutional VAE. In an experiment, we used first-person-view video of an agent in a maze to demonstrate that our proposed model estimates more accurate segments than the baseline method.
-
Takumi HIRAKAWA, Masatoshi NAGANO, Tomoaki NAKAMURA
Session ID: 2J3-GS-8b-02
Published: 2021
Released on J-STAGE: June 14, 2021
CONFERENCE PROCEEDINGS
FREE ACCESS
Human infants can learn phonemes and words from continuous speech signals, which have a double articulation structure, without correct labels. In addition to speech signals, time-series data with multiple articulation structures also exist in our environment, and learning such structures is important for realizing robots that can autonomously adapt to the environment. To this end, the nonparametric Bayesian double articulation analyzer (NPB-DAA) has been proposed as a method for learning the double articulation structure in an unsupervised manner. However, since this method is composed of a two-level hierarchical statistical model, it cannot deal with time-series data with more than two articulation structures. In this paper, we propose a statistical model that can learn time-series data with multiple articulation structures. We also present the results of preliminary experiments using speech signal data.
-
Kazutoshi SHINODA, Yuki TAKEZAWA, Masahiro SUZUKI, Yusuke IWASAWA, Yut ...
Session ID: 2J3-GS-8b-03
Published: 2021
Released on J-STAGE: June 14, 2021
CONFERENCE PROCEEDINGS
FREE ACCESS
Instruction following is a task in which a model learns to transform natural language instructions into a sequence of actions in visual environments. Recently, an interactive instruction following task has been proposed to encourage research on following natural language instructions that require interactions with objects. We observe that an existing model for this task is not robust to variations of objects and instructions, which may cause serious problems in real-world applications. We assume that this is due to the high sensitivity of neural feature extraction to small perturbations in vision and language. We propose a Neuro-Symbolic approach to mitigate this lack of robustness. Concretely, we introduce object detection and semantic parsing modules to this task and make reasoning over symbolic features feasible. Our experiments on the ALFRED dataset show that our approach significantly improves performance on subtasks that require object interactions.
-
Akira TANIGUCHI, Hiroaki MURAKAMI, Tadahiro TANIGUCHI
Session ID: 2J3-GS-8b-04
Published: 2021
Released on J-STAGE: June 14, 2021
CONFERENCE PROCEEDINGS
FREE ACCESS
In language acquisition, it is known that infants are able to acquire phonemes and words by using statistical cues contained in speech. In addition, it has been pointed out that infants also utilize co-occurrence with objects in the environment. In this study, we propose an unsupervised phoneme and word discovery method that utilizes the co-occurrence of phonological information and object information. The proposed method is based on the Nonparametric Bayesian Double Articulation Analyzer (NPB-DAA), which discovers phonemes and words from phonological features, and Multimodal Latent Dirichlet Allocation (MLDA), which categorizes objects from the multimodal information obtained from them. We evaluate the effect of using co-occurrence cues on the discovery of words representing objects.
-
Takehiro AOSHIMA, Takashi MATSUBARA, Takaharu YAGUCHI
Session ID: 2J3-GS-8b-05
Published: 2021
Released on J-STAGE: June 14, 2021
CONFERENCE PROCEEDINGS
FREE ACCESS
Satisfying underlying physical laws, such as the conservation of energy, is important for physical simulations. Recent studies have proposed neural networks that enable physics simulations satisfying the conservation law of energy. They used numerical integrators such as symplectic integrators or discrete gradient methods for conserving energy, and their approaches depend on the canonical momentum or the velocity. However, obtaining accurate velocities is difficult because of measurement errors, so their predicted states can differ greatly from the real-world physical system. In this paper, we propose a neural network based on discrete-time Lagrangian mechanics, which learns the dynamics only from position data and conserves energy. To conserve energy strictly in discrete time, we use discrete gradient methods. For evaluation, we employ physical systems such as a mass-spring system, a pendulum, and a two-body system. We show that our approach strictly conserves the total energy.
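As a rough illustration of the discrete gradient idea behind such energy-conserving integrators (a generic sketch under our own notation, not the authors' exact formulation):

```latex
% Sketch of the discrete gradient idea (generic; all symbols here are assumed).
% A discrete gradient \bar{\nabla}E of an energy E satisfies the chain-rule analogue
%   E(x_{n+1}) - E(x_n) = \bar{\nabla}E(x_n,x_{n+1})^{\top} (x_{n+1} - x_n).
% With a skew-symmetric matrix S and step size h, the implicit update
%   x_{n+1} = x_n + h\, S\, \bar{\nabla}E(x_n, x_{n+1})
% conserves E exactly in discrete time, because
\[
  E(x_{n+1}) - E(x_n)
  = \bar{\nabla}E^{\top}(x_{n+1}-x_n)
  = h\,\bar{\nabla}E^{\top} S\,\bar{\nabla}E
  = 0 .
\]
```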
-
Yuki TAKEZAWA, Kazutoshi SHINODA, Masahiro SUZUKI, Yusuke IWASAWA, Yut ...
Session ID: 2J4-GS-8c-01
Published: 2021
Released on J-STAGE: June 14, 2021
CONFERENCE PROCEEDINGS
FREE ACCESS
Instruction following is one of the most important problems for operating a robot in human spaces. However, it remains an open problem that robots cannot accomplish complex tasks such as housework. Since a task consists of multiple skills, if a task representation that expresses this compositionality can be obtained, a robot may be able to accomplish such complex tasks. To this end, Compositional Plan Vectors (CPVs) were recently proposed and achieved a high task success rate on complex tasks that consist of many skills. However, previous CPVs cannot be applied to instruction following because the observation at the goal state is not unique. In this work, following CPVs, we propose a method to obtain compositional task representations for instruction following. Experimentally, we show that our method can improve the task success rate.
-
Eiji UCHIBE
Session ID: 2J4-GS-8c-02
Published: 2021
Released on J-STAGE: June 14, 2021
CONFERENCE PROCEEDINGS
FREE ACCESS
Reinforcement learning algorithms are categorized into model-based methods, which explicitly estimate an environmental model and a reward function, and model-free methods, which directly learn a policy from real or generated experiences. We have proposed a parallel reinforcement learning algorithm for training multiple model-free and model-based reinforcement learners. The experimental results showed that a simple algorithm can contribute to the learning of complex algorithms. However, since each learner's computation time was not considered, we could not fully demonstrate the advantage of using a simple model-free reinforcement learner. This paper proposes an asynchronous parallel reinforcement learning method that considers the differences in control frequencies. The main contribution is to separate the replay buffers collected by each learner and to transform the experience replay buffers so as to absorb the differences in control frequencies. The proposed method is applied to benchmark problems and compared with the case in which the differences in control frequencies are not considered. The results show that the proposed algorithm selected the simple model-based method with a short control period in the early stage of learning, the complex model-based method in the middle stage, and the model-free method in the late stage of learning.
-
Takuma NISHIMURA, Masatoshi NAGANO, Tomoaki NAKAMURA
Session ID: 2J4-GS-8c-03
Published: 2021
Released on J-STAGE: June 14, 2021
CONFERENCE PROCEEDINGS
FREE ACCESS
Humans learn the names of objects by associating words with objects. It has been reported that joint attention, the ability to identify the target object, facilitates the acquisition of word meaning. We believe that this ability is also important for robots to flexibly acquire new words in daily environments through interaction with humans. In this paper, we propose an algorithm that enables robots to learn word meanings in a cluttered scene by identifying the target object using joint attention and the co-occurrence of words and objects. In the proposed algorithm, a robot detects multiple objects using a region proposal network and selects one of them based on joint attention and the co-occurrence of words and objects. Finally, the robot acquires the word meaning by associating the word with the selected object through multimodal latent Dirichlet allocation.
-
Kazuma FURUKAWA, Akira TANIGUCHI, Yoshinobu HAGIWARA, Tadahiro TANIGUC ...
Session ID: 2J4-GS-8c-04
Published: 2021
Released on J-STAGE: June 14, 2021
CONFERENCE PROCEEDINGS
FREE ACCESS
A multi-agent multimodal categorization method for modeling symbol emergence was proposed in a previous study. However, that model regarded signs, i.e., words, as priors of the latent variables representing object categories, which made it incompatible with pre-existing multimodal categorization models that regard signs as observations, i.e., leaf nodes of the probabilistic graphical model. This study proposes a new model by modifying the tail-to-tail connection for the variable corresponding to signs in the previous model to a head-to-head connection. In the experiment, we compared the previously proposed model with ours. Experimental results show that the performance of the modified model is equivalent to that of the original model.
-
Naruya KONDO, Yusuke IWASAWA, Yutaka MATSUO
Session ID: 2J4-GS-8c-05
Published: 2021
Released on J-STAGE: June 14, 2021
CONFERENCE PROCEEDINGS
FREE ACCESS
Prior works show the power of modeling how the world evolves through time and locations, i.e., world models. The key property of a world model is that it is learned from data. However, few studies discuss how to collect good data for learning good world models; most prior works use either a purely random policy or an expert policy to collect the data. The former may not effectively cover the data from the world of interest, and the latter is cumbersome to collect. To this end, this paper investigates the potential of leveraging the concept of "skill" for collecting good data for learning world models. Our method trains world models on data collected by an exploration policy based on a skill embedding, which is learned in a fully unsupervised manner from data simulated with the current world models; the learned skills are then used to collect data for further training of the world models. Because the skills are learned without supervision, our method does not rely on any expert data, yet explores the world more broadly than a random policy. Empirical results on the MuJoCo simulator show that our method can acquire better world models with less data than a random policy.
-
Rina KOMATSU, Tad GONSALVES
Session ID: 2N1-IS-2a-01
Published: 2021
Released on J-STAGE: June 14, 2021
CONFERENCE PROCEEDINGS
FREE ACCESS
This study deals with translating photographic faces into conditional artistic face illustrations in the form of portrait, anime, and emaki, following the content of the conditional input. Unlike cityscape-to-segmentation tasks, face-to-illustration translation requires large texture changes, especially for translation to anime faces, which involve characteristic edges and shapes. Related works attempt mapping between domains with a large number of varying features. However, incorporating additional modules to handle geometric-change-level translation and reusing generators to keep cycle consistency exorbitantly increases the computational cost of model training. Our study aims to establish a conditional translation model that can learn diverse and large feature mappings using only a small number of training parameters. We developed Multi-CartoonGAN, which employs central biasing normalization for the conditional input and adaptive layer-instance normalization to make translation learning robust. As can be seen from our translation learning and test demonstrations, our model greatly reduces the computational cost of parameter training and performs conditional translation even when the target domain has features quite different from real-world faces.
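As a rough sketch of how adaptive layer-instance normalization can be implemented (a minimal PyTorch illustration under our own assumptions, not the authors' released code):

```python
import torch
import torch.nn as nn

class AdaLIN(nn.Module):
    """Adaptive layer-instance normalization: blends instance norm and layer norm
    with a learnable ratio rho; gamma/beta come from the conditional/style input."""
    def __init__(self, num_features, eps=1e-5):
        super().__init__()
        self.eps = eps
        self.rho = nn.Parameter(torch.full((1, num_features, 1, 1), 0.9))

    def forward(self, x, gamma, beta):
        # Instance-norm statistics: per sample, per channel
        in_mean = x.mean(dim=(2, 3), keepdim=True)
        in_var = x.var(dim=(2, 3), keepdim=True, unbiased=False)
        x_in = (x - in_mean) / torch.sqrt(in_var + self.eps)
        # Layer-norm statistics: per sample, over all channels and positions
        ln_mean = x.mean(dim=(1, 2, 3), keepdim=True)
        ln_var = x.var(dim=(1, 2, 3), keepdim=True, unbiased=False)
        x_ln = (x - ln_mean) / torch.sqrt(ln_var + self.eps)
        # Blend the two normalizations, then apply condition-dependent affine parameters
        rho = self.rho.clamp(0.0, 1.0)
        out = rho * x_in + (1.0 - rho) * x_ln
        return out * gamma.view(x.size(0), -1, 1, 1) + beta.view(x.size(0), -1, 1, 1)
```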
-
Naoki NONAKA, Jun SEITA
Session ID: 2N1-IS-2a-02
Published: 2021
Released on J-STAGE: June 14, 2021
CONFERENCE PROCEEDINGS
FREE ACCESS
In the medical field, it is quite expensive to obtain the labeled data that are essential to train deep neural networks (DNNs). One way to tackle this issue is data augmentation, a technique to improve classification accuracy by increasing the diversity of data through random but realistic transformations. Data augmentation has shown promising results in the visual domain; however, transformations designed for image data cannot be applied directly to ECG data. Here we propose RandECG, a data augmentation method tailored for electrocardiogram (ECG) data classification with deep neural networks. We explored various transformation methods and selected transformations suitable for ECG. We tested the efficacy of RandECG on two different datasets and found that the classification accuracy of atrial fibrillation detection can be improved by up to 3.51% without changing the DNN architecture.
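As an illustrative sketch of the kind of signal-level transformations such an augmentation pipeline might apply to a 1-D ECG trace (our own example with assumed transformations and parameters, not the RandECG implementation):

```python
import numpy as np

def random_scale(x, rng, low=0.8, high=1.2):
    """Multiply the whole trace by a random amplitude factor."""
    return x * rng.uniform(low, high)

def random_gaussian_noise(x, rng, sigma=0.01):
    """Add small Gaussian noise to every sample."""
    return x + rng.normal(0.0, sigma, size=x.shape)

def random_time_shift(x, rng, max_shift=50):
    """Circularly shift the trace by a random number of samples."""
    return np.roll(x, rng.integers(-max_shift, max_shift + 1))

def augment_ecg(x, rng, n_ops=2):
    """Apply a few randomly chosen transformations, RandAugment-style."""
    ops = [random_scale, random_gaussian_noise, random_time_shift]
    for op in rng.choice(ops, size=n_ops, replace=False):
        x = op(x, rng)
    return x

rng = np.random.default_rng(0)
ecg = np.sin(np.linspace(0, 20 * np.pi, 3000))  # stand-in for a real ECG trace
augmented = augment_ecg(ecg, rng)
```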
-
Masanori HIRANO, Kiyoshi IZUMI, Hiroki SAKAJI
Session ID: 2N1-IS-2a-03
Published: 2021
Released on J-STAGE: June 14, 2021
CONFERENCE PROCEEDINGS
FREE ACCESS
This paper proposes a new model to reverse engineer and predict traders' behaviors in financial markets. In this model, we used an architecture based on the Transformer and residual blocks, and a loss function based on the Kullback-Leibler divergence. In addition, we established a new evaluation metric and, consequently, succeeded in constructing a model that outperforms conventional methods and has an efficient architecture. In the future, we will build a model with higher performance and versatility and introduce this model into financial simulations.
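A minimal sketch of a Kullback-Leibler-divergence loss between a predicted categorical distribution over trader actions and an observed distribution (our own simplified example; the actual architecture, action space, and targets in the paper differ):

```python
import torch
import torch.nn.functional as F

def kl_loss(logits, target_probs):
    """KL(target || predicted) for a batch of categorical distributions.

    logits:       (batch, n_actions) raw network outputs
    target_probs: (batch, n_actions) observed action frequencies, rows sum to 1
    """
    log_pred = F.log_softmax(logits, dim=-1)
    # F.kl_div expects log-probabilities as input and probabilities as target
    return F.kl_div(log_pred, target_probs, reduction="batchmean")

logits = torch.randn(4, 10)
target = torch.softmax(torch.randn(4, 10), dim=-1)
loss = kl_loss(logits, target)
```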
-
Jenq-Haur WANG, Hsin-Wen LIU
Session ID: 2N1-IS-2a-04
Published: 2021
Released on J-STAGE: June 14, 2021
CONFERENCE PROCEEDINGS
FREE ACCESS
Stock markets are usually affected by many factors, which makes them very challenging to predict. Since more information is available, such as stock prices, company revenues, news reports, and technical indicators, it is common to predict stock trends using machine learning and deep learning models. In this paper, we combine news content with stock prices using fusion models for stock trend prediction. First, we utilize Long Short-Term Memory (LSTM) to learn sequential information from stock prices. Then, we combine Hybrid Attention Networks (HAN) to discover the relative importance of words in news reports to improve stock trend prediction. The experimental results show that the best macro-F1 score of 79.0% is achieved when we combine news content and stock prices. Compared with individual models, a performance improvement of up to 40% is obtained. This shows the potential of our proposed approach.
-
Jiun-yi TSAI, Jia-Ying SHIH
Session ID: 2N1-IS-2a-05
Published: 2021
Released on J-STAGE: June 14, 2021
CONFERENCE PROCEEDINGS
FREE ACCESS
In this paper, we explore factors that tend to increase the number of enterovirus infections. We use government open data and data-mining techniques such as linear regression, random forest, support vector machine, and gradient boosting implemented with the XGBoost package to predict the enterovirus epidemic in Taipei and Taoyuan for the following week. The R-squared (coefficient of determination) of the best-performing predictive model is about 0.9, showing that we can effectively predict the enterovirus epidemic with machine learning models.
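A minimal sketch of this kind of weekly-case regression with the XGBoost package and the coefficient of determination (synthetic data and hypothetical features; not the authors' data or settings):

```python
import numpy as np
from xgboost import XGBRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

# Hypothetical weekly features: last week's cases, temperature, humidity, school-term flag
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
y = 50 + 30 * X[:, 0] + 5 * X[:, 1] + rng.normal(scale=3, size=300)  # synthetic target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
model = XGBRegressor(n_estimators=200, max_depth=3, learning_rate=0.1)
model.fit(X_train, y_train)

print("R-squared:", r2_score(y_test, model.predict(X_test)))
```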
-
Shihori TANABE, Ryuichi ONO, Horacio CABRAL, Sabina QUADER, Ed PERKINS ...
Session ID: 2N3-IS-2b-01
Published: 2021
Released on J-STAGE: June 14, 2021
CONFERENCE PROCEEDINGS
FREE ACCESS
Molecular networks affect the responsiveness of diseases to therapeutics. (1) Objective: the objective of the study is to identify the molecular networks related to therapeutic responsiveness in diseases. Epithelial-mesenchymal transition (EMT) and cancer stem cells (CSCs) are involved in drug resistance in cancer and share some molecular characteristics. To reveal the molecular networks responsible for cancer malignancy, gene expression and molecular networks in diffuse-type gastric cancer (GC), which is resistant to anti-cancer drugs, and intestinal-type GC were analyzed. Since the involvement of an RNA viral network was identified in GC, the molecules and causal networks in RNA viral networks, as well as in diffuse- and intestinal-type GC, were explored. CSC-related networks included the glioblastoma multiforme signaling pathway. (2) Conclusions: using AI methods, we generated candidate models, including an Elastic-Net classifier (L2 / binomial deviance; cross-validation LogLoss 0.3839, AUC 0.9037) and an eXtreme Gradient Boosted Trees classifier (cross-validation LogLoss 0.2647, AUC 0.9565), that can distinguish diffuse-type from intestinal-type GC using molecular network data. Alterations in molecular networks may affect therapeutic responsiveness.
-
Takato YASUNO, Hiroaki SUGAWARA, Junichiro FUJII, Ryuto YOSHIDA
Session ID: 2N3-IS-2b-02
Published: 2021
Released on J-STAGE: June 14, 2021
CONFERENCE PROCEEDINGS
FREE ACCESS
In 2021, Japan recorded more than three times as much snowfall as usual, so road users may encounter dangerous situations. Poor visibility caused by snow triggers traffic accidents. At night, the temperature drops and the road surface tends to freeze. CCTV images of the road surface have the advantage of allowing the status of major points to be monitored at the same time. Road managers are required to make decisions on road closures and snow removal work according to road surface conditions, even at night. In parallel, they need to alert road users to hazardous road surfaces. This paper proposes a method to automate a snow hazard indicator: the road surface region is generated from a night snow image using the conditional GAN pix2pix; the road surface and the snow-covered region of interest are predicted using the semantic segmentation network DeepLabv3+ with a MobileNet backbone; and the snow hazard indicator automatically computes how much of the night road surface is covered with snow. We demonstrate several results for a cold, snowy region of Japan from January 19 to 21, 2021, and note the usefulness of the high similarity between the fake snowy night-to-day output and real snowy daytime images for night snow visibility.
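A minimal sketch of how such an indicator could be computed from two binary segmentation masks (road surface and snow); this is our own illustration, not the authors' implementation:

```python
import numpy as np

def snow_hazard_indicator(road_mask: np.ndarray, snow_mask: np.ndarray) -> float:
    """Fraction of the detected road surface covered by snow.

    road_mask, snow_mask: boolean arrays of the same shape, e.g. from DeepLabv3+.
    """
    road_pixels = road_mask.sum()
    if road_pixels == 0:
        return 0.0
    snow_on_road = np.logical_and(road_mask, snow_mask).sum()
    return float(snow_on_road) / float(road_pixels)

# Toy example: 40% of the road pixels are covered by snow
road = np.zeros((4, 5), dtype=bool); road[1:3, :] = True
snow = np.zeros((4, 5), dtype=bool); snow[1, 0:4] = True
print(snow_hazard_indicator(road, snow))  # 0.4
```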
-
Jianming HUANG, Zhongxi FANG, Hiroyuki KASAI
Session ID: 2N3-IS-2b-03
Published: 2021
Released on J-STAGE: June 14, 2021
CONFERENCE PROCEEDINGS
FREE ACCESS
For graph classification tasks, graph kernels based on the R-convolution framework, which decompose graphs into substructures, are effective tools. However, the current R-convolution framework has a weakness: its strategy for aggregating substructure similarities, based on unweighted summation and multiplication of substructure similarities, is too simple, which may reduce robustness. In our work, we combine the Bag of Features (BoF) model and the adjacent point pattern to form a more effective framework for extracting key graph features, which also supports large datasets.
-
Taisei NARAHA, Kouta AKOMOTO, Ikuko Eguchi YAIRI
Session ID: 2N3-IS-2b-04
Published: 2021
Released on J-STAGE: June 14, 2021
CONFERENCE PROCEEDINGS
FREE ACCESS
As deep learning research develops and models become larger and more complex, there are increasing concerns about deep learning's limited ability to provide explanations to humans and its black-box characteristics. Research on visualizing deep learning (DL) models has been attracting attention for a decade as a solution to this concern. Research on VR interaction between humans and models is a practical means of model visualization that has great potential but is still in its early stages. The purpose of our study is to propose new ways for VR technology to contribute to the development of deep learning models by surveying and implementing visualization technologies for deep learning and VR research projects. In this paper, we also report two experimental results: one is a web survey using a PC application and a demo movie of a VR goggles application, and the other is an evaluation experiment using VR goggles.
-
Kota ISHIZUKA, Kai KUROGI, Kosuke KAWAKAMI, Daishi IWAI, Kazuhide NAKA ...
Session ID: 2N3-IS-2b-05
Published: 2021
Released on J-STAGE: June 14, 2021
CONFERENCE PROCEEDINGS
FREE ACCESS
The standard way to create text ads is to capture searched keywords and the information on their landing pages (LPs). However, this coupling of keywords and an LP increases the number of text ads per LP, which makes it impossible to create ad texts for all effective combinations of keywords and LPs due to limited human resources. We propose a Transformer-based ad text generation model that uses both keywords and LPs to reduce the cost and time of generating ad texts. We extract tags and texts from the LP's HTML, such as title, h1, and h2, fine-tune a pre-trained encoder-decoder model (initialized with BERT2BERT), and pass HTML tag embeddings, similar to position embeddings, to the input layer. The experimental results demonstrate that our model generates ad texts whose quality is close to human-written ones in terms of fluency, attractiveness, and correctness.
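A minimal sketch of adding HTML-tag embeddings to token embeddings in the same way position embeddings are added (hypothetical vocabulary sizes and tag set; not the production model):

```python
import torch
import torch.nn as nn

class TaggedInputEmbedding(nn.Module):
    """Token + position + HTML-tag embeddings summed into one input representation."""
    def __init__(self, vocab_size=32000, n_tags=8, max_len=512, d_model=768):
        super().__init__()
        self.tok = nn.Embedding(vocab_size, d_model)
        self.pos = nn.Embedding(max_len, d_model)
        self.tag = nn.Embedding(n_tags, d_model)  # e.g. 0: none, 1: title, 2: h1, 3: h2, ...

    def forward(self, token_ids, tag_ids):
        positions = torch.arange(token_ids.size(1), device=token_ids.device)
        return self.tok(token_ids) + self.pos(positions)[None, :, :] + self.tag(tag_ids)

emb = TaggedInputEmbedding()
tokens = torch.randint(0, 32000, (2, 16))
tags = torch.randint(0, 8, (2, 16))
x = emb(tokens, tags)  # (2, 16, 768), fed to the encoder in place of plain token embeddings
```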
-
Basanta Raj GIRI, Junya MORITA, Thanakit PITAKCHOKCHAI
Session ID: 2N4-IS-2c-01
Published: 2021
Released on J-STAGE: June 14, 2021
CONFERENCE PROCEEDINGS
FREE ACCESS
In the current highly developed information society, a habit of rumination can be dangerous for mental health. We built a system integrating an ACT-R cognitive model and nudges to prevent rumination during web browsing. Participants were divided into two groups based on the ACT-R model they used: a control group (normal: NOR model) and a test group (inverted: INV model). For each group, the task was divided into a mood-induction task (MI) and a main task (MT). Our aim is to detect and analyze the emotional responses of participants to determine how each model affects the participants in the MT. While the participants engaged in the two tasks, we separately measured and collected emotional response data, including physiological arousal (heart rate data) and facial expression (eye gaze data), to build a dataset. Using the dataset, a support vector machine (SVM) successfully classified the NOR and INV models in the MT, while it showed comparatively lower accuracy in classifying the participants of the two groups during the MI task. These results simultaneously indicate the success of the INV model in preventing rumination and the effectiveness of using heart rate and eye movement to detect rumination during web browsing.
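A minimal sketch of classifying the two model groups from heart-rate and eye-gaze features with an SVM (synthetic data and hypothetical feature layout; not the study's actual preprocessing):

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

# Hypothetical per-window features: mean HR, HR variability, fixation count, mean saccade length
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, size=(50, 4)),    # NOR group
               rng.normal(0.5, 1.0, size=(50, 4))])   # INV group
y = np.array([0] * 50 + [1] * 50)

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
scores = cross_val_score(clf, X, y, cv=5)
print("mean accuracy:", scores.mean())
```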
-
Lieu-Hen CHEN, Yen-Chia CHEN, Yuh-Ming HUANG
Session ID: 2N4-IS-2c-02
Published: 2021
Released on J-STAGE: June 14, 2021
CONFERENCE PROCEEDINGS
FREE ACCESS
Although much research has been conducted on NPR image synthesis using GANs, it is still difficult to create high-quality comic portraits of a real person. Moreover, few studies have focused on the painting styles of comic artists, even though it is the style that makes a comic visually unique. For comic readers, synthesized comic portraits can be more attractive and meaningful if they are presented in the user's preferred comic style. Therefore, in this paper, we propose a styled comic portrait synthesis system based on CycleGANs. By integrating deep learning and NPR techniques, we aim to transform users' real pictures into comic portraits that preserve their features and present a specified painting style. We first trained a CNN to classify the painting styles of manga artists. Then we trained our GANs with a classified and augmented dataset, which was generated by mapping comic characters' 2D textures onto perturbed and deformed 3D facial models. The experimental results show that the proposed method can successfully create clear and vivid comic portraits, and thus has great potential as a useful tool for social networks and the comic industry.
-
Chia-Hui CHANG, Chen-Yu CHEN, Arden CHIOU
Session ID: 2N4-IS-2c-03
Published: 2021
Released on J-STAGE: June 14, 2021
CONFERENCE PROCEEDINGS
FREE ACCESS
For online music streaming platforms, social network analysis has provided extra information for hit song prediction, as social networks have become a new channel for the public to express their opinions on all possible topics. This research exploits social network analysis for hit song prediction via singer popularity and aspect-based sentiment analysis. For each song, we analyze the popularity of the singer and the song on the social network PTT, and apply aspect-based sentiment analysis (ABSA) to perform sentiment analysis on the singer. These results are combined with platform information to predict the playbacks of popular songs. Experimental results show that adding "singer popularity" and "target emotion" can reduce the RMSE (root mean square error) for subsequent on-demand songs.
-
Ting LI, Qiaofei WANG, Xinyang JIANG, Guolin CAO, Gang QIAO
Session ID: 2N4-IS-2c-04
Published: 2021
Released on J-STAGE: June 14, 2021
CONFERENCE PROCEEDINGS
FREE ACCESS
Typing on a touch screen keyboard through an input method is the key user interaction for communication on mobile devices. Due to the limited keyboard size, input errors happen frequently during typing, which seriously affects the fluency of the user's input experience. In this paper, we propose an error correction framework based on neural networks for correcting input errors during typing and predicting the character the user intends to type. Detailed features such as the coordinates of the touch points, the context information, and the input history are preprocessed and used to train this neural classification model. Our experiments show that the proposed model is able to rectify incorrect touches effectively and to enhance both word-level and character-level precision to a great extent compared with existing methods, for multiple languages.
-
Kazunori YAWATA, Tamao SUZUKI, Keisuke KIRYU, Ken MOHRI
Session ID: 2N4-IS-2c-05
Published: 2021
Released on J-STAGE: June 14, 2021
CONFERENCE PROCEEDINGS
FREE ACCESS
The recent development of natural language processing technology using deep learning has been remarkable. BERT, developed by Google, and GPT, developed by OpenAI, have contributed to this development. In this experiment, we compared the performance of a Japanese BERT model, one of the latest natural language processing technologies, with Word2Vec, one of the conventional methods. We used data from the LiveDoor news corpus for the experiments. We also built an FAQ chatbot and compared the rate of correct answers to users' questions about news articles between BERT and Word2Vec. In our experiments, BERT showed superior performance compared to Word2Vec. We were also able to obtain specific insights into the factors that contributed to the performance of BERT and to objectively evaluate the performance of the Japanese BERT model.
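A minimal sketch of the retrieval step such an FAQ chatbot could use, comparing a question vector against stored question vectors by cosine similarity; the vectors could come from either a Japanese BERT model or averaged Word2Vec vectors (our own illustration, not the authors' pipeline):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def answer(query_vec: np.ndarray, faq_vecs: np.ndarray, faq_answers: list) -> str:
    """Return the answer whose stored question vector is closest to the query vector."""
    sims = [cosine_similarity(query_vec, v) for v in faq_vecs]
    return faq_answers[int(np.argmax(sims))]

# Toy example with 3-dimensional stand-in vectors (real ones would be 300-d Word2Vec
# averages or 768-d BERT sentence vectors).
faq_vecs = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
faq_answers = ["Answer about sports news.", "Answer about IT news."]
print(answer(np.array([0.9, 0.1, 0.0]), faq_vecs, faq_answers))
```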
-
Ryuji TAMAKI, Toshiaki NOUMI, Takaaki SATO, Seiichi INOUE, Kugatsu SAD ...
Session ID: 2Xin5-01
Published: 2021
Released on J-STAGE: June 14, 2021
CONFERENCE PROCEEDINGS
FREE ACCESS
Prediction of B-cell epitopes and of peptide binding affinity to MHC class II (MHCII) are both important tasks in vaccine development. B-cell epitope prediction is useful for the design and development of vaccines that induce antigen-specific antibody production. On the other hand, binding prediction between peptides and MHC class II molecules is also necessary for research on vaccines that activate T cells to reduce the severity of infection. Conventional machine learning methods for these prediction tasks have two problems: first, they do not capture the complex dependencies between distant residues; second, their accuracy is low when the training data are insufficient. To address these challenges, we propose a method using a BERT model with a self-attention mechanism, pre-trained on a large-scale protein database. Experimental results show that our proposed method achieves better performance than previous methods in predicting B-cell epitopes and peptide binding to MHCII. We also visualize and analyze the derived self-attention from a biological viewpoint, focusing on protein structure and function.
-
Toshiki IWAI, Kyoko KATSUMATA, Yasuo SUGITANI
Session ID: 2Xin5-02
Published: 2021
Released on J-STAGE: June 14, 2021
CONFERENCE PROCEEDINGS
FREE ACCESS
Hypoparathyroidism is a disease characterized by hypocalcemia and hyperphosphatemia due to low levels of parathyroid hormone. The main symptoms of hypoparathyroidism are due to hypocalcemia and include paresthesia, numbness, cramps, tetany, and convulsions. Oral active vitamin D and calcium preparations are the basis of standard therapy for chronic hypoparathyroidism to maintain serum calcium levels. However, the main adverse effects of these therapies are hypercalcemia and hypercalciuria. In this study, we predicted the occurrence of benefits and risks of active vitamin D in patients with hypoparathyroidism using medical real-world data from clinical practice and evaluated the predictability. Although the prediction of benefit and risk could not be achieved sufficiently, certain results were obtained when limited to the prediction of benefit, and the factors contributing to the prediction model were partially interpretable.
-
Norihiro OKADA, Yasuo SUGITANI
Session ID: 2Xin5-03
Published: 2021
Released on J-STAGE: June 14, 2021
CONFERENCE PROCEEDINGS
FREE ACCESS
In recent years, large-scale databases have been developed and are expected to be used in epidemiological research and drug safety policies in the healthcare field. Among them, data based on medical claims, which are used in this study, are relatively advanced in terms of ease of data structuring and comprehensiveness of medical records and target subjects. Using this database, we present a data-mining method that creates a network structure of treatment and drug combinations for the purpose of analyzing similarities and relationships in the use of drugs prescribed to patients with COVID-19. By classifying the nodes of the network based on graph embedding, evaluating the identified communities using clinical information such as the duration of hospitalization and patient mortality, and visualizing the communities, we show that this method helps in understanding the overall picture of the treatment that was provided. This method may be applicable to diseases with various pathological symptoms and to diseases for which there are many approved drugs.
-
Tetsuya SHIRAISHI
Session ID: 2Xin5-04
Published: 2021
Released on J-STAGE: June 14, 2021
CONFERENCE PROCEEDINGS
FREE ACCESS
During the coronavirus (COVID-19) epidemic in the spring of 2020, PCR testing resources were scarce, and there was a need to tighten testing standards. The blood tests of COVID-19 PCR-positive patients were known to show certain trends in white blood cell count (WBC), C-reactive protein (CRP), lymphocyte count (Lymph), and platelet count (PLT), but no cut-off values had been established. To determine the cut-off values, statistical analysis and supervised machine learning (ensemble learning of neural networks and gradient boosting trees) were performed on 328 patients who underwent PCR and blood tests simultaneously. Supervised machine learning with 27 explanatory variables that differed significantly in statistical testing showed an AUC of 83.6% (sensitivity 63.2%, specificity 94.1%). Factors that contributed strongly to prediction were (1) the presence of a co-resident with similar symptoms, (2) the presence of cough, (3) PLT, and (4) the presence of taste abnormalities. Cut-off values of WBC less than 5,200, Lymph less than 1,000, PLT less than 200,000, and CRP less than 10 were good discriminators in blood tests. By adding blood data points with the above cut-off values to the environmental factors and clinical symptoms, we were able to provide physicians with supplementary information for making decisions on PCR testing.
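A minimal sketch of applying the reported blood-test cut-off values as a simple rule-based flag (our own illustration of how such cut-offs could be combined, not the paper's decision logic):

```python
def blood_test_flags(wbc: float, lymph: float, plt: float, crp: float) -> int:
    """Count how many of the reported cut-offs a patient satisfies.

    Cut-offs from the abstract: WBC < 5,200, Lymph < 1,000, PLT < 200,000, CRP < 10.
    The returned count is only supplementary information for the physician.
    """
    flags = [wbc < 5200, lymph < 1000, plt < 200000, crp < 10]
    return sum(flags)

print(blood_test_flags(wbc=4800, lymph=900, plt=180000, crp=2.5))  # 4 of 4 cut-offs met
```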
-
Keisuke TSUKADA, Yasuo SUGITANI
Session ID: 2Xin5-05
Published: 2021
Released on J-STAGE: June 14, 2021
CONFERENCE PROCEEDINGS
FREE ACCESS
In this study, we applied a deep learning model based on Neural ODEs and GRUs to actual medical data and examined whether it is possible to monitor patient conditions by comparing them with outcome and severity indices. In routine care, records of clinical procedures and laboratory values are events that occur sporadically along the time axis, and not all variable values are acquired at each time point. Therefore, we utilized a deep learning model with a structure that can handle sporadic inputs. This made it possible to associate latent variables that change over time with indicators of the patient's condition.
-
Yasuhisa KITAI, Yoshiki CHIGIRA, Teruaki SAITOH, Naoki KURITA, Yuichi ...
Session ID: 2Xin5-06
Published: 2021
Released on J-STAGE: June 14, 2021
CONFERENCE PROCEEDINGS
FREE ACCESS
In recent years, the development of AI technology has been remarkable, and its achievements are expected to be applied to the real world, for example in visual inspection in factories. However, in sensory inspection that relies on human senses, such as visual inspection, it is difficult to define the limit samples. In this paper, we consider clearly normal and clearly anomalous samples to improve the efficiency of the visual inspection process. We propose a method for constructing a dataset consisting only of clearly normal and anomalous samples, and a method for expressing samples with ambiguous appearance as continuous numerical values using a model trained on the clearly normal and anomalous samples. As a result, we confirmed that the time needed to label ground truth was reduced and that samples with ambiguous appearance changed continuously according to the inferred score.
-
Eiji SHINKAWA, Koichi NAGATSUKA, Yuki MURATA, Tamiko ONO, Masae HOSODA ...
Session ID: 2Xin5-07
Published: 2021
Released on J-STAGE: June 14, 2021
CONFERENCE PROCEEDINGS
FREE ACCESS
This paper proposes an attention neural network model that aims to predict and analyze interactions between drugs and glycoproteins by focusing on the functions of glycans. The model first receives drug information, amino acid sequence information, and glycan information as input. Next, it extracts feature vectors via their respective encoders. Those features are then weighted through a mutual attention mechanism and finally concatenated to detect interactions. The model represents the process by which a glycan mediates the interaction between glycoproteins and drugs through the attention from the glycan to the drug and the attention from the drug to the amino acid sequence. The experimental results show that the proposed mutual attention neural network predicts interactions well, and the attention analysis suggests candidate interactions between glycans and drugs and between drugs and amino acids.
-
Yukari TEZUKA, Kaoru HASHIZUME, Sadaoki SAKAI
Session ID: 2Xin5-08
Published: 2021
Released on J-STAGE: June 14, 2021
CONFERENCE PROCEEDINGS
FREE ACCESS
It is important that clinical research results are published in appropriate medical journals at the appropriate time and widely disseminated to the medical community, but objective indicators for selecting the journal to submit to are lacking. Therefore, we aimed to generate a machine learning model to predict the most suitable medical journal for the submission of a paper based on the characteristics of the clinical research. We created a database of information on the characteristics and articles of 194 previously published clinical studies in the lung cancer field. Data on 26 items, such as research design and research scale, were set as explanatory variables. The impact factor (IF) and the article rank (A/B/C/D) based on the IF were set as the journal information to be explained. Using DataRobot, a machine learning platform, we generated a model that predicts the IF or article rank of medical journals from the characteristics of clinical research. This made it possible to predict the IF and article rank of medical journals suitable for submission based on the characteristics of the clinical research. The factors that determine the impact of a study are thought to change over time, so it is necessary to continue to verify the prediction model.
-
Masanao OCHI, Masanori SHIRO, Jun'ichiro MORI, Ichiro SAKATA
Session ID: 2Xin5-09
Published: 2021
Released on J-STAGE: June 14, 2021
CONFERENCE PROCEEDINGS
FREE ACCESS
It is essential to identify promising research early for a government or company to decide on future research directions. In addition, with the increase in the digital publication of scientific literature and the increasing fragmentation of research, there is a need to develop techniques that automatically predict future research trends. Previous research on predicting the impact of scientific research has used specially designed features for each indicator. On the other hand, recent advances in deep learning have facilitated integrating different individual models and constructing more general-purpose models. However, the possibility of using deep learning techniques to predict the impact indicators of scientific research has not been sufficiently investigated. In this paper, we extracted the number of citations after publication, one of the typical impact indicators of scientific research, and represented the corresponding information in the academic literature as a distributed representation. We then analyzed the possibility of identifying papers with high impact.
-
Toranosuke DAITO, Mitsuo YOSHIDA, Kyoji UMEMURA
Session ID: 2Xin5-10
Published: 2021
Released on J-STAGE: June 14, 2021
CONFERENCE PROCEEDINGS
FREE ACCESS
Chat systems are being used to stimulate discussion at conferences. In general, it is desirable that many questions be sent to such chat systems. However, if a question may go unanswered, people will hesitate to ask. In this study, we assume that users will send more questions if their questions always receive some sort of response. We propose a chat system that uses a question answering system (QA system) as a bot that always responds to questions. This chat system is not a general-purpose chat system but a system dedicated to questions and their answers. We built the proposed system and used it in our laboratory meetings to verify its functionality and design. As a result, we verified that all questions were actually answered by the QA system.
-
Sho YANASE, Yasuhiro NOGUCHI, Satoru KOGURE, Koichi YAMASHITA, Tatsuhi ...
Session ID: 2Xin5-11
Published: 2021
Released on J-STAGE: June 14, 2021
CONFERENCE PROCEEDINGS
FREE ACCESS
In college education, students acquire logical writing skills not only through lectures on writing but also through feedback from teachers on reports submitted for experiment courses. After interpreting the logical structure of a student's report, a teacher provides comments that indicate ambiguities in the logical flow, a lack of evidence corresponding to the claims, and sentences with ambiguous evidence in support of the claims. However, students often misunderstand these comments and ignore them. This is because the comments are based on the logical structure as interpreted by the teacher; students who believe that their own reports describe a different logical structure cannot understand the comments. In this study, we propose a teacher support system that visualizes the logical structure of a paragraph in a student's report so that the teacher can create feedback based on the visualized logical structure. In this paper, we introduce the prototype system and its algorithms, which classify sentence types and analyze the logical relations between sentences.
-
Shuichiro TANAKA
Session ID: 2Xin5-12
Published: 2021
Released on J-STAGE: June 14, 2021
CONFERENCE PROCEEDINGS
FREE ACCESS
The concept of the "GIGA School" is being promoted in Japan. The GIGA School program aims to improve the wireless LAN environment in all classrooms and to introduce one terminal device for each student. With the increasing use of such tablet terminals in education, coding tools for screen content (SC) are expected to reduce the communication bandwidth and increase the number of simultaneous connections. Screen content coding (SCC) has achieved higher coding efficiency by introducing several new modes for SC. On the other hand, since these coding tools are dedicated to SC, redundant operations occur when a natural image (CC) is encoded. In this paper, we propose an image classification model that distinguishes SC from CC using machine learning. By applying this model before encoding the video, the proposed algorithm can select the coding tools with high accuracy and significantly reduce the coding redundancy.
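A minimal sketch of how such a classifier's output could gate the encoder configuration (hypothetical encoder settings and threshold; not the paper's codec integration):

```python
from dataclasses import dataclass

@dataclass
class EncoderConfig:
    intra_block_copy: bool   # example SCC tools: enabled only for screen content
    palette_mode: bool

def select_coding_tools(screen_content_prob: float, threshold: float = 0.5) -> EncoderConfig:
    """Enable SCC-specific tools only when the classifier judges the frame to be screen content."""
    is_sc = screen_content_prob >= threshold
    return EncoderConfig(intra_block_copy=is_sc, palette_mode=is_sc)

# screen_content_prob would come from the trained image classification model
print(select_coding_tools(screen_content_prob=0.92))
```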
-
Kosuke KIMOTO, Masato SOGA
Session ID: 2Xin5-13
Published: 2021
Released on J-STAGE: June 14, 2021
CONFERENCE PROCEEDINGS
FREE ACCESS
In general, we recognize an object, decide on the next action to take, and execute the optimal action. Then the state of the object changes, and the cycle of recognizing it is repeated. The repetition of this cycle of interaction among recognition, judgment, and action can be called an experience. Memories that involve experiences are called episodic memories and are more likely to be retained than mere semantic memories. However, in school, what is learned by reading textbooks involves only recognition, and thus does not go through the cycle of interaction and is not accompanied by experience. Therefore, we propose a learning support environment that allows a single learner to learn through experience, using word learning as the learning content. In order to incorporate the cycle of interaction into the learning process, we present the state of an object in VR; after the learner recognizes it, he or she judges which word to touch on the display in order to change the state of the object to the intended state. When the learner touches the selected word, the state of the object changes according to the meaning of the word. We developed and evaluated a word learning system using VR that incorporates this kind of cycle into learning. As a result, changing the state of the object through this cycle was useful for deepening the understanding of word meanings and for adding fun to learning.
-
Misa SATO, Hiroko OTAKI, Yasuhiro SOGAWA, Kohsuke YANAI
Session ID: 2Xin5-14
Published: 2021
Released on J-STAGE: June 14, 2021
CONFERENCE PROCEEDINGS
FREE ACCESS
There are business situations in which reasoning is performed by connecting multiple pieces of knowledge from large text collections, such as research papers and business documents. In such cases, especially for those who are not familiar with the business domain, it is necessary to repeat the process of searching for relevant documents, reading, re-searching, and reading again, because they do not have enough knowledge. Our tool supports this inefficient repeated process. The tool is equipped with a knowledge DB containing relation knowledge extracted in advance from a large amount of text using natural language processing technology. Users can define a reasoning hypothesis to be examined in the tool and then retrieve relevant knowledge from the DB one piece after another, making it easy to draw inferences using the large amount of knowledge. In this paper, we describe the details of the method and its effectiveness.
-
Naoki IINUMA, Fusataka KUNIYOSHI, Jun OZAWA, Makoto MIWA
Session ID: 2Xin5-15
Published: 2021
Released on J-STAGE: June 14, 2021
CONFERENCE PROCEEDINGS
FREE ACCESS
The task of extracting material property values described in the text of material papers has attracted much attention among materials researchers. However, many property values cannot be extracted by natural language processing alone because property values are often described in graphs rather than in the text in materials papers. In this study, to extract property values from graphs, we constructed a dataset by classifying graphs of property values into classes based on various property conditions such as temperature and time. The dataset was constructed by extracting graph images from a large collection of journal data in the field of materials and utilizing crowdsourcing to annotate the images in a short period of time. In addition, we built several deep learning models and trained and evaluated them on the dataset. As a result, we confirmed the usefulness of our dataset for classifying graphs of property values using deep learning models.
-
Kohei MAKINO, Fusataka KUNIYOSHI, Jun OZAWA, Makoto MIWA
Session ID: 2Xin5-16
Published: 2021
Released on J-STAGE: June 14, 2021
CONFERENCE PROCEEDINGS
FREE ACCESS
In the inorganic materials field, research has been conducted to extract target materials, i.e., the synthetic materials claimed in papers, in order to focus on synthetic materials and analyze their physical properties. For this extraction, there is a question of whether conventional named entity recognition systems can extract such target materials. In this study, we built a corpus of papers labeled only with the target materials and applied a deep learning model, which has shown high performance on conventional named entity recognition tasks, to the corpus to evaluate its extraction performance for target materials. As a result, we found that the performance of the deep learning model in extracting target materials was lower than that reported for other named entity recognition tasks. We attribute this to the fact that conventional named entity recognition task settings are not suitable for the task of extracting target materials from articles, and we discuss the shortcomings of the existing task settings and ways to improve them.
-
Fuminori UEMATSU, Shigeyuki MORISHITA, Tomohiro NAKAMICHI, Keito AIBAR ...
Session ID: 2Xin5-17
Published: 2021
Released on J-STAGE: June 14, 2021
CONFERENCE PROCEEDINGS
FREE ACCESS
For high-resolution observation using a scanning transmission electron microscope (STEM), it is necessary to correct the lens aberrations that cause blur in images. In order to correct the aberrations, it is important to measure the aberration values correctly, and we have developed an aberration measurement method using a Ronchigram as a simple measurement method. However, the conventional Ronchigram-based method requires prior adjustment of measurement parameters because, depending on the experimental conditions, the measurement accuracy degrades significantly when the coma aberration is large, which is problematic for automating aberration measurement and correction. In this study, we developed a machine learning regression model to measure coma aberration from a Ronchigram in order to realize highly accurate measurement without setting these parameters. To estimate the aberration from the position and shape of the stripes appearing in a Ronchigram, we used a convolutional neural network for the model structure. By incorporating this regression model into the STEM, it became possible to automatically and accurately measure and correct coma aberration.
-
Akira NAKASHIMA, Masato SOGA
Session ID: 2Xin5-18
Published: 2021
Released on J-STAGE: June 14, 2021
CONFERENCE PROCEEDINGS
FREE ACCESS
Currently, there is a great deal of content related to celestial bodies in the world. However, much of it presents information in units of constellations, and little of it introduces information about individual stars. In previous research, the names of stars and constellations were presented by voice read-aloud, but the color and brightness of the stars were not presented. Therefore, in this research, we developed a system that can quickly present the color and brightness of stars by associating them with the pitch and volume of sounds. In the evaluation experiment, we asked participants to solve problems while using this system and a voice reading system, and verified the speed and accuracy of grasping information with each system.
-
Kimikazu TANASE, Rie HONDA
Session ID: 2Xin5-19
Published: 2021
Released on J-STAGE: June 14, 2021
CONFERENCE PROCEEDINGS
FREE ACCESS
A statistical modelling method for the seasonal change of vegetation indices is developed. The seasonal change of vegetation indices is modelled by segmented logistic functions, and their parameters are determined by the maximum a posteriori method using their prior probabilities. Furthermore, the obtained parameters are clustered using k-means. A change in the vegetation at a specific point is detected from the discrepancy between the most frequent cluster ID and the cluster ID of a specific year.
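A minimal sketch of fitting a logistic green-up curve to one season of a vegetation index and clustering the fitted parameters with k-means (our own simplified illustration on synthetic data, using plain least squares instead of the MAP estimation described above):

```python
import numpy as np
from scipy.optimize import curve_fit
from sklearn.cluster import KMeans

def logistic(t, base, amp, rate, t0):
    """Simple logistic green-up segment of a vegetation-index curve."""
    return base + amp / (1.0 + np.exp(-rate * (t - t0)))

def fit_season(t, vi):
    """Fit one season's vegetation index; MAP with priors is replaced by least squares here."""
    p0 = [vi.min(), vi.max() - vi.min(), 0.1, t.mean()]
    params, _ = curve_fit(logistic, t, vi, p0=p0, maxfev=10000)
    return params

# Synthetic example: parameters from many pixel/year curves, then clustered
rng = np.random.default_rng(0)
t = np.arange(0, 180, 8.0)  # day of year, 8-day composites
all_params = []
for _ in range(20):
    true = [0.2, 0.5, 0.08 + 0.02 * rng.standard_normal(), 90 + 5 * rng.standard_normal()]
    vi = logistic(t, *true) + 0.01 * rng.standard_normal(t.size)
    all_params.append(fit_season(t, vi))

labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(np.array(all_params))
# A pixel/year whose label differs from its most frequent label would be flagged as changed.
```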
-
Masahiro YOSHIDA, Kano HASEGAWA, Kei TATENO
Session ID: 2Xin5-20
Published: 2021
Released on J-STAGE: June 14, 2021
CONFERENCE PROCEEDINGS
FREE ACCESS
In recent years, research on the adoption of generative models for content creation has begun to attract attention. In general, generative models set the "reality" of the product as the objective function. However, since many commercial contents (movies, music, games, etc.) are created to be appreciated by a large number of users or a specific target user group, high reality alone is not enough for the content to be practical. Existing research has proposed a generative model for content metadata that considers the preferences of a targeted user group, using data on user ratings of the contents. However, even if, for example, a creator wants to create a movie of a specific genre, the existing model cannot control the generated content. In this study, we propose a novel model that extends the existing model to control the generated content by using arbitrary metadata as input. In our experiments, we used the Internet Movie Database and generated actor candidates using the movie genre as input.
-
Motonobu UCHIKOSHI, Koji ICHIKAWA, Yuka AONO, Takashi URATA
Session ID: 2Xin5-21
Published: 2021
Released on J-STAGE: June 14, 2021
CONFERENCE PROCEEDINGS
FREE ACCESS
In anomaly detection, data other than known anomalies are often learned as normal data, and if the training data contain overlooked anomalies, model performance may deteriorate. Especially when there are only a few known anomalies, it is difficult to extract overlooked anomalies automatically and efficiently. In order to extract the overlooked anomalies contained in the training data, we propose a reinforcement learning model that selects features so that known anomalies are separated into small clusters relative to the entire data. Experiments suggest that many of the known and overlooked anomalies are classified into these small clusters, and that model performance is improved by removing the clusters from the data used to train the model. In addition, the selected features can be expected to provide interpretability for the known anomalies.