Autonomous cars have been gaining attention as a future transportation option because they are envisioned to reduce human error and to provide a safer, more energy-efficient, and more comfortable mode of transportation. However, eliminating human involvement may negatively affect the adoption of autonomous cars by impairing perceived safety and the enjoyment of driving. To achieve reliable interaction between an autonomous car and a human operator, the car should evince intersubjectivity, i.e., convey that it possesses the same intentions as the human operator. One critical social cue that humans use to understand the intentions of others is eye-gaze behaviour. This paper proposes an interaction method that utilizes the eye-gaze behaviours of an in-car driving-agent platform to reflect the intentions of a simulated autonomous car, with the potential of enabling human operators to perceive the autonomous car as a social entity. We conducted a preliminary experiment to investigate whether an autonomous car is perceived as possessing the same intentions as the human operator through the gaze-following behaviours of the driving agents, compared with conditions of random gazing and of not using the driving agents at all. The results revealed that the gaze-following behaviour of the driving agents increased the perceived intersubjectivity. Furthermore, a detailed analysis of eye-gaze data showed that the gaze-following behaviours of the robots received more attention from the driver. Finally, with the proposed interaction method, the autonomous system was perceived as safer and more enjoyable.
We have been developing a speech-based “news-delivery system” that can transmit news content via spoken dialogues. In such a system, a speech synthesis subsystem that can flexibly adjust the prosodic features of utterances is vital: the system should be able to highlight spoken phrases containing noteworthy information in an article, and it should also provide properly controlled pauses between utterances to facilitate users’ interactive reactions, including questions. To achieve these goals, we incorporate the position of the utterance in the paragraph and the role of the utterance in the discourse structure into the feature set for speech synthesis. A thorough investigation of news-telling speech data uttered by a voice actress showed that these features are crucially important in fulfilling the above-mentioned requirements for spoken utterances. Specifically, these features dictate the importance of the information carried by spoken phrases, and hence should be effectively utilized in synthesizing prosodically adequate utterances. Based on these investigations, we devised a deep neural network-based speech synthesis model that takes the role and position features as input. In addition, we designed a neural network model that estimates an adequate pause length between utterances. Experimental results showed that adding these features to the input makes the synthesized speech more suitable for information delivery. Furthermore, we confirmed that properly inserted pauses make it easier for users to ask questions during system utterances.
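As a rough illustration of the idea, the following sketch shows how paragraph-position and discourse-role features could be appended to a conventional linguistic feature vector before being fed to an acoustic model. This is our own toy example, not the authors' implementation: the feature names, category inventories, dimensionalities, and the two-layer network are all hypothetical stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

N_LING = 32           # hypothetical dimensionality of standard linguistic features
POSITIONS = ["first", "middle", "last"]          # position of utterance in paragraph
ROLES = ["headline", "lead", "body", "comment"]  # role of utterance in discourse

def one_hot(value, vocab):
    """Encode a categorical feature as a one-hot vector."""
    v = np.zeros(len(vocab))
    v[vocab.index(value)] = 1.0
    return v

def build_input(linguistic, position, role):
    """Concatenate standard linguistic features with position/role features."""
    return np.concatenate([linguistic, one_hot(position, POSITIONS), one_hot(role, ROLES)])

# A minimal two-layer feed-forward "acoustic model" with random weights,
# standing in for the DNN that predicts prosodic parameters.
D_IN = N_LING + len(POSITIONS) + len(ROLES)
W1, b1 = rng.normal(size=(64, D_IN)), np.zeros(64)
W2, b2 = rng.normal(size=(2, 64)), np.zeros(2)   # e.g. (log F0, duration)

def acoustic_model(x):
    h = np.tanh(W1 @ x + b1)
    return W2 @ h + b2

x = build_input(rng.normal(size=N_LING), "first", "lead")
prosody = acoustic_model(x)
print(prosody.shape)  # (2,)
```

In the same spirit, the pause-length estimator described above would be a second small network taking the same role/position features and regressing a pause duration, replacing a fixed heuristic pause.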
A natural conversation involves rapid exchanges of turns while talking. Taking turns with appropriate timing is a requisite feature for a dialog system acting as a conversation partner. We propose a Recurrent Neural Network (RNN) based model that takes the current utterance and the dialog history as input to classify utterances into turn-taking-related classes and estimate the turn-taking timing. The dialog history is represented by a sequence of speaker-specified joint embeddings of lexical and prosodic contents. To this end, we trained a neural network to embed the lexical and prosodic contents into a joint embedding space. To learn a meaningful embedding space, the prosodic feature sequence from each single utterance is mapped into a fixed-dimensional space using an RNN and combined with the utterance's lexical embedding. These joint embeddings are then shifted to different parts of the embedding space according to the speakers. Finally, the speaker-specified joint embeddings are used as the input of our proposed model. We tested this model on a spontaneous conversation dataset and confirmed that it outperformed conventional models that use lexical/prosodic features and dialog history without speaker information.
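The construction of a speaker-specified joint embedding can be sketched as follows. This is a minimal toy version under our own assumptions (a plain Elman RNN, additive speaker offsets, random stand-in weights), not the paper's actual architecture or trained parameters.

```python
import numpy as np

rng = np.random.default_rng(1)

D_PROS, D_HID, D_LEX = 4, 16, 16   # hypothetical feature/embedding dimensions

# Elman-style RNN weights for summarizing a prosodic frame sequence.
W_xh = rng.normal(scale=0.1, size=(D_HID, D_PROS))
W_hh = rng.normal(scale=0.1, size=(D_HID, D_HID))

def encode_prosody(frames):
    """Map a variable-length prosodic feature sequence to a fixed vector
    with a plain RNN; the final hidden state is the summary."""
    h = np.zeros(D_HID)
    for x in frames:
        h = np.tanh(W_xh @ x + W_hh @ h)
    return h

# Learned per-speaker offsets shift joint embeddings to speaker-specific
# regions of the space (random stand-ins for learned parameters).
speaker_offset = {"A": rng.normal(size=D_HID + D_LEX),
                  "B": rng.normal(size=D_HID + D_LEX)}

def joint_embedding(frames, lexical_vec, speaker):
    """Concatenate the prosodic and lexical embeddings, then apply the
    speaker-specific shift."""
    joint = np.concatenate([encode_prosody(frames), lexical_vec])
    return joint + speaker_offset[speaker]

utt = joint_embedding(rng.normal(size=(20, D_PROS)), rng.normal(size=D_LEX), "A")
print(utt.shape)  # (32,)
```

A dialog history is then a sequence of such vectors, fed to the turn-taking classifier in order.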
In drug development, Drug-Induced Liver Injury (DILI) is a significant cause of discontinuation of development, and technologies for safety evaluation and management at early development stages are in high demand. In recent years, toxicity prediction by in silico analysis has raised expectations, and machine learning research using omics data has attracted attention. However, the lack of explainability of machine learning is a problem. To make an appropriate safety assessment, it is necessary to clarify the mechanism of the toxicity (the toxic course). In this study, we focus on the toxic course and propose an ontological model of liver toxicity that systematizes toxicity knowledge from a consistent viewpoint. As an application, we introduce a prototype of a knowledge system for supporting the interpretation of toxicity mechanisms. Based on the ontology, this system uses semantic technologies to provide information flexibly according to the user's purpose. The system provides a graph visualization function in which nodes correspond to concepts and edges correspond to interactions between concepts. In this visualization, a toxic course map shows the causal relationships of the toxic process. We illustrate examples of application to safety assessment and management that combine ontological and data-driven methodologies. Our ontological engineering solution contributes to converting data into higher-order knowledge and making the data explainable in a manner understandable to both humans and computers. We believe that our approach can serve as a fundamental technology and will be useful for a wide range of applications in interdisciplinary areas.
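The notion of a toxic course map (concepts as nodes, causal relationships as directed edges) can be illustrated with a small graph and a path enumeration. The concept names and edges below are illustrative placeholders of our own, not taken from the ontology itself.

```python
# A toy "toxic course map": nodes are toxicity-related concepts and directed
# edges are causal relationships. The concept names are illustrative only.
toxic_course = {
    "drug exposure": ["reactive metabolite formation"],
    "reactive metabolite formation": ["oxidative stress"],
    "oxidative stress": ["mitochondrial dysfunction", "inflammation"],
    "mitochondrial dysfunction": ["hepatocyte death"],
    "inflammation": ["hepatocyte death"],
    "hepatocyte death": [],
}

def causal_paths(graph, start, goal, path=None):
    """Enumerate all causal chains leading from a start concept to an end
    effect by depth-first traversal."""
    path = (path or []) + [start]
    if start == goal:
        return [path]
    paths = []
    for nxt in graph.get(start, []):
        paths.extend(causal_paths(graph, nxt, goal, path))
    return paths

for p in causal_paths(toxic_course, "drug exposure", "hepatocyte death"):
    print(" -> ".join(p))
```

Such enumerated chains correspond to the causal relationships the toxic course map visualizes for a given toxicity endpoint.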
Non-task-oriented dialogue systems are required to chat with users in accordance with their interests. In this study, we propose a neural network-based method for estimating speakers’ levels of interest from dialogues. Our model first converts given utterances into utterance vectors using a word-sequence encoder with word attention. Afterward, our novel attention mechanism, topic-specific sentence attention, extracts information useful for estimating the level of interest. Additionally, we introduce a new pre-training method for our model. Experimental results indicated that using topic-specific sentence attention and the proposed pre-training in combination was most effective.
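The two attention stages (word attention producing utterance vectors, then sentence attention pooling them) can be sketched generically as context-vector attention. This is a minimal illustration under our own assumptions, with random stand-ins for learned parameters, not the paper's model.

```python
import numpy as np

rng = np.random.default_rng(2)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

D = 8  # hypothetical hidden dimension

# Trainable context vectors for word-level and sentence-level attention
# (random stand-ins for learned parameters).
u_word = rng.normal(size=D)
u_sent = rng.normal(size=D)

def attend(vectors, context):
    """Weight a set of vectors by similarity to a context vector and
    return the attention-weighted sum together with the weights."""
    weights = softmax(vectors @ context)
    return weights @ vectors, weights

# Word attention: each utterance's word vectors -> one utterance vector.
dialogue = [rng.normal(size=(n_words, D)) for n_words in (5, 3, 7)]
utt_vecs = np.stack([attend(words, u_word)[0] for words in dialogue])

# Sentence attention: utterance vectors -> one dialogue vector used to
# estimate the speaker's level of interest.
dlg_vec, sent_weights = attend(utt_vecs, u_sent)
print(dlg_vec.shape)  # (8,)
```

In the proposed method the sentence-level context would additionally depend on the topic, so that different topics highlight different utterances.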
Regional revitalization is required in Japan because issues caused by the declining birthrate, aging population, and depopulation are increasing in rural areas. New problem-solving methods generated through co-creation are required for regional revitalization. In co-creation, social media is important for the continuation and sharing of activities. In the past, social media existed to support co-creation for solving issues in a specific region or at a global scale, but no social media supported solving concrete issues common to multiple regions. This study proposes social media to support co-creative activities for solving regional issues common to multiple regions. The promotion and sharing of co-creative activities are important factors in this social media. The social media we named “MiraiLab” was designed to satisfy these two factors. To confirm the effectiveness of “MiraiLab,” social experiments were conducted in the “Special Interest Group on Crowd Co-creation Intelligence (SIG-CCI)” at the Japanese Society for Artificial Intelligence. As a result, it was found that “MiraiLab” was effective for the promotion and sharing of co-creative activities. Specifically, regarding promotion, “MiraiLab” contributed to the continuation of co-creative activities and the creation of outcomes. Regarding sharing, co-creative activities were shared between projects, between research groups, and between account registrants and non-registrants through “MiraiLab”.
Hyperparameter optimization for learning algorithms and feature selection from given data are key issues in machine learning that greatly affect classification accuracy. Random forests and support vector machines are among the most popular learning algorithms. Random forests, which have relatively few hyperparameters, can perform more accurate classification by optimizing these parameters, without requiring feature selection. Like random forests, support vector machines also have few hyperparameters; however, whether or not feature selection is performed at the same time as parameter optimization greatly affects classification accuracy. Usually, grid search is used to optimize hyperparameters. However, since this search is performed on predetermined grids, fine-grained optimization cannot be realized. In this paper, we therefore introduce an artificial bee colony (ABC) algorithm to optimize hyperparameters and to perform more accurate feature selection. The ABC algorithm is a swarm intelligence algorithm for solving optimization problems, inspired by the foraging behaviour of honey bees. Using the KDD Cup 1999 data, a benchmark for network intrusion detection classification, experimental results demonstrate the effectiveness of our method. The proposed method is superior in classification accuracy to existing methods on the same data that use swarm intelligence for hyperparameter optimization and feature selection. Our method also outperforms random forests and SVMs trained with the default parameter values provided by scikit-learn, an open-source machine learning library for Python.
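The core ABC loop for joint hyperparameter optimization and feature selection can be sketched as follows. This is a minimal toy version under our own assumptions: the objective is a synthetic stand-in for a cross-validation score (a real run would train and score a classifier), and the encoding, target, and all settings are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy stand-in for a cross-validation score: each candidate encodes two
# continuous hyperparameters plus a 5-bit feature mask (entries >= 0.5 mean
# "keep the feature"). The target below is illustrative only.
TARGET = np.array([0.7, 0.2, 1.0, 0.0, 1.0, 1.0, 0.0])

def score(x):
    hyper_err = np.sum((x[:2] - TARGET[:2]) ** 2)
    mask_err = np.sum((x[2:] >= 0.5).astype(float) != TARGET[2:])
    return -(hyper_err + mask_err)          # higher is better

def abc_optimize(dim=7, n_sources=10, limit=20, iters=200):
    """Minimal artificial bee colony: bees perturb one dimension of a food
    source toward a random neighbour, greedy selection keeps improvements,
    and scouts reinitialize sources that stop improving."""
    sources = rng.uniform(size=(n_sources, dim))
    fits = np.array([score(s) for s in sources])
    trials = np.zeros(n_sources, dtype=int)
    best_i = int(np.argmax(fits))
    best_x, best_f = sources[best_i].copy(), fits[best_i]
    for _ in range(iters):
        for i in range(n_sources):
            k, j = rng.integers(n_sources), rng.integers(dim)
            cand = sources[i].copy()
            cand[j] += rng.uniform(-1, 1) * (sources[i, j] - sources[k, j])
            cand = np.clip(cand, 0.0, 1.0)
            f = score(cand)
            if f > fits[i]:                  # greedy selection
                sources[i], fits[i], trials[i] = cand, f, 0
                if f > best_f:
                    best_x, best_f = cand.copy(), f
            else:
                trials[i] += 1
        for i in np.where(trials > limit)[0]:   # scout phase
            sources[i] = rng.uniform(size=dim)
            fits[i] = score(sources[i])
            trials[i] = 0
    return best_x, best_f

best_x, best_fit = abc_optimize()
print(best_fit)
```

Because candidate positions are continuous rather than confined to predetermined grid points, the search can refine hyperparameters more finely than grid search, while the thresholded mask dimensions carry the feature-selection decision in the same solution vector.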