The Multi-Agent Path Finding (MAPF) problem is an extended class of pathfinding problems in which collision-free paths are assigned to multiple agents. MAPF has been applied to various practical domains, including path planning for robot carriers in warehouses, automated taxiing of airplanes, and video games. The Multi-Agent Pickup and Delivery (MAPD) problem, a continuous variant of MAPF, has been studied for more practical settings such as automated warehouses, in which robot carriers are assigned to tasks that appear over time. A major goal in MAPD is to reduce the execution time or the total path length of the tasks. Token Passing (TP) has been proposed as a solution method for MAPD problems. TP is a task allocation method that also employs a pathfinding algorithm such as Cooperative A* (CA*) or Multi-Label A* (MLA*). With TP, pickup-and-delivery tasks are allocated to agents sequentially by means of a shared memory called a token, in which the agents reserve the paths for their tasks. In this study, we focus on two issues with TP. First, TP targets well-formed problems, which satisfy a sufficient condition for MAPD instances to be solvable. Well-formed MAPD instances also limit the passable nodes in the graph representation of a warehouse, which decreases the utilization of warehouse space in pathfinding. However, if this restriction is relaxed, collisions between agents’ paths might occur. Second, tasks are allocated greedily according to a default order in which unallocated agents access the token. However, there might be another agent whose current task is almost complete and whose current destination is nearer to the pickup location of a new task. If such an agent is not considered, a new task may be allocated to an inappropriate agent, which generates a redundant path.
To address the first problem, we relax the movement limitation and introduce a reservation method for stays at destinations that maintains the consistency of reserved paths of agents. For the second problem, we propose a task allocation method that estimates the pickup time of agents to determine more appropriate agents to be allocated to new tasks. Experimental results show that our proposed algorithm reduces the total path length and the total task execution time by reducing the length of pickup paths in comparison to the existing method.
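The idea of allocating a task by estimated pickup time can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a grid map, uses Manhattan distance as a stand-in for a pathfinding-based estimate, and the names `Agent`, `estimated_pickup_time`, and `choose_agent` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Agent:
    name: str
    free_at: int     # timestep at which the agent finishes its current task
    free_pos: tuple  # (row, col) cell where the agent will be at that time


def manhattan(a, b):
    """Grid distance, used here as an assumed lower bound on path length."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def estimated_pickup_time(agent, pickup, now):
    """Earliest time the agent could reach the pickup cell (rough estimate)."""
    return max(agent.free_at, now) + manhattan(agent.free_pos, pickup)


def choose_agent(agents, pickup, now):
    """Pick the agent with the smallest estimated pickup time.

    Busy agents are considered too, so an agent about to finish near the
    pickup location can win over an idle agent that is far away.
    """
    return min(agents, key=lambda a: (estimated_pickup_time(a, pickup, now), a.name))


agents = [
    Agent("idle_far", free_at=0, free_pos=(0, 0)),
    Agent("busy_near", free_at=3, free_pos=(9, 8)),
]
best = choose_agent(agents, pickup=(9, 9), now=0)
print(best.name)
```

In this toy instance the busy agent is selected because it will be free next to the pickup cell, whereas a greedy allocation to the first idle agent would produce a much longer pickup path.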
The individual treatment effect (ITE) represents the expected improvement in the outcome from applying a particular action to a particular target, and it plays an important role in decision making in various domains. However, estimating it is difficult because intervention studies that collect information on the applied treatments (i.e., actions) and their outcomes are often quite expensive in terms of time and monetary cost. In this study, we consider a semi-supervised ITE estimation problem that exploits more easily available unlabeled instances to improve the performance of ITE estimation from a small amount of labeled data. We combine two ideas, matching from causal inference and label propagation from semi-supervised learning, to propose counterfactual propagation (CP), which is the first semi-supervised ITE estimation method. Experiments using semi-real datasets demonstrate that the proposed method can successfully mitigate the data scarcity problem in ITE estimation.
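As a rough illustration of how matching and label propagation can be combined, the toy sketch below propagates observed treated and control outcomes separately over a similarity graph and takes their difference as the ITE estimate. This is a simplified stand-in and not the CP method itself: the one-dimensional covariates, the Gaussian similarity, and the function names are all assumptions made for the example.

```python
import math


def gaussian_weights(xs, sigma=1.0):
    """Similarity graph: nearby instances act as soft 'matches' for each other."""
    n = len(xs)
    return [
        [0.0 if i == j else math.exp(-((xs[i] - xs[j]) ** 2) / (2 * sigma ** 2))
         for j in range(n)]
        for i in range(n)
    ]


def propagate(weights, labeled, iterations=200):
    """Label propagation: each unlabeled node repeatedly takes the weighted
    average of its neighbours; labeled nodes stay clamped to observed values."""
    n = len(weights)
    start = sum(labeled.values()) / len(labeled)
    f = [labeled.get(i, start) for i in range(n)]
    for _ in range(iterations):
        for i in range(n):
            if i in labeled:
                continue
            total = sum(weights[i])
            f[i] = sum(w * v for w, v in zip(weights[i], f)) / total
    return f


def estimate_ite(xs, treated, control, sigma=1.0):
    """ITE estimate = propagated treated outcome - propagated control outcome."""
    w = gaussian_weights(xs, sigma)
    y1 = propagate(w, treated)
    y0 = propagate(w, control)
    return [a - b for a, b in zip(y1, y0)]


# Two well-separated groups; only four of the eight instances have outcomes.
xs = [0.0, 0.1, 0.2, 0.3, 10.0, 10.1, 10.2, 10.3]
treated = {0: 3.0, 4: 5.0}   # index -> outcome observed under treatment
control = {1: 1.0, 5: 5.0}   # index -> outcome observed under control
ite = estimate_ite(xs, treated, control)
print([round(v, 3) for v in ite])
```

In this constructed example the first group receives an estimated effect close to 2 and the second close to 0, including for the instances that have no observed outcome at all.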
The global outbreak of COVID-19 is putting enormous pressure on society to change traditional social behavior. Government officials are forced to make short-term public health decisions based on limited information, and investor sentiment about the infection situation in each country has a significant impact on the stock market. In this paper, we use a neural network approach to visualize the time series of an indexed social sentiment in Japan during the COVID-19 pandemic and to clarify changes in citizens' sensitivity to the coronavirus. Sentiment was classified for Twitter tweets that matched keywords related to the activities the government asked citizens to restrict, and sentiment trends were identified for Tokyo, Sapporo, Osaka, and Fukuoka over the period from before the outbreak to the fifth wave of infection. The resulting indices correlate with the number of infected cases by region and with national and local events. In global cities such as Tokyo and Osaka, sensitivity gradually became numbed as the cities experienced successive waves of infection and emergency declarations, and parallel trends in the sentiment waveforms were observed across regions.
Many mothers have considerable anxiety about pregnancy, childbirth, and childcare. For such mothers, searching for information on the Internet is an effective means of easing their anxieties. We consider the problem of estimating, for each search word, the distribution of search dates relative to children’s birth dates. Most of the empirical distributions have unimodal or bimodal shapes, and some of them are asymmetric about their extremal points and rise or fall sharply. We propose nonparametric estimation methods for such probability distributions based on mathematical optimization models. Our unimodal and bimodal optimization models automatically estimate the optimal extremal points and can be extended to multimodal distributions. These models are formulated as mixed-integer convex quadratic optimization problems, which can be solved exactly using optimization software. Experimental results using real-world and synthetic datasets demonstrate that our methods are effective in comparison with conventional moving average and kernel estimation methods.
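To give a feel for shape-constrained estimation with an automatically chosen peak, the sketch below fits a least-squares unimodal curve to a histogram. It deliberately swaps in a different technique from the abstract: instead of a mixed-integer quadratic model solved by optimization software, it enumerates the split point and applies pool-adjacent-violators (isotonic regression) on each side. The function names are hypothetical, and the authors' models may include constraints that this sketch omits.

```python
def pava_increasing(y):
    """Least-squares nondecreasing fit via pool-adjacent-violators."""
    blocks = []  # each block is [sum, count]; its fitted value is sum / count
    for v in y:
        blocks.append([float(v), 1])
        # Merge while the previous block's mean exceeds the last block's mean.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fit = []
    for s, c in blocks:
        fit.extend([s / c] * c)
    return fit


def pava_decreasing(y):
    """Least-squares nonincreasing fit (reverse, fit increasing, reverse back)."""
    return pava_increasing(y[::-1])[::-1]


def unimodal_fit(y):
    """Try every split point: nondecreasing before it, nonincreasing after it.
    Keep the split with the smallest squared error."""
    best_err, best_fit = None, None
    for k in range(len(y) + 1):
        fit = pava_increasing(y[:k]) + pava_decreasing(y[k:])
        err = sum((a - b) ** 2 for a, b in zip(y, fit))
        if best_err is None or err < best_err:
            best_err, best_fit = err, fit
    return best_fit


noisy = [0.1, 0.3, 0.2, 0.3, 0.1]  # empirical frequencies with a spurious dip
smooth = unimodal_fit(noisy)
print(smooth)
```

Because each pooled block is replaced by its mean, the fitted values keep the same total as the input, so a normalized histogram remains normalized after fitting.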
AI technology is bringing major transformation to human society, but depending on how it is used, it may conflict with the values (ethics) of that society. Against this background, major countries and institutions have developed a number of “AI principles” as codes of AI ethics.
Intelligent dialogue systems (IDSs), as one kind of AI technology, will play an important role as a "human-machine (AI system)" interface in the future. Developers and users of IDSs therefore need to be aware of and manage the ethical risks of IDSs, including those related to the "human-machine" relationship.
The purpose of this paper is to clarify the possible ethical risks (including legal risks) of IDSs by analyzing AI regulation in Europe, which leads the global discussion on AI ethics, with reference to Europe's social acceptance of and cultural background regarding the “human-machine” relationship.
To that end, this paper first outlines the direction of future technological development of IDSs and then identifies the characteristics of the European AI principles, which may be influenced by Europe's cultural background regarding the "human-machine" relationship. Then, by analyzing the recently proposed European AI Act in light of the future technological development of IDSs and its characteristics, this paper clarifies that the possible ethical risks for the future development and practical application of IDSs include not only those related to shared human rights, such as fairness/non-discrimination and privacy, but also those related to cultural differences in views of the “human-machine” relationship.
Dialogue system development involves a variety of factors and requires multifaceted consideration, so design guidelines would be helpful. Although a neural approach can be used, it requires a vast amount of dialogue data, and collecting such data to develop a system for a specific, fixed-length dialogue would take too much effort. Furthermore, errors in automatic speech recognition and language understanding should be explicitly considered in the design, because they are inevitable when the system talks with general users and would worsen users' impressions of the system. We propose design guidelines for developing dialogue systems. Systems that we developed with the aid of these guidelines took first place in two dialogue system competitions: the situation track of the second Dialogue System Live Competition and a pre-preliminary test of the Dialogue Robot Competition. Our proposed design guidelines are to (1) make the system take the initiative, (2) keep dialogue flows from relying too much on user utterances, and (3) include in system utterances an indication that the system has understood what the user said. We also describe the systems designed for each of the two competitions in more detail, with dialogue examples from the competitions and questionnaire scores from real users.
Generation-based dialogue systems tend to produce generic responses. To improve the diversity of their responses, text retrieved by a retrieval-based model can be fed to the generation-based model as a reference response, so that the generation-based model can generate highly diverse responses. However, prior work shows that generation-based dialogue systems often ignore the reference response, resulting in responses that are unrelated to it. In this work, we propose the Dialogue-Filling method, which can utilize 100% of the reference response text by masking the response sentences with a text-filling technique. We built variants of the Dialogue-Filling method with the DialoGPT model. Experiments on the DailyDialog dataset demonstrate that our Dialogue-Filling method outperforms the baseline method on the dialogue generation task.
This paper proposes a new method for slot filling of unknown slot values (i.e., values that are not included in the training data) in spoken dialogue systems. Slot filling detects slot values in user utterances and handles named entities such as product and restaurant names. In the real world, there is a steady stream of new named entities, and it would be infeasible to add all of them to the training data. Accordingly, it is inevitable that users will input utterances with unknown slot values, and spoken dialogue systems must estimate them correctly. We provide a value detector, which detects keywords representing slot values regardless of slot, and a slot estimator, which estimates the slots for the detected keywords. Context information can be an important clue for estimating slot values because the values in a given slot tend to appear in similar contexts. The value detector is trained with positive samples in which the keywords corresponding to slot values are replaced with random words, thereby enabling the use of context information. However, any approach that can detect unknown slot values may produce false alarms, because the features of unknown slot values are unseen and it is difficult to distinguish keywords of unknown slot values from non-keywords, which do not correspond to slot values. Therefore, we introduce a negative sampling method that randomly replaces keywords with non-keywords, which allows the slot estimator to learn to reject non-keywords. Experimental results show that the proposed method achieves relative improvements of 6%, 15%, and 78% in F1 score over an existing model on three datasets, respectively.
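The two kinds of training samples described above can be sketched as simple data augmentation over a tagged utterance. This is only an illustration under assumptions: the BIO-style tags, the use of "O" as the reject label for negative samples, the word lists, and the function names are hypothetical and not taken from the original work.

```python
import random


def positive_sample(tokens, tags, vocab, rng):
    """Replace every keyword token with a random word but keep its tag,
    so the model must rely on the surrounding context to find the value."""
    new_tokens = [rng.choice(vocab) if t != "O" else w for w, t in zip(tokens, tags)]
    return new_tokens, list(tags)


def negative_sample(tokens, tags, non_keywords, rng):
    """Replace every keyword token with a non-keyword and mark the whole
    utterance as containing no slot value (assumed reject label: "O")."""
    new_tokens = [
        rng.choice(non_keywords) if t != "O" else w for w, t in zip(tokens, tags)
    ]
    return new_tokens, ["O"] * len(tags)


rng = random.Random(0)  # fixed seed so the illustration is repeatable
tokens = ["book", "a", "table", "at", "blue", "moon"]
tags = ["O", "O", "O", "O", "B-restaurant", "I-restaurant"]

vocab = ["cedar", "lantern", "orbit", "velvet"]  # arbitrary random words
non_keywords = ["there", "please", "somewhere"]  # words that are not slot values

pos_tokens, pos_tags = positive_sample(tokens, tags, vocab, rng)
neg_tokens, neg_tags = negative_sample(tokens, tags, non_keywords, rng)
print(pos_tokens, pos_tags)
print(neg_tokens, neg_tags)
```

In both cases the context words ("book a table at") are left untouched, which is what allows context to serve as the clue for detecting or rejecting a candidate value.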
In language understanding for dialog systems, slot filling is a fundamental task that is usually formulated as a sequence labeling problem and solved using discriminative models such as conditional random fields and neural networks. One weak point of the discriminative approach is its lack of robustness against incomplete annotations, which often arise in practice when we attempt to build large-scale training data. To make slot filling more robust against incomplete annotation, this paper leverages an overlooked property of slot filling tasks: non-slot parts of an utterance follow a specific pattern that depends on the user’s intent. To reflect this idea, we propose a nonparametric Bayesian model that induces the grammatical role of the non-slot parts using a segmentation-based formulation of slot filling. The proposed method can naturally handle training data that includes incomplete annotations by treating it as a partially supervised grammar induction problem. Experimental results demonstrate that, when the training data includes incomplete annotations, the proposed method estimates slot information more accurately than BiLSTM-CRF and HMM. We also show that the proposed model has an advantage in the interpretability of training and prediction results by visualizing the parameters and the estimated labeled segmentations with a state transition diagram.
In this study, we propose a method for generating response utterances that take into account the context and topics of a dialog by complementing omitted words, such as subjects, in the input utterances of dialog systems. To complement the omitted words, we perform automatic anaphora resolution based on centering theory. To achieve highly accurate anaphora resolution, we also perform spoken-to-written style conversion as a preprocessing step, using an LSTM-based sequence-to-sequence model. The results of evaluation experiments using the Nagoya University Conversation Corpus (NUCC) showed that our proposed complementation method works robustly against errors in spoken-to-written style conversion.
In this paper, we propose the “Risky Politeness Strategy (RPS)” as a framework for utterance strategies that focus on risk-taking in dialogue systems. Previous research has reported that it is useful to implement risky politeness strategies, such as jokes and compliments, in dialogue systems. However, a design theory for effectively implementing risk-taking utterance strategies in dialogue systems has not been established. Against this background, we defined RPS with reference to politeness/impoliteness research in the field of linguistics. In addition, we developed a rule-based dialogue system and an example-based dialogue system that implement RPS in non-task-oriented dialogue. User evaluations were conducted through the preliminary rounds of the Dialogue System Live Competition 2 and 3. The results showed that both the rule-based and the example-based RPS-speaking non-task-oriented dialogue systems were able to engage in dialogue that users evaluated as having humanity. The usefulness of implementing RPS in non-task-oriented dialogue systems has therefore been shown to a certain extent.
In human-human interactions, a listener uses both verbal tokens and head nods as response signals, and the two frequently co-occur. When humanoid robots and anthropomorphic agents respond to a user with verbal tokens and head nods simultaneously, the two must be generated with proper relative timing and have consistent features. In this paper, we propose models that predict the co-occurrence and physical features of head nods from prosodic and syntactic features of verbal response tokens. As predictor variables, we used the forms, positions, and durations of response tokens, the averages and standard deviations of their fundamental frequency and loudness, and head positions at the beginning of response tokens. In addition, considering the participation framework, we used the speaker's gaze and the listener's gaze at the beginning of response tokens, and applied generalized mixed models to predict the co-occurrence, type, range, repetition, and velocity of head nods. The results confirmed that the proposed models can predict these outcomes effectively.
The goal of this study is to realize a non-task-oriented dialogue agent that people accept over the long term. One approach is a dialogue strategy in which the agent shares information about other users who are not participating in the current dialogue. This study aims to develop a chatbot capable of sharing information about others and to examine both its usefulness and its problems, such as privacy concerns, through a long-term empirical experiment in a real-world environment. The results of a 14-day experiment with 120 participants suggested that the usefulness of this dialogue strategy lies in its ability to maintain users’ motivation to interact with the agent and to prevent them from having the impression that the agent is mechanical. However, the results also suggested that, irrespective of whether this dialogue strategy was used, users were concerned about their privacy with respect to an agent that collected their information on a daily basis. Based on these results, we discuss the relationship between the interestingness of the shared information and users’ privacy concerns.