This paper presents a method for analyzing command utterances on small computers that provide various services. The system must accept every command utterance required by the services, and it must be able to learn the user’s utterances at a low computational cost. To determine chunks and their senses according to the service, the proposed method contains parsing arrays for each service and performs utterance analysis and learning through reinforcement learning. Results from an implementation on an in-vehicle computer for car travelers confirmed that the proposed method was generally successful, with an analysis accuracy of 0.99 in a closed test and 0.81 in an open test.
We aim to build a system that improves the quality of communication between elderly individuals and their families by sharing information about the elderly members’ quality of life (QOL). As the first step toward a system that generates utterances containing QOL information, we propose a methodology for building a QOL-labeled dialog corpus and discuss the generation of responses that convey specific QOL information using QOL labels. To build the corpus, we conducted preliminary experiments demonstrating that certain crowd workers provide responses that are indistinguishable from those given by elderly individuals. Based on this finding, we constructed a large-scale QOL-labeled dialog corpus through crowdsourcing. Our response generation experiments using the constructed corpus demonstrated that the generated utterances can be useful in conveying QOL information. For example, the prompt “She reads a book as soon as she gets home” generated the response “I’ll read her a book next time” when the QOL label was 《health satisfaction (positive)》 and “Even reading the newspaper is difficult for me” when it was 《health satisfaction (negative)》.
This study describes a segment-level metric for automatic machine translation evaluation (MTE). Although various MTE metrics have been proposed, most of them, including the current de facto standard BLEU, can handle only limited information for segment-level MTE. Therefore, we propose an MTE metric that uses pre-trained sentence embeddings to evaluate MT output while considering global information. In the proposed method, we obtain sentence embeddings of the MT output and the reference translation using a sentence encoder pre-trained on a large corpus. We then estimate translation quality with a regression model that takes the sentence embeddings of the MT output and the reference translation as input. Our metric achieved state-of-the-art performance in the segment-level metrics task for all to-English language pairs on the WMT dataset with human evaluation scores.
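The regression step in the segment-level MTE abstract above can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration and not taken from the abstract: the random toy vectors stand in for the output of a pre-trained sentence encoder, the feature combination in `pair_features` (concatenation, element-wise product, absolute difference) is one common choice, ridge regression is swapped in as a simple regressor, and the toy "human scores" are synthetic.

```python
import numpy as np


def pair_features(mt_vec, ref_vec):
    """Combine MT and reference sentence embeddings into one feature vector.

    The combination used here is an assumption, not the paper's design.
    """
    return np.concatenate(
        [mt_vec, ref_vec, mt_vec * ref_vec, np.abs(mt_vec - ref_vec)]
    )


def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression; the last weight is the bias term.

    For simplicity the bias is regularized together with the other weights.
    """
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    A = Xb.T @ Xb + alpha * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ y)


def predict(w, X):
    """Predict a quality score for each row of pair features."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return Xb @ w


# Toy data: random vectors stand in for sentence-encoder outputs.
rng = np.random.default_rng(0)
dim, n = 8, 200
mt = rng.normal(size=(n, dim))
ref = mt + rng.normal(scale=0.5, size=(n, dim))

# Synthetic "human scores": higher when the two embeddings are closer.
human = -np.linalg.norm(mt - ref, axis=1)

X = np.stack([pair_features(m, r) for m, r in zip(mt, ref)])
w = fit_ridge(X[:150], human[:150])   # train on the first 150 segments
pred = predict(w, X[150:])            # score the 50 held-out segments

# Segment-level agreement with the (synthetic) human scores.
corr = np.corrcoef(pred, human[150:])[0, 1]
print(f"Pearson correlation on held-out segments: {corr:.3f}")
```

In a real setting the toy vectors would be replaced by embeddings of actual MT outputs and references, and the targets by human evaluation scores; the correlation with human scores is the usual way such segment-level metrics are compared.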
The concept of surprisal was proposed by Hale as a psycholinguistic model of sentence processing cost based on information theory. Surprisal is the negative log probability of a word in context and can be used to model the difficulty of processing a sentence. When this difficulty is measured with eye tracking, reading time in Japanese is obtained in base phrase units, whereas word probability is estimated from the frequency of morphemes or word units. This discrepancy in units makes it difficult to model surprisal in Japanese, and we introduced word embeddings to address it. The additive property of skip-gram word embeddings enabled us to compose a base phrase vector from the word vectors in the base phrase. We confirmed that the cosine similarity between two adjacent base phrase vectors can be used to model the contextual probability of the base phrase bi-gram, and we found that the norm of the base phrase vector correlates with reading time in Japanese.
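The quantities named in the surprisal abstract above can be made concrete with a small sketch. It assumes base-2 logarithms for surprisal (the abstract does not fix a base), composes a base phrase vector by summing its word vectors, and uses hand-written 3-dimensional toy vectors in place of trained skip-gram embeddings. The function names are hypothetical.

```python
import math

import numpy as np


def surprisal(prob):
    """Surprisal in bits: negative log2 probability of a word in context."""
    return -math.log2(prob)


def phrase_vector(word_vectors):
    """Compose a base phrase vector by summing the vectors of its words."""
    return np.sum(word_vectors, axis=0)


def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


# Toy word vectors for two adjacent base phrases (not real embeddings).
phrase_a = phrase_vector(np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]))
phrase_b = phrase_vector(np.array([[1.0, 1.0, 0.0], [0.0, 0.0, 1.0]]))

sim = cosine(phrase_a, phrase_b)          # similarity of adjacent phrases
norm_a = float(np.linalg.norm(phrase_a))  # norm of a base phrase vector

print(surprisal(0.25))  # a word with probability 0.25 carries 2.0 bits
print(sim, norm_a)
```

Here the cosine similarity plays the role the abstract assigns to it, a stand-in for the contextual probability of a base phrase bi-gram, and the vector norm is the quantity the abstract reports as correlating with reading time.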