The automatic generation of metaphorical expressions helps us write imaginative texts such as poems or novels. This paper proposes a new metaphor generation task, evaluation metrics, and a method to solve the task. Our task is formalized as the problem of finding metaphorical paraphrases for a literal Japanese phrase consisting of a subject, an object, and a verb. We use four evaluation metrics: synonymousness, metaphoricity, novelty, and comprehensibility. Our proposed method generates metaphorical expressions using three automatically computable scores (similarity, figurativeness, and rarity), each corresponding to one of the evaluation metrics. Through crowdsourcing, we show how these scores relate to human judgments on the evaluation metrics and how useful they are in finding humans' preferred expressions in pairwise comparisons.
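The abstract does not define the three scores precisely; as a minimal sketch, assuming similarity is embedding cosine, figurativeness is a precomputed value in [0, 1], and rarity is inverse log corpus frequency (all illustrative assumptions, not the authors' formulations), candidate paraphrases could be ranked like this:

```python
import numpy as np

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def rank_candidates(literal_vec, candidates, weights=(1.0, 1.0, 1.0)):
    """Rank candidate metaphorical paraphrases by a weighted sum of
    three scores. Each candidate is a dict with hypothetical keys:
      'vec'  - embedding of the candidate phrase (for similarity),
      'fig'  - precomputed figurativeness score in [0, 1],
      'freq' - raw corpus frequency (rarity = inverse log frequency).
    """
    w_sim, w_fig, w_rare = weights
    scored = []
    for c in candidates:
        similarity = cosine(literal_vec, c["vec"])
        rarity = 1.0 / np.log2(2.0 + c["freq"])  # rarer phrases score higher
        score = w_sim * similarity + w_fig * c["fig"] + w_rare * rarity
        scored.append((score, c))
    return sorted(scored, key=lambda t: t[0], reverse=True)
```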
This paper presents a contrastive analysis between reading time and clause boundary categories in Japanese in order to estimate text readability. We overlaid the reading-time data of BCCWJ-EyeTrack and clause-boundary-category annotations on the Balanced Corpus of Contemporary Written Japanese. Statistical analysis based on a Bayesian linear mixed model shows that reading-time behaviours differ among the clause boundary categories. The result does not support wrap-up effects at clause-final words. We also found that predicate-argument relations facilitate the reading speed of native Japanese speakers.
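The paper fits a Bayesian linear mixed model; as a rough frequentist analogue of the same model structure (column names are hypothetical), reading time could be regressed on clause-boundary category with a per-participant random intercept using statsmodels:

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical columns: reading_time (ms), boundary_cat (clause boundary
# category of the word), subj (participant ID). The paper uses a Bayesian
# linear mixed model; this frequentist sketch only illustrates the structure.
df = pd.read_csv("bccwj_eyetrack_with_boundaries.csv")

# Fixed effect of boundary category, random intercept per participant.
model = smf.mixedlm("reading_time ~ C(boundary_cat)", df, groups=df["subj"])
result = model.fit()
print(result.summary())
```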
To communicate with humans in a human-like manner, systems need to understand behavior and psychological states in situations of human-machine interaction, such as autonomous driving and nursing robots. We focus on driving situations because they are part of our daily lives and concern safety. To develop such systems, a corpus annotated with behavior and subjectivity in driving situations is necessary. In this study, subjectivity includes emotions, polarity, sentiments, human judgments, perceptions, and cognitions. We construct a driving experience corpus (DEC) (261 blog articles, 8,080 sentences) with four manually annotated tags. First, we annotate spans with driving experience tags (DE). Then, three tags, other's behavior (OB), self-behavior (SB), and subjectivity (SJ), are annotated within DE spans. In addition to describing the guidelines, we present corpus specifications, agreement between annotators, and three major difficulties encountered during development: the extended self, important information, and voice in mind. Automatic annotation experiments were conducted on the DEC using Conditional Random Field-based methods. On the test set, the F-scores were about .55 for both OB and SB and approximately .75 for SJ. We provide an error analysis that reveals difficulties in interpreting nominatives and in differentiating behavior from subjectivity.
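A minimal sketch of CRF-based span annotation of the kind the experiments describe, assuming a BIO encoding of the tags and toy per-token features (both assumptions, not the paper's feature set), using the sklearn-crfsuite library:

```python
import sklearn_crfsuite

def word_features(tokens, i):
    """Minimal per-token features; a real system would add POS tags,
    lexicon features, character types, etc."""
    return {
        "token": tokens[i],
        "prev": tokens[i - 1] if i > 0 else "<BOS>",
        "next": tokens[i + 1] if i < len(tokens) - 1 else "<EOS>",
    }

def sent2features(tokens):
    return [word_features(tokens, i) for i in range(len(tokens))]

# X: list of feature sequences, y: list of BIO label sequences
# (e.g. "B-SB", "I-SB", "O"); the BIO encoding here is an assumption.
X_train = [sent2features(["私", "は", "車", "を", "運転", "し", "た"])]
y_train = [["O", "O", "O", "O", "B-SB", "I-SB", "I-SB"]]

crf = sklearn_crfsuite.CRF(algorithm="lbfgs", c1=0.1, c2=0.1,
                           max_iterations=100)
crf.fit(X_train, y_train)
print(crf.predict(X_train))
```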
All-words word-sense disambiguation (all-words WSD) involves identifying the senses of all words in a document. Since a word's sense depends on its context, such as the surrounding words, similar words are expected to have similar sets of surrounding words. We therefore predict target word senses by calculating Euclidean distances between the surrounding-word vectors of the target words and those of their synonyms, using word embeddings. In addition, we replace word tokens in the corpus with their concept tags, that is, article numbers of the Word List by Semantic Principles, based on the prediction results. We then create concept embeddings from the concept-tag sequence and predict the senses of the target words using the distances between surrounding-word vectors that consist of the word and concept embeddings. This paper shows that concept embeddings improve the performance of Japanese all-words WSD.
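A minimal sketch of the distance-based sense prediction step, assuming the surrounding-word vector is the mean of embeddings in a fixed window (window size and averaging are illustrative assumptions):

```python
import numpy as np

def context_vector(tokens, idx, emb, window=2):
    """Average the embeddings of words within `window` positions of
    the target at position idx."""
    vecs = [emb[t] for j, t in enumerate(tokens)
            if j != idx and abs(j - idx) <= window and t in emb]
    return np.mean(vecs, axis=0)

def predict_sense(tokens, idx, sense_synonyms, synonym_contexts, emb):
    """Pick the sense whose synonyms' context vectors are closest
    (Euclidean distance) to the target word's context vector.
    `sense_synonyms` maps sense -> synonym list; `synonym_contexts`
    maps synonym -> list of precomputed context vectors (hypothetical
    data structures for this sketch)."""
    target_ctx = context_vector(tokens, idx, emb)
    best_sense, best_dist = None, float("inf")
    for sense, syns in sense_synonyms.items():
        for syn in syns:
            for ctx in synonym_contexts.get(syn, []):
                d = float(np.linalg.norm(target_ctx - ctx))
                if d < best_dist:
                    best_sense, best_dist = sense, d
    return best_sense
```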
We focus on nonstandard usages of common words on social media, where words are sometimes used in a totally different manner from their original or standard usage. In this work, we attempt to distinguish nonstandard usages on social media from standard ones in an unsupervised manner. We also constructed a new Twitter dataset consisting of 40 words with nonstandard usages and used it for evaluation in an experiment. Our basic idea is that nonstandard usage can be measured by the inconsistency between the target word's expected meaning and the given context. For this purpose, we use context embeddings derived from word embeddings. Our experimental results show that the model leveraging context embeddings outperforms other methods and also provide findings on, for example, how to construct context embeddings and which corpus to use.
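A minimal sketch of the inconsistency idea, assuming the context embedding is the mean of the context words' vectors (the paper compares several ways of constructing context embeddings; this is just one simple variant):

```python
import numpy as np

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def nonstandardness(target, context_tokens, emb):
    """Score how inconsistent the target word is with its context:
    1 - cosine(target word vector, mean of context word vectors).
    Higher values suggest the word is being used nonstandardly."""
    ctx_vecs = [emb[t] for t in context_tokens if t in emb and t != target]
    ctx = np.mean(ctx_vecs, axis=0)
    return 1.0 - cosine(emb[target], ctx)
```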
Phonological changes reflected in text can be powerful in characterizing the utterances of dialogue agents or characters' lines in narratives. To use phonological changes to automatically characterize utterances, (i) we collected phonologically changed expressions from characters' written utterances and (ii) formalized the knowledge required to generate such expressions. In particular, we categorized the expressions into 137 patterns by analyzing the phenomena involved and the environments in which they occur. We experimentally confirmed that the patterns cover more than 80% of the phonologically changed expressions used in novels and comics. Furthermore, (iii) to investigate whether phonological change patterns can be effective in characterization, we conducted an experiment that estimated the speakers (characters) of utterances and confirmed that information on phonological changes improved the performance of speaker estimation for several characters.
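A minimal sketch of how such patterns could be applied as rewrite rules; the two toy patterns below are common colloquial contractions chosen for illustration, not entries from the paper's 137-pattern catalogue:

```python
import re

# Toy patterns in the spirit of the paper's catalogue; the actual 137
# patterns and their occurrence environments are defined in the paper.
PATTERNS = [
    (re.compile(r"ら(ない)"), r"ん\1"),  # e.g. わからない -> わかんない
    (re.compile(r"ている"), "てる"),      # e.g. 食べている -> 食べてる
]

def apply_phonological_changes(utterance):
    """Apply each rewrite pattern in turn to characterize an utterance."""
    for pat, repl in PATTERNS:
        utterance = pat.sub(repl, utterance)
    return utterance

print(apply_phonological_changes("まだわからないけど食べている"))
# -> まだわかんないけど食べてる
```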
We present a novel scheme for word-based Japanese typed dependency parsing that integrates syntactic structure analysis with grammatical function analysis such as predicate-argument structure analysis. Compared to bunsetsu-based dependency parsing, which is predominant in Japanese NLP, it provides a natural way of extracting syntactic constituents. This makes it possible to decide dependencies and predicate-argument structure jointly, which is usually implemented as two separate steps. By using grammatical functions as dependency types, we can obtain detailed syntactic information from parsing results while keeping the converted bunsetsu-based dependency accuracy as high as that of CaboCha, one of the state-of-the-art dependency parsers.
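A minimal sketch of a word-based typed dependency representation where the edge labels are grammatical functions; the field names and the UD-style labels below are assumptions for illustration, not the paper's label inventory:

```python
from dataclasses import dataclass

@dataclass
class Token:
    """One word in a word-based typed dependency tree."""
    idx: int
    surface: str
    head: int      # index of the head word, 0 = root
    deptype: str   # grammatical function used as the dependency type

# "太郎 が 本 を 読んだ" (Taro read a book), with case particles attached
# to their nouns and grammatical functions as edge labels.
sentence = [
    Token(1, "太郎", 5, "nsubj"),  # nominative argument of the verb
    Token(2, "が", 1, "case"),
    Token(3, "本", 5, "obj"),      # accusative argument
    Token(4, "を", 3, "case"),
    Token(5, "読んだ", 0, "root"),
]

# Predicate-argument structure falls out of the typed edges directly:
args = {t.deptype: t.surface for t in sentence if t.head == 5}
print(args)  # {'nsubj': '太郎', 'obj': '本'}
```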
This paper proposes introducing domain adaptation into Japanese predicate-argument structure (PAS) analysis. Our investigation of a Japanese balanced corpus revealed that the distribution of argument types differs across text media. The difference is particularly significant when the argument is exophoric. Previous Japanese PAS analysis research has disregarded this tendency because studies have targeted mono-media corpora. Our investigation begins with a PAS analyzer based on a recurrent neural network as its baseline and extends it by introducing three kinds of domain-adaptation techniques and their combinations. Evaluation experiments using a Japanese balanced corpus (BCCWJ-PAS) confirmed the domain dependency of PAS analysis. Domain adaptation is effective in improving the performance of Japanese PAS analysis, especially in the nominative case. The maximum improvement in F1 score over the baseline (0.030) was obtained in the QA text analysis.
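The abstract does not specify the three adaptation techniques; one common option for RNN-based analyzers, shown here purely as an assumed sketch in PyTorch, is to concatenate a learned domain embedding to each word embedding before the recurrent layer:

```python
import torch
import torch.nn as nn

class DomainAwarePASEncoder(nn.Module):
    """Sketch of one plausible domain-adaptation technique: concatenating
    a learned domain embedding to every word embedding before the RNN.
    Hyperparameters are arbitrary; this is not the paper's exact model."""
    def __init__(self, vocab_size, n_domains, word_dim=100, dom_dim=16,
                 hidden=200):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, word_dim)
        self.dom_emb = nn.Embedding(n_domains, dom_dim)
        self.rnn = nn.GRU(word_dim + dom_dim, hidden, batch_first=True,
                          bidirectional=True)

    def forward(self, word_ids, domain_id):
        w = self.word_emb(word_ids)                   # (B, T, word_dim)
        d = self.dom_emb(domain_id)                   # (B, dom_dim)
        d = d.unsqueeze(1).expand(-1, w.size(1), -1)  # (B, T, dom_dim)
        out, _ = self.rnn(torch.cat([w, d], dim=-1))
        return out                                    # (B, T, 2 * hidden)
```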
This paper presents a model for Japanese zero anaphora resolution that deals with both intra- and inter-sentential zero anaphora. Our model resolves anaphora for multiple cases simultaneously by utilising and comparing information across cases. This simultaneous resolution requires considering many combinations of antecedent candidates, which could be a crucial obstacle in both the training and resolution phases. To cope with this problem, we propose an effective candidate pruning method using case frame information. On a Japanese balanced corpus, we compared the model that estimates multiple cases simultaneously with our proposed candidate pruning method against a model that estimates each case independently without candidate reduction. The results confirmed a 0.056-point increase in accuracy. Furthermore, we also confirmed that introducing a local-attention recurrent neural network increases the accuracy of inter-sentential anaphora resolution.
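A minimal sketch of case-frame-based candidate pruning, assuming the case frame lexicon maps a (predicate, case) pair to the set of nouns observed to fill that slot; the membership test and fallback are simplifying assumptions, and the paper's compatibility criterion may differ:

```python
def prune_candidates(predicate, case, candidates, case_frames):
    """Keep only antecedent candidates compatible with the predicate's
    case frame; fall back to the full list if none match."""
    allowed = case_frames.get((predicate, case), set())
    pruned = [c for c in candidates if c in allowed]
    return pruned or candidates

# Toy case-frame lexicon: 読む (read) takes 本 (book) in the を slot.
case_frames = {("読む", "を"): {"本", "新聞", "手紙"}}
print(prune_candidates("読む", "を", ["本", "犬", "空"], case_frames))
# -> ['本']
```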