Universal Dependencies (UD) is an international project to develop multilingual dependency treebanks under a uniform annotation scheme, aiming at cross-lingual learning from multilingual corpora and quantitative comparison of languages. As of mid-2018, more than 100 corpora for about 60 languages have been released. This paper describes the definition of the annotations for Japanese. We discuss the localization issues of PoS tags, case-marking dependency labels, and the distinction between phrases and clauses in Japanese. We present the issues of coordination structures, which cannot be represented solely by dependency tree structures. We also report the current status of the UD Japanese corpora we have constructed.
Questions are asked in many situations, such as sessions at conferences and inquiries through email. In such situations, questions can often be lengthy and hard to understand, because they frequently contain peripheral information in addition to the main focus of the question. We therefore propose the task of question summarization. In this research, we first analyzed question-summary pairs extracted from a Community Question Answering (CQA) site and found that some questions cannot be summarized by extractive approaches and instead require abstractive approaches. We created a dataset by regarding the question-title pairs posted on a CQA site as question-summary pairs. Using this data, we trained extractive and abstractive summarization models and compared them based on ROUGE scores and manual evaluation. Our experimental results show that an abstractive method, an encoder-decoder with a copying mechanism, achieves better scores on both ROUGE-2 F-measure and evaluation by human judges.
We propose a model that classifies discussion discourse acts using discussion patterns with neural networks. Several earlier works attempted to analyze the texts used in various discussions. Those works explored the importance of discussion patterns, but their methods required a sophisticated design to combine pattern features with a classifier. Our model introduces tree learning approaches and a graph learning approach to capture discussion patterns without explicit pattern features. In an evaluation on classifying discussion discourse acts in Reddit, the model achieved improvements of 1.5% in accuracy and 2.2 points in F1 score over the previous best model. We further analyzed the model using an attention mechanism to inspect interactions among the different learning approaches.
Combinatory Categorial Grammar (CCG) is a strongly lexicalized grammatical formalism in which the vast majority of parsing decisions involve assigning a supertag to indicate the correct syntactic role. We propose an A* parsing model for CCG that exploits this characteristic by modeling the probability of a tree through its supertags and resolving the remaining ambiguities through its syntactic dependencies. The key to our method is that it predicts the probabilities of supertags and dependency heads independently, using a strong unigram model defined over bi-directional LSTMs. This factorization allows precomputation of the probabilities for all possible trees of a sentence, which, combined with an A* algorithm, enables very efficient decoding. The proposed model achieves state-of-the-art results on English and Japanese CCG parsing. In addition, we conduct Recognizing Textual Entailment (RTE) experiments by integrating the proposed parser into logic-based RTE systems, and demonstrate that this integration leads to improved performance in English RTE experiments.
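The factorization described here can be sketched in a few lines. The supertag distributions below are invented for illustration, and only the A* scoring (an inside score plus an admissible outside estimate from precomputed per-word maxima) is shown, not the parser itself:

```python
import math

# Hypothetical supertag log-probabilities for a two-word sentence, as a
# unigram model over bi-directional LSTM states would output (values are
# invented for illustration).
tag_logprobs = [
    {"NP": math.log(0.7), "N": math.log(0.3)},      # word 0
    {"S\\NP": math.log(0.9), "N": math.log(0.1)},   # word 1
]

# Because the model factorizes over words, the best achievable log-prob
# for every word can be precomputed once per sentence.
best = [max(d.values()) for d in tag_logprobs]

def outside_estimate(i, j):
    """Admissible A* heuristic: best possible score outside span [i, j)."""
    return sum(best[:i]) + sum(best[j:])

def inside_score(assignment):
    """Score of a partial parse: sum of its chosen supertag log-probs."""
    return sum(tag_logprobs[k][t] for k, t in assignment)

# A* priority of a chart item covering word 0 with supertag "NP":
priority = inside_score([(0, "NP")]) + outside_estimate(0, 1)
```

Because the outside estimate never underestimates the true completion score, the first complete parse popped from the agenda is optimal, which is what makes the decoding efficient.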
This study focuses on database (DB) search dialogue and proposes to employ user utterance information that does not directly mention the DB fields of the back-end system but is useful for constructing DB queries. We name this type of information implicit conditions, the interpretation of which enables a dialogue system to communicate with humans more naturally and efficiently. We formalise the interpretation of implicit conditions as the classification of user utterances into the related DB field while simultaneously identifying the evidence for that classification. Introducing this new task is one of the contributions of this paper. We implemented three models for this task: an SVM-based model, an RCNN-based model, and a sequence-to-sequence model with an attention mechanism. In an evaluation on a corpus of simulated dialogues between a real estate agent and a customer, the sequence-to-sequence model outperformed the other models.
Word-order differences between source and target languages significantly affect statistical machine translation. This problem can be effectively addressed by preordering. State-of-the-art preordering methods involve manually designed feature templates. In this paper, we propose a method that uses a recursive neural network to learn preordering end to end. We extensively evaluated the method on English-Japanese, English-French, and English-Chinese datasets. The results confirm that it achieves English-to-Japanese translation quality comparable with that of the state-of-the-art method, without manually designed feature templates. In addition, a detailed analysis examines the factors affecting preordering and translation quality, as well as the effects of preordering in neural machine translation.
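Preordering itself can be illustrated with a minimal sketch: given a binarized source tree, decide at each node whether to swap the two children so the source sequence better matches target word order. The hand-written scorer below is a placeholder standing in for the learned recursive-neural-network score, and the verb-marker convention is purely illustrative:

```python
def preorder(tree, swap_score):
    """Traverse a binarized source tree; at each internal node, swap the
    children when the scorer says target word order prefers it."""
    if isinstance(tree, str):
        return [tree]
    left, right = (preorder(t, swap_score) for t in tree)
    return right + left if swap_score(left, right) > 0 else left + right

# Toy scorer (assumption, not the learned model): swap when the left
# subtree starts with a verb, pushing verbs to clause-final position as
# in Japanese.
score = lambda left, right: 1 if left[0].endswith("_V") else -1

# "John ate sushi" -> SOV order "John sushi ate":
out = preorder(("John", ("ate_V", "sushi")), score)
```

In the actual method, the swap decision would come from a score computed recursively over learned node representations rather than a rule, but the traversal is the same.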
Various past, present, and future events are described in text, and correct interpretation of temporal information is essential to understanding such text. For this purpose, many corpora associating events with temporal information have been constructed. Although these corpora focus on expressions with strong temporality, many expressions with weak temporality still serve as clues to understanding temporal information. In this article, we propose an annotation scheme that comprehensively anchors textual expressions to the time axis. Using this scheme, we annotated 113 documents in the Kyoto University Text Corpus. Because the corpus has already been annotated with predicate-argument structures and coreference relations, it can now be utilized for integrated analysis of events, entities, and time.
This paper proposes a new attention mechanism for neural machine translation (NMT) based on convolutional neural networks (CNNs), inspired by the CKY algorithm. The proposed attention represents every possible combination of source words (e.g., phrases and structures) through CNNs, imitating the CKY table of the algorithm. NMT incorporating the proposed attention decodes a target sentence on the basis of attention scores over the hidden states of the CNNs. The proposed attention thus enables NMT to capture alignments to underlying structures of a source sentence without sentence parsing. Evaluations on the Asian Scientific Paper Excerpt Corpus (ASPEC) English-Japanese translation task show that the proposed attention gains 1.43 BLEU points over a conventional attention-based encoder-decoder model. Furthermore, the proposed attention is at least comparable to, or better than, a conventional attention-based encoder-decoder model on the FBIS Chinese-English translation task.
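The span enumeration behind such CKY-style attention can be illustrated as follows. This toy sketch uses random vectors as source states and mean pooling as a stand-in for the CNN span encoder; it shows only the idea of attending over all O(n²) contiguous spans rather than the paper's actual model:

```python
import numpy as np

rng = np.random.default_rng(0)
src = rng.normal(size=(5, 8))    # 5 source word states, dimension 8 (toy)

# Enumerate every contiguous span [i, j), mirroring the cells of a CKY
# table; mean pooling stands in for the CNN span encoder here.
spans, reps = [], []
for i in range(len(src)):
    for j in range(i + 1, len(src) + 1):
        spans.append((i, j))
        reps.append(src[i:j].mean(axis=0))
reps = np.stack(reps)            # (n*(n+1)/2, 8) span states

def cky_attention(decoder_state):
    """Softmax attention over all span states instead of word states."""
    scores = reps @ decoder_state
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ reps        # context vector mixing whole spans

ctx = cky_attention(rng.normal(size=8))
```

The point of attending over spans is that a target word can align to a multi-word source phrase directly, without an external parser identifying that phrase first.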
Recently, dependency parsers based on neural networks have outperformed existing parsers. When these parsing models are applied to Chinese sentences, they are used in a pipeline with word segmentation and POS tagging models, and in such cases they do not work well because of word segmentation and POS tagging errors. This can be addressed by joint models of word segmentation, POS tagging, and dependency parsing. In addition, Chinese characters carry their own meanings, so the meanings of characters, character strings, and sub-words are as important as the meanings of words in dependency parsing. In this study, we propose a neural-network-based joint model for word segmentation, POS tagging, and dependency parsing, as well as a joint word segmentation and POS tagging model. We exploit not only word and character embeddings but also character string embeddings in all our models.
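The combination of word, character, and character-string embeddings can be sketched as follows. The lookup tables, dimensions, and the example word are toy assumptions, not the paper's trained parameters; the point is only that the three granularities are pooled and concatenated into one input vector:

```python
import numpy as np

rng = np.random.default_rng(1)
DIM = 4

# Toy embedding tables (random values; a real model learns these jointly).
word_emb  = {"北京": rng.normal(size=DIM)}
char_emb  = {c: rng.normal(size=DIM) for c in "北京"}
# "Character string" embeddings cover multi-character substrings.
string_emb = {"北京": rng.normal(size=DIM)}

def embed(word):
    """Concatenate word-, character-, and character-string-level views."""
    chars = np.mean([char_emb[c] for c in word], axis=0)  # pooled chars
    strings = string_emb.get(word, np.zeros(DIM))
    return np.concatenate([word_emb[word], chars, strings])

vec = embed("北京")  # one (3 * DIM)-dimensional input vector
```

Feeding all three views lets the downstream segmenter, tagger, and parser fall back on character and substring information when a word is rare or was mis-segmented.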