Journal of Natural Language Processing

Preface

[title in Japanese]

[in Japanese]

2019 Volume 26 Issue 2 Pages 275-276
Published: June 15, 2019
Released on J-STAGE: September 15, 2019

DOI: https://doi.org/10.5715/jnlp.26.275

JOURNAL FREE ACCESS

Paper

Automatically Computable Metrics to Generate Metaphorical Verb Expressions

Akira Miyazawa, Yusuke Miyao

2019 Volume 26 Issue 2 Pages 277-300
Published: June 15, 2019
Released on J-STAGE: September 15, 2019

DOI: https://doi.org/10.5715/jnlp.26.277

JOURNAL FREE ACCESS

The automatic generation of metaphorical expressions helps us write imaginative texts such as poems or novels. This paper proposes a new metaphor generation task, evaluation metrics, and a method to solve the task. Our task is formalized as a problem of finding metaphorical paraphrases for a literal Japanese phrase consisting of a subject, an object, and a verb. We use four evaluation metrics: synonymousness, metaphoricity, novelty, and comprehensibility. Our proposed method generates metaphorical expressions by using three automatically computable scores (similarity, figurativeness, and rarity), each corresponding to one of the evaluation metrics. By crowdsourcing, we show how these scores relate to those given by humans in terms of the evaluation metrics and how useful they are in finding the expressions humans prefer in pairwise comparisons.

Between Reading Time and Clause Boundaries in Japanese—Wrap-up Effect in a Head-Final Language—

Masayuki Asahara

2019 Volume 26 Issue 2 Pages 301-327
Published: June 15, 2019
Released on J-STAGE: September 15, 2019

DOI: https://doi.org/10.5715/jnlp.26.301

JOURNAL FREE ACCESS

This paper presents a contrastive analysis of reading time and clause boundary categories in Japanese in order to estimate text readability. We overlaid the reading time data of BCCWJ EyeTrack and clause boundary category annotations on the Balanced Corpus of Contemporary Written Japanese. Statistical analysis based on a Bayesian linear mixed model shows that reading time behaviours differ among the clause boundary categories. The result does not support wrap-up effects for clause-final words. We also found that predicate-argument relations facilitate the reading speed of native Japanese speakers.

Annotating a Driving Experience Corpus with Behavior and Subjectivity

Ritsuko Iwai, Daisuke Kawahara, Takatsune Kumada, Sadao Kurohashi

2019 Volume 26 Issue 2 Pages 329-359
Published: June 15, 2019
Released on J-STAGE: September 15, 2019

DOI: https://doi.org/10.5715/jnlp.26.329

JOURNAL FREE ACCESS

To communicate with humans in a human-like manner, systems need to understand behavior and psychological states in situations of human-machine interactions, such as in the cases of autonomous driving and nursing robots. We focus on driving situations as they are part of our daily lives and concern safety. To develop such systems, a corpus annotated with behavior and subjectivity in driving situations is necessary. In this study, subjectivity includes emotions, polarity, sentiments, human judgments, perceptions, and cognitions. We construct a driving experience corpus (DEC) (261 blog articles, 8,080 sentences) with four manually annotated tags. First, we annotate spans with driving experience tags (DE). Then, three tags, other’s behavior (OB), self-behavior (SB), and subjectivity (SJ), are annotated within DE spans. In addition to describing the guidelines, we present corpus specifications, agreement between annotators, and three major difficulties during the development: the extended self, important information, and voice in mind. Automatic annotation experiments were conducted on the DEC using Conditional Random Fields-based methods. On the test set, the F-scores were about .55 for both OB and SB and approximately .75 for SJ. We provide an error analysis that reveals difficulties in interpreting nominatives and differentiating behavior from subjectivity.

Unsupervised All-words WSD Using Synonyms and Embeddings

Rui Suzuki, Kanako Komiya, Masayuki Asahara, Minoru Sasaki, Hiroyuki S ...

2019 Volume 26 Issue 2 Pages 361-379
Published: June 15, 2019
Released on J-STAGE: September 15, 2019

DOI: https://doi.org/10.5715/jnlp.26.361

JOURNAL FREE ACCESS

All-words word-sense disambiguation (all-words WSD) involves identifying the senses of all words in a document. Since a word’s sense depends on the context, such as surrounding words, similar words are believed to have similar sets of surrounding words. Therefore, we predict target word senses by calculating Euclidean distances between the target words’ surrounding word vectors and their synonyms using word embeddings. In addition, using the prediction results, we replace word tokens in the corpus with their concept tags, that is, article numbers of the Word List by Semantic Principles. After that, we create concept embeddings from the concept tag sequence and predict the senses of the target words using the distances between surrounding word vectors, which consist of the word and concept embeddings. This paper shows that concept embeddings improved the performance of Japanese all-words WSD.
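
The distance-based sense prediction described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy two-dimensional embeddings, the sense labels, and the choice to average vectors are all assumptions made for illustration, and the sketch compares the context vector directly with the synonyms' own embeddings rather than reproducing the paper's exact setup.

```python
import math

def mean_vector(words, embeddings):
    """Average the embeddings of the given words (hypothetical helper)."""
    vecs = [embeddings[w] for w in words if w in embeddings]
    if not vecs:
        raise ValueError("no known words to average")
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def predict_sense(context_words, sense_synonyms, embeddings):
    """Pick the sense whose synonyms lie closest to the context vector."""
    context_vec = mean_vector(context_words, embeddings)
    best_sense, best_dist = None, math.inf
    for sense, synonyms in sense_synonyms.items():
        dist = euclidean(context_vec, mean_vector(synonyms, embeddings))
        if dist < best_dist:
            best_sense, best_dist = sense, dist
    return best_sense

# Toy data (assumed, not from the paper): two senses of an ambiguous word.
TOY_EMBEDDINGS = {
    "river": [1.0, 0.0], "water": [0.9, 0.1], "shore": [0.8, 0.2],
    "money": [0.0, 1.0], "loan": [0.1, 0.9], "lender": [0.2, 0.8],
}
TOY_SENSES = {"riverside": ["shore"], "finance": ["lender"]}

print(predict_sense(["river", "water"], TOY_SENSES, TOY_EMBEDDINGS))
```

With the toy data, a context of "river" and "water" lies nearer to the synonym of the riverside sense than to that of the finance sense, so the riverside sense is selected.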

Detecting Nonstandard Word Usages on Social Media

Tatsuya Aoki, Ryohei Sasano, Hiroya Takamura, Manabu Okumura

2019 Volume 26 Issue 2 Pages 381-406
Published: June 15, 2019
Released on J-STAGE: September 15, 2019

DOI: https://doi.org/10.5715/jnlp.26.381

JOURNAL FREE ACCESS

We focus on nonstandard usages of common words on social media, where words are sometimes used in a totally different manner from their original or standard usage. In this work, we attempt to distinguish nonstandard usages on social media from standard ones in an unsupervised manner. We also constructed a new Twitter dataset consisting of 40 words with nonstandard usages and used it for evaluation in an experiment. Our basic idea for this task is that nonstandard usage can be measured by the inconsistency between the target word’s expected meaning and the given context. For this purpose, we use context embeddings derived from word embeddings. Our experimental results show that the model leveraging the context embedding outperforms other methods, and they also provide findings on, for example, how to construct context embeddings and which corpus to use.
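
The idea of measuring inconsistency between a word's expected meaning and its context can be sketched roughly as follows. This is an illustrative assumption, not the model from the paper: the context embedding is taken to be the mean of the context word vectors, the score is cosine similarity, and the names, toy vectors, and the 0.5 threshold are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def context_embedding(context_words, embeddings):
    """Mean of the context word vectors (one assumed way to build it)."""
    vecs = [embeddings[w] for w in context_words if w in embeddings]
    if not vecs:
        raise ValueError("no known context words")
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]

def consistency(target, context_words, embeddings):
    """Higher values mean the context agrees with the word's usual meaning."""
    return cosine(embeddings[target], context_embedding(context_words, embeddings))

def is_nonstandard(target, context_words, embeddings, threshold=0.5):
    """Flag a usage as nonstandard when consistency falls below the threshold."""
    return consistency(target, context_words, embeddings) < threshold

# Toy vectors (assumed): "typical_*" resemble the target, "unusual_*" do not.
TOY_VECTORS = {
    "target": [1.0, 0.0],
    "typical_a": [0.9, 0.1], "typical_b": [0.8, 0.2],
    "unusual_a": [0.1, 0.9], "unusual_b": [0.0, 1.0],
}

print(is_nonstandard("target", ["typical_a", "typical_b"], TOY_VECTORS))
print(is_nonstandard("target", ["unusual_a", "unusual_b"], TOY_VECTORS))
```

In practice the threshold would have to be tuned, or the scores simply ranked, since the abstract describes an unsupervised setting.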

Classification of Phonological Changes Reflected in Text: Toward a Characterization of Written Utterances

Chiaki Miyazaki, Satoshi Sato

2019 Volume 26 Issue 2 Pages 407-440
Published: June 15, 2019
Released on J-STAGE: September 15, 2019

DOI: https://doi.org/10.5715/jnlp.26.407

JOURNAL FREE ACCESS

Phonological changes reflected in text can be powerful in characterizing utterances of dialogue agents or characters’ lines in narratives. To use phonological changes to automatically characterize utterances, (i) we collected phonologically changed expressions from characters’ written utterances and (ii) formalized the knowledge required to generate phonologically changed expressions. In particular, we categorized the expressions into 137 patterns by analyzing them in terms of the phenomena involved and the environments in which they occur. We experimentally confirmed that the patterns cover more than 80% of the phonologically changed expressions used in novels and comics. Furthermore, (iii) to investigate whether phonological change patterns can be effective in characterization, we conducted an experiment that estimated the speakers (characters) of the utterances and confirmed that the information on phonological changes improved the performance of speaker estimation for several characters.

Word-based Japanese Typed Dependency Parsing with Grammatical Function Analysis

Takaaki Tanaka, Masaaki Nagata

2019 Volume 26 Issue 2 Pages 441-481
Published: June 15, 2019
Released on J-STAGE: September 15, 2019

DOI: https://doi.org/10.5715/jnlp.26.441

JOURNAL FREE ACCESS

We present a novel scheme for word-based Japanese typed dependency parsing that integrates syntactic structure analysis and grammatical function analysis such as predicate-argument structure analysis. Compared to bunsetsu-based dependency parsing, which is predominantly used in Japanese NLP, it provides a natural way of extracting syntactic constituents. This makes it possible to jointly decide dependency and predicate-argument structure, which are usually handled in two separate steps. By using grammatical functions as dependency types, we can obtain detailed syntactic information from parsing results while keeping the converted bunsetsu-based dependency accuracy as high as that of CaboCha, one of the state-of-the-art dependency parsers.

Domain Adaptation in Japanese Predicate-Argument Structure Analysis considering First and Second Person Exophora

Mizuki Sango, Hitoshi Nishikawa, Takenobu Tokunaga

2019 Volume 26 Issue 2 Pages 483-508
Published: June 15, 2019
Released on J-STAGE: September 15, 2019

DOI: https://doi.org/10.5715/jnlp.26.483

JOURNAL FREE ACCESS

This paper proposes introducing domain adaptation into Japanese predicate-argument structure (PAS) analysis. Our investigation of a Japanese balanced corpus revealed that the distribution of argument types differs across text media. The difference is particularly significant when the argument is exophoric. Previous Japanese PAS analysis research has disregarded this tendency, as studies have targeted mono-media corpora. This investigation begins with a PAS analyzer based on a recurrent neural network as its baseline and extends it by introducing three kinds of domain-adaptation techniques and their combinations. Evaluation experiments using a Japanese balanced corpus (BCCWJ-PAS) confirmed the domain dependency of the PAS analysis. The domain adaptation is effective in improving the performance of the Japanese PAS analysis, especially in the nominative case. In the QA text analysis, the F1 score improved by up to 0.030 in comparison to the baseline.

Neural Japanese Zero Anaphora Resolution with Candidate Reduction Using Large-scale Case Frames

Souta Yamashiro, Hitoshi Nishikawa, Takenobu Tokunaga

2019 Volume 26 Issue 2 Pages 509-536
Published: June 15, 2019
Released on J-STAGE: September 15, 2019

DOI: https://doi.org/10.5715/jnlp.26.509

JOURNAL FREE ACCESS

This paper presents a model for Japanese zero anaphora resolution that deals with both intra- and inter-sentential zero anaphora. Our model resolves anaphora for multiple cases simultaneously by utilising and comparing information from other cases. This simultaneous resolution requires the consideration of many combinations of antecedent candidates, which could be a crucial obstacle in both the training and resolving phases. To cope with this problem, we propose an effective candidate pruning method using case frame information. On a Japanese balanced corpus, we compared a model that estimates multiple cases simultaneously using our proposed candidate pruning method with a model that estimates each case independently without a candidate reduction method. The results confirmed a 0.056-point increase in accuracy. Furthermore, we confirmed that introducing a local attention recurrent neural network increases the accuracy of inter-sentential anaphora resolution.
