Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
Volume 16, Issue 1
Preface
Paper
  • Ryoji Hamabe, Kiyotaka Uchimoto, Tatsuya Kawahara, Hitoshi Isahara
    2009 Volume 16 Issue 1 Pages 1_3-1_23
    Published: 2009
    Released on J-STAGE: September 14, 2011
    Japanese dependency structure is usually represented by relationships between phrasal units called bunsetsus. One of the biggest problems with dependency structure analysis in spontaneous speech is that clause boundaries are ambiguous. This paper describes a method for detecting the boundaries of quotations and inserted clauses, and a method for improving dependency accuracy by applying the detected boundaries to dependency structure analysis. The quotations and inserted clauses are identified with an SVM-based text chunking method that considers information on morphemes, pauses, and so on. Information from automatically analyzed dependency structure is also used to detect the beginnings of clauses. Our evaluation experiment using the Corpus of Spontaneous Japanese (CSJ) showed that the automatically estimated boundaries of quotations and inserted clauses improved the accuracy of dependency structure analysis from 77.7% to 78.7%.
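    A minimal sketch, in Python with scikit-learn, of the SVM-based chunking step the abstract describes; the morpheme features, IOB-style labels, and toy data are illustrative assumptions, not the authors' actual setup.

      # Hypothetical SVM-based IOB chunking over morphemes for detecting
      # quotation/inserted-clause boundaries; features and labels are made up.
      from sklearn.feature_extraction import DictVectorizer
      from sklearn.pipeline import make_pipeline
      from sklearn.svm import LinearSVC

      # One feature dict per morpheme; "pause" marks a preceding pause in speech.
      X = [
          {"surface": "to", "pos": "particle", "pause": True},
          {"surface": "omou", "pos": "verb", "pause": False},
          {"surface": "kedo", "pos": "particle", "pause": False},
          {"surface": "ne", "pos": "particle", "pause": True},
      ]
      y = ["B-QUOTE", "I-QUOTE", "O", "O"]  # IOB tags for quoted-clause spans

      chunker = make_pipeline(DictVectorizer(), LinearSVC())
      chunker.fit(X, y)
      print(chunker.predict(X))  # predicted tags for the toy morpheme sequence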
  • Yoichi Tomiura, Sayaka Aoki, Masahiro Shibata, Kensei Yukino
    2009 Volume 16 Issue 1 Pages 1_25-1_46
    Published: 2009
    Released on J-STAGE: September 14, 2011
    This paper proposes a method for discerning the nativeness of English documents with high precision, based on Bayes decision theory and statistical hypothesis testing. Regarding a document as a sequence of parts of speech, the proposed method compares the probability of the document under a statistical language model of native English with its probability under a model of non-native English. The language model used here is an n-gram model. An n-gram model with a large n can be expected to capture the differences between native and non-native English well, and thus has the potential to discern nativeness with high precision. However, with a large n the zero-frequency and sparseness problems become acute, and the maximum likelihood estimates of the n-gram probabilities cannot be relied on. The proposed method therefore estimates the ratio of the document's probability under the native English model to that under the non-native English model using a statistical hypothesis test. Experimental results show that the proposed method discerns nativeness with a precision of 92.5%, significantly higher than that of traditional methods.
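    A rough Python sketch of the Bayes-decision step: train two smoothed POS-bigram models and label a document by the sign of its log-likelihood ratio. The toy corpora and add-one smoothing stand in for the paper's actual estimation and hypothesis-testing machinery.

      # Classify a POS-tag sequence as native or non-native by comparing its
      # probability under two bigram models (add-one smoothing; toy data).
      from collections import Counter
      from math import log

      def train(corpus):
          bigrams, contexts = Counter(), Counter()
          for sent in corpus:
              bigrams.update(zip(sent, sent[1:]))
              contexts.update(sent[:-1])
          return bigrams, contexts

      def logprob(tags, bigrams, contexts, vocab):
          # sum of log P(cur | prev) with add-one smoothing
          return sum(log((bigrams[(p, c)] + 1) / (contexts[p] + vocab))
                     for p, c in zip(tags, tags[1:]))

      native = [["DT", "NN", "VBZ", "JJ"], ["PRP", "VBD", "DT", "NN"]]
      nonnative = [["DT", "JJ", "VBZ", "NN"], ["NN", "DT", "VBD", "PRP"]]
      vocab = len({t for s in native + nonnative for t in s})

      doc = ["DT", "NN", "VBZ", "JJ"]
      ratio = (logprob(doc, *train(native), vocab)
               - logprob(doc, *train(nonnative), vocab))
      print("native" if ratio > 0 else "non-native")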
  • Vinh Van Nguyen, Minh Le Nguyen, Akira Shimazu
    2009 Volume 16 Issue 1 Pages 1_47-1_65
    Published: 2009
    Released on J-STAGE: September 14, 2011
    In this paper, we present a Conditional Random Fields (CRFs) framework for the clause splitting problem. We adapt the CRF model to this problem in order to use very large sets of arbitrary, overlapping, and non-independent features. We also extend the N-best list approach by using Joint-CRFs (Shi and Wang 2007). In addition, we propose the use of rich linguistic information along with a new bottom-up dynamic programming algorithm for decoding, which splits a sentence into clauses. Experiments show that our results are competitive with state-of-the-art results.
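    As an illustration of clause splitting cast as CRF sequence labeling, here is a minimal Python sketch using the third-party sklearn-crfsuite library as a stand-in for the authors' framework; the features and the toy S/E labeling scheme are invented for the example, and the Joint-CRFs extension and dynamic-programming decoder are not shown.

      # Linear-chain CRF labeling clause boundaries; data and features are toy.
      import sklearn_crfsuite

      def features(sent, i):
          return {
              "word": sent[i].lower(),
              "is_comma": sent[i] == ",",
              "prev": sent[i - 1].lower() if i > 0 else "BOS",
              "next": sent[i + 1].lower() if i + 1 < len(sent) else "EOS",
          }

      sent = ["He", "said", "that", "she", "left", "."]
      X = [[features(sent, i) for i in range(len(sent))]]
      y = [["O", "O", "S", "O", "O", "E"]]  # S/E mark a clause start/end

      crf = sklearn_crfsuite.CRF(algorithm="lbfgs", max_iterations=50)
      crf.fit(X, y)
      print(crf.predict(X))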
  • Madoka Ishioroshi, Tatsunori Mori
    2009 Volume 16 Issue 1 Pages 1_67-1_100
    Published: 2009
    Released on J-STAGE: September 14, 2011
    In this paper, we propose a method for list-type question answering, the task in which a system is asked to enumerate all correct answers to a given question. The method exploits the distribution of the scores that an existing question-answering system assigns to answer candidates. Answer candidates are separated into clusters according to their scores, under the assumption that each cluster is generated by a probabilistic model. The parameters of these probabilistic models are estimated with the EM algorithm, and the method then judges whether each distribution is a source of correct answers or of incorrect answers. Answer candidates originating from the distributions corresponding to correct answers are returned as the final answers. Moreover, by comparing the model parameters, the method can also judge whether or not the question-answering system found correct answers at all. Experimental results show that using the score distribution is effective for list-type question answering.
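    A minimal Python sketch of the score-distribution idea: fit a two-component mixture to candidate scores with EM (here scikit-learn's GaussianMixture) and keep the candidates assigned to the higher-mean component. The scores, candidate strings, and the Gaussian assumption are illustrative only.

      # Separate answer candidates into "correct" and "incorrect" score clusters.
      import numpy as np
      from sklearn.mixture import GaussianMixture

      scores = np.array([0.91, 0.87, 0.85, 0.32, 0.28, 0.25]).reshape(-1, 1)
      candidates = ["Tokyo", "Osaka", "Kyoto", "noise1", "noise2", "noise3"]

      gmm = GaussianMixture(n_components=2, random_state=0).fit(scores)
      correct = int(np.argmax(gmm.means_))  # component with the higher mean
      answers = [c for c, k in zip(candidates, gmm.predict(scores))
                 if k == correct]
      print(answers)  # candidates judged to come from the "correct" model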
  • Satoshi Suzuki
    2009 Volume 16 Issue 1 Pages 1_101-1_116
    Published: 2009
    Released on J-STAGE: September 14, 2011
    This paper proposes a method for extracting hypernym information from dictionaries and presents results of automatically constructing a word ontology from the extracted information. The method recursively expands word definitions to obtain much larger word sets, which serve as hypernym candidates for the headwords. At the same time, the method assigns each candidate a likelihood of being a hypernym, which is useful for selecting hypernyms from among the candidates. Computational experiments showed that the proposed method gives better results than an existing method that parses the explanatory notes and regards the HEAD as the hypernym. Additionally, we tried to build a word ontology from the resulting hypernyms. This part of the method is still under construction, but the results demonstrated the usefulness of the extracted hypernyms and the possibility of entirely automatic construction of a word ontology.
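    A toy Python sketch of the recursive-expansion idea: a headword's definition is expanded repeatedly through a small made-up dictionary, and normalized occurrence counts serve as a crude likelihood that a word is a hypernym of the headword (the dictionary and the scoring are assumptions, not the paper's exact method).

      # Recursively expand definitions and score hypernym candidates by count.
      from collections import Counter

      dictionary = {
          "dog": ["domestic", "animal"],
          "animal": ["living", "organism"],
          "organism": ["living", "thing"],
      }

      def hypernym_candidates(headword, depth=3):
          counts, frontier = Counter(), [headword]
          for _ in range(depth):
              expanded = []
              for word in frontier:
                  for definer in dictionary.get(word, []):
                      counts[definer] += 1
                      expanded.append(definer)
              frontier = expanded
          total = sum(counts.values()) or 1
          # normalized counts as a rough likelihood of being a hypernym
          return {w: c / total for w, c in counts.items()}

      print(hypernym_candidates("dog"))  # e.g. scores for "animal", "living", ...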