Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
Volume 10, Issue 2
Displaying 1-8 of 8 articles from this issue
  • [in Japanese]
    2003 Volume 10 Issue 2 Pages 1-2
    Published: April 10, 2003
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    Download PDF (230K)
  • ERI HAYASHI, SUGURU YOSHIOKA, SATOSHI TOJO
    2003 Volume 10 Issue 2 Pages 3-17
    Published: April 10, 2003
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
The objective of this paper is to analyze the temporal structure of a sequence of sentences. As the target of this analysis, we take cooking recipes, which are typical examples of text that prescribes the temporal relations of events; however, those relations seem very difficult to understand without common knowledge. For this analysis, we utilize the aspectual information of each activity. We reclassify aspects, taking into account the events specific to cooking, and define subclasses of the perfective aspect. In addition, to enhance the adequacy of the analysis, we consider adverbial information, elliptical expressions, and concurrent operations. Finally, we design and implement a system that automatically generates a ‘time map’ for cooking recipes. (An illustrative sketch of the concurrency idea follows this entry.)
    Download PDF (3079K)
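A minimal sketch of the ‘time map’ idea, not the authors' system: recipe steps carry an aspect label and an optional concurrency flag (both invented stand-ins for the paper's aspectual analysis of Japanese recipe text), and sequential steps advance a clock while concurrent steps share the previous step's interval.

```python
from dataclasses import dataclass

@dataclass
class Step:
    text: str
    aspect: str               # illustrative label, e.g. "perfective", "durative"
    concurrent: bool = False  # True if the step overlaps the previous one

def time_map(steps):
    """Assign (start, end) slots: sequential steps advance the clock,
    concurrent steps reuse the previous step's interval."""
    intervals, clock = [], 0
    for s in steps:
        if s.concurrent and intervals:
            intervals.append(intervals[-1])   # overlap with the previous step
        else:
            intervals.append((clock, clock + 1))
            clock += 1
    return list(zip(steps, intervals))

recipe = [
    Step("Boil water", "durative"),
    Step("Meanwhile, chop the onions", "durative", concurrent=True),
    Step("Add pasta to the pot", "perfective"),
]
for step, (start, end) in time_map(recipe):
    print(f"[{start}-{end}] {step.text} ({step.aspect})")
```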
  • HIROKO INUI, MASAKI MURATA, KIYOTAKA UCHIMOTO, HITOSHI ISAHARA
    2003 Volume 10 Issue 2 Pages 19-42
    Published: April 10, 2003
    Released on J-STAGE: June 07, 2011
    JOURNAL FREE ACCESS
While the open-ended questionnaire method is a good means of collecting free expressions of opinion, the analysis of collected questionnaires is usually done manually and is thus costly. Furthermore, the results derived from such human judgments tend to lack objectivity. Given this background, we are exploring computational approaches to the automatic classification of collected open-ended questionnaires. This paper reports the results of our preliminary experiments, in which we used the maximum-entropy model for questionnaire classification. The results show that our method works well for extracting discriminative linguistic expressions for each response type, such as proposal, demand, approval, and opposition, and can produce questionnaire clusters analogous to those produced by humans. (A minimal sketch of a maximum-entropy classifier follows this entry.)
    Download PDF (2569K)
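A minimal sketch of the maximum-entropy approach on invented toy data, not the paper's corpus: scikit-learn's LogisticRegression is equivalent to a max-ent classifier over bag-of-words features, and the per-class weights suggest discriminative expressions for each response type.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Toy responses and labels (invented for illustration).
texts = [
    "please add more bus routes",        # proposal
    "we demand a refund immediately",    # demand
    "the new schedule works well",       # approval
    "we are against the fare increase",  # opposition
]
labels = ["proposal", "demand", "approval", "opposition"]

vec = CountVectorizer()
X = vec.fit_transform(texts)
clf = LogisticRegression(max_iter=1000).fit(X, labels)

# Discriminative expressions per response type: highest-weight features.
terms = np.array(vec.get_feature_names_out())
for cls, row in zip(clf.classes_, clf.coef_):
    print(cls, "->", terms[np.argsort(row)[-3:]])
```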
  • TAKEHIRO TAZOE, TSUTOMU SHIINO, FUMITO MASUI, ATSUO KAWAI
    2003 Volume 10 Issue 2 Pages 43-58
    Published: April 10, 2003
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
We have been studying the automatic recognition and extraction of metaphorical expressions in practical sentences. This paper introduces our metaphorical judgment model for “Noun B like Noun A” expressions, which fall into two usages: simile and literal. To automatically judge whether a phrase is a simile or literal, “Noun B like Noun A” expressions were classified into six patterns depending on the semantic information of the nouns, and the metaphorical judgment model was constructed from these patterns. When “Noun B like Noun A” expressions from newspaper articles were judged by the model and its judgments were compared with the correct ones, approximately 80% were correct. Thus, the model was found to be effective for recognizing metaphorical expressions in real-life text. (An illustrative sketch follows this entry.)
    Download PDF (1483K)
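An illustrative sketch of semantic-class-based judgment, not the paper's six-pattern model: the lexicon and the single rule below are invented, and stand in for the idea that the semantic categories of the two nouns decide between literal comparison and simile.

```python
# Hypothetical word -> coarse semantic class lexicon.
SEM_CLASS = {
    "rose": "plant", "tulip": "plant",
    "cheek": "body_part", "fox": "animal", "person": "human",
}

def judge(noun_a, noun_b):
    """Judge 'Noun B like Noun A' from the nouns' semantic classes."""
    a = SEM_CLASS.get(noun_a, "unknown")
    b = SEM_CLASS.get(noun_b, "unknown")
    # Same semantic class -> comparison within a category, read literally;
    # different classes -> a cross-domain mapping, read as a simile.
    return "literal" if a == b else "simile"

print(judge("tulip", "rose"))   # literal: "a rose like a tulip"
print(judge("rose", "cheek"))   # simile:  "a cheek like a rose"
```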
  • TETSUZOU UEHARA, MEGUMI KANAZAWA, YASUYUKI USHIO, TOMOKO YAKOO
    2003 Volume 10 Issue 2 Pages 59-78
    Published: April 10, 2003
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
A ‘sou-sakuin’ is a kind of concordance that gives an alphabetical list of all the words used in a book and shows every position where each word can be found. It is useful as a tool for research on Japanese classics. A corpus with part-of-speech tags, which gives a collection of sentences together with their part-of-speech data, is useful as a tool for natural language processing; however, no such corpus exists for Japanese classics. We therefore transform ‘sou-sakuin’ into corpora with part-of-speech tags. Each ‘sou-sakuin’ we used consists of two parts: a text part and an index part. The index part consists of records, each of which has a headword (the Kana string, Kanji string, and part-of-speech data of a word) and an inverted list giving the line numbers of the text part where the word is found. In the transformation program, we use inflection tables only for inflective words. We adopt a kind of longest-match method to resolve the case where two or more words occur in the same text line and one word is a substring of another. We also adopt a kind of look-ahead method for headwords whose Kanji string consists only of Kanji characters even though the corresponding text string contains both Kanji and Kana characters. As a result, we obtained corpora of Japanese classics containing about 150,000 words. (A minimal sketch of the longest-match idea follows this entry.)
    Download PDF (2236K)
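A minimal sketch of the longest-match idea on invented romanized examples: when one indexed headword is a substring of another in the same line, the scanner always prefers the longest headword matching at the current position.

```python
def longest_match_tokens(line, lexicon):
    """Scan left to right, always taking the longest headword that
    matches at the current position; skip one character on no match."""
    tokens, i = [], 0
    while i < len(line):
        best = ""
        for w in lexicon:
            if line.startswith(w, i) and len(w) > len(best):
                best = w
        if best:
            tokens.append(best)
            i += len(best)
        else:
            i += 1  # character not covered by any headword
    return tokens

lexicon = {"yama", "yamazakura", "sakura", "no"}
print(longest_match_tokens("yamazakurano", lexicon))
# ['yamazakura', 'no'] -- 'yamazakura' wins over its substring 'yama'
```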
  • HIROSHI MASUICHI, TOMOKO OHKUMA
    2003 Volume 10 Issue 2 Pages 79-109
    Published: April 10, 2003
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
This paper describes a Japanese computational grammar based on Lexical Functional Grammar, aimed at a practical parser with deep analysis of Japanese sentences. The grammar is characterized by (1) broad coverage, even for colloquial or ungrammatical sentences, (2) linguistic preciseness, outputting f-structures with rich information, and (3) consistency with grammars for other languages. It is difficult to develop a computational grammar systematically or procedurally, and our grammar writing also depends on our own experience. However, a gradual analysis method using Optimality Theory marks can prevent exceptional rules from causing unexpected analysis results, and a fragment analysis method makes it possible to deal with colloquial or ungrammatical sentences. These two methods give a clear perspective on writing the grammar. We conducted experiments to evaluate our parser on grammatical text (manual text) and on colloquial and ungrammatical text (“Voice of Customer” text). The coverage was over 95%, and the predicate dependency accuracy was 84%. (An illustrative sketch of the fragment-analysis idea follows this entry.)
    Download PDF (2998K)
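An illustrative sketch of fragment analysis, not the authors' LFG parser: a sentence that fails to parse as a whole is covered by maximal parsable fragments. `can_parse` below is a placeholder accepting a tiny whitelist; a real implementation would call the grammar.

```python
# Hypothetical stand-in for a real parser's acceptance test.
GRAMMATICAL = {"the cat sleeps", "the dog barks", "it rains"}

def can_parse(words):
    return " ".join(words) in GRAMMATICAL

def fragment_parse(words):
    """Greedy cover: take the longest parsable span starting at the
    current position, else emit one word as unanalyzed, and continue."""
    out, i = [], 0
    while i < len(words):
        for j in range(len(words), i, -1):   # longest span first
            if can_parse(words[i:j]):
                out.append(("FRAGMENT", words[i:j]))
                i = j
                break
        else:
            out.append(("UNKNOWN", words[i:i + 1]))
            i += 1
    return out

sent = "um the cat sleeps and the dog barks".split()
print(fragment_parse(sent))
```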
  • SATORU IKEHARA, JIN'ICHI MURAKAMI, YASUHIRO KIMOTO
    2003 Volume 10 Issue 2 Pages 111-128
    Published: April 10, 2003
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
To reduce the dimension of the VSM (Vector Space Model) for information retrieval and clustering, this paper proposes a new method, Semantic-VSM, which uses the Semantic Attribute System defined by “A-Japanese-Lexicon” instead of the literal words used in the conventional VSM. The attribute system is a tree structure of 2,710 attributes covering 400,000 literal words. Using this attribute system, vector elements can easily be generalized along the upper-lower relationships of semantic attributes, so the dimension can be reduced at very low cost. Synonyms are automatically grouped together through their semantic attributes, which improves the recall performance of retrieval systems. Experimental results on the BMIR-J2 database of 5,079 newspaper articles showed that the dimension can be reduced from 2,710 to 300 or 600 with only a small degradation in performance. High recall was also shown compared with the conventional VSM. (A minimal sketch of the attribute-generalization idea follows this entry.)
    Download PDF (1764K)
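A minimal sketch of attribute generalization with an invented toy tree, not “A-Japanese-Lexicon”: word dimensions are replaced by semantic-attribute dimensions, and truncating each attribute path to a fixed depth collapses synonyms and hyponyms into shared, coarser dimensions.

```python
from collections import Counter

# Hypothetical word -> attribute-path mapping (root ... leaf).
ATTR_PATH = {
    "sparrow": ("concrete", "animal", "bird"),
    "eagle":   ("concrete", "animal", "bird"),
    "dog":     ("concrete", "animal", "mammal"),
    "idea":    ("abstract", "thought"),
}

def semantic_vector(words, depth=2):
    """Count occurrences of each attribute truncated to `depth` levels;
    related words collapse into the same dimension."""
    return Counter(ATTR_PATH[w][:depth] for w in words if w in ATTR_PATH)

doc = ["sparrow", "eagle", "dog", "idea"]
print(semantic_vector(doc, depth=2))
# birds and mammals both fall under ('concrete', 'animal')
```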
  • HIROYUKI SHINNOU, MINORU SASAKI
    2003 Volume 10 Issue 2 Pages 129-149
    Published: April 10, 2003
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
In this paper, we introduce SVDPACKC, a free software package for computing the singular value decomposition (SVD) of large sparse matrices. We first explain how to use it and then apply it to word sense disambiguation problems. In the information retrieval domain, Latent Semantic Indexing (LSI) has been actively researched. LSI maps high-dimensional term vectors to low-dimensional concept vectors to overcome the synonymy and polysemy problems of information retrieval with the vector space model. To build the low-dimensional concept vectors, LSI computes the SVD of a term-document matrix, and SVDPACKC is a software tool for computing the SVD of such large sparse matrices. Because LSI compresses a high-dimensional feature vector into low-dimensional concept vectors, it has many applications besides information retrieval. In this paper, we attack the word sense disambiguation problems of 50 verbs in the Japanese dictionary task of SENSEVAL2. Using cross validation and LSI, we improved the simple nearest neighbor (NN) method, and we showed that the NN-based methods achieve better precision than the decision list and naive Bayes methods for some words. (A minimal sketch of truncated SVD and NN lookup follows this entry.)
    Download PDF (1942K)
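A minimal sketch on a random toy matrix: truncated SVD of a sparse term-document matrix via SciPy (a modern stand-in for SVDPACKC, which the paper uses), followed by nearest-neighbor lookup in the reduced concept space.

```python
import numpy as np
from scipy.sparse import random as sparse_random
from scipy.sparse.linalg import svds

A = sparse_random(1000, 200, density=0.01, random_state=0)  # terms x docs

k = 50                       # target number of concept dimensions
U, s, Vt = svds(A, k=k)      # truncated SVD: A ~ U @ diag(s) @ Vt
docs = (np.diag(s) @ Vt).T   # each row: one document in concept space

def nearest(i):
    """Index of the document most similar (cosine) to document i."""
    v = docs[i] / np.linalg.norm(docs[i])
    sims = docs @ v / np.linalg.norm(docs, axis=1)
    sims[i] = -np.inf        # exclude the query document itself
    return int(np.argmax(sims))

print("nearest doc to 0:", nearest(0))
```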