Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
Volume 10, Issue 2
Displaying 1-8 of 8 articles from this issue
  • [in Japanese]
    2003 Volume 10 Issue 2 Pages 1-2
    Published: April 10, 2003
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    Download PDF (230K)
  • ERI HAYASHI, SUGURU YOSHIOKA, SATOSHI TOJO
    2003 Volume 10 Issue 2 Pages 3-17
    Published: April 10, 2003
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
The objective of this paper is to analyze the temporal structure of a sequence of sentences. As the target of this analysis, we take cooking recipes, which are typical examples of text that prescribes the temporal relations of events; however, those relations seem very difficult to understand without common knowledge. For this analysis, we utilize the aspectual information of each activity. We reclassify aspects, taking into account the events specific to cooking, and define subclasses of the perfective aspect. In addition, to enhance the adequacy of the analysis, we consider adverbial information, elliptical expressions, and concurrent operations. Finally, we design and implement a system that automatically generates a ‘time map’ for cooking recipes. (An illustrative sketch of the concurrency idea follows this entry.)
    Download PDF (3079K)
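A minimal sketch of the ‘time map’ idea, not the authors' system: recipe steps carry an aspect label and an optional concurrency flag (both invented stand-ins for the paper's aspectual analysis of Japanese recipe text), and sequential steps advance a clock while concurrent steps share the previous step's interval.

```python
from dataclasses import dataclass

@dataclass
class Step:
    text: str
    aspect: str               # illustrative label, e.g. "perfective", "durative"
    concurrent: bool = False  # True if the step overlaps the previous one

def time_map(steps):
    """Assign (start, end) slots: sequential steps advance the clock,
    concurrent steps reuse the previous step's interval."""
    intervals, clock = [], 0
    for s in steps:
        if s.concurrent and intervals:
            intervals.append(intervals[-1])   # overlap with the previous step
        else:
            intervals.append((clock, clock + 1))
            clock += 1
    return list(zip(steps, intervals))

recipe = [
    Step("Boil water", "durative"),
    Step("Meanwhile, chop the onions", "durative", concurrent=True),
    Step("Add pasta to the pot", "perfective"),
]
for step, (start, end) in time_map(recipe):
    print(f"[{start}-{end}] {step.text} ({step.aspect})")
```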
  • HIROKO INUI, MASAKI MURATA, KIYOTAKA UCHIMOTO, HITOSHI ISAHARA
    2003 Volume 10 Issue 2 Pages 19-42
    Published: April 10, 2003
    Released on J-STAGE: June 07, 2011
    JOURNAL FREE ACCESS
While the open-ended questionnaire method is a good means of collecting free expressions of opinion, the analysis of collected questionnaires is usually done manually and is thus costly. Furthermore, the results derived from such human judgments tend to lack objectivity. Given this background, we are exploring computational approaches to the automatic classification of collected open-ended questionnaires. This paper reports the results of our preliminary experiments, in which we used the maximum-entropy model for questionnaire classification. The results show that our method works well for extracting discriminative linguistic expressions for each response type, such as proposal, demand, approval, and opposition, and can produce questionnaire clusters analogous to those produced by humans. (A minimal sketch of a maximum-entropy classifier follows this entry.)
    Download PDF (2569K)
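A minimal sketch of the maximum-entropy approach on invented toy data, not the paper's corpus: scikit-learn's LogisticRegression is equivalent to a max-ent classifier over bag-of-words features, and the per-class weights suggest discriminative expressions for each response type.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Toy responses and labels (invented for illustration).
texts = [
    "please add more bus routes",        # proposal
    "we demand a refund immediately",    # demand
    "the new schedule works well",       # approval
    "we are against the fare increase",  # opposition
]
labels = ["proposal", "demand", "approval", "opposition"]

vec = CountVectorizer()
X = vec.fit_transform(texts)
clf = LogisticRegression(max_iter=1000).fit(X, labels)

# Discriminative expressions per response type: highest-weight features.
terms = np.array(vec.get_feature_names_out())
for cls, row in zip(clf.classes_, clf.coef_):
    print(cls, "->", terms[np.argsort(row)[-3:]])
```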
  • TAKEHIRO TAZOE, TSUTOMU SHIINO, FUMITO MASUI, ATSUO KAWAI
    2003 Volume 10 Issue 2 Pages 43-58
    Published: April 10, 2003
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
We have been studying the automatic recognition and extraction of metaphorical expressions in practical sentences. This paper introduces our metaphorical judgment model for “Noun B like Noun A” expressions, which fall into two usages: simile and literal. To automatically judge whether a phrase is a simile or literal, “Noun B like Noun A” expressions were classified into six patterns depending on the semantic information of the nouns, and the metaphorical judgment model was constructed from these patterns. When “Noun B like Noun A” expressions from newspaper articles were judged by the model and its judgments were compared with the correct ones, approximately 80% were correct. Thus, the model was found to be effective for recognizing metaphorical expressions in real-life text. (An illustrative sketch follows this entry.)
    Download PDF (1483K)
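An illustrative sketch of semantic-class-based judgment, not the paper's six-pattern model: the lexicon and the single rule below are invented, and stand in for the idea that the semantic categories of the two nouns decide between literal comparison and simile.

```python
# Hypothetical word -> coarse semantic class lexicon.
SEM_CLASS = {
    "rose": "plant", "tulip": "plant",
    "cheek": "body_part", "fox": "animal", "person": "human",
}

def judge(noun_a, noun_b):
    """Judge 'Noun B like Noun A' from the nouns' semantic classes."""
    a = SEM_CLASS.get(noun_a, "unknown")
    b = SEM_CLASS.get(noun_b, "unknown")
    # Same semantic class -> comparison within a category, read literally;
    # different classes -> a cross-domain mapping, read as a simile.
    return "literal" if a == b else "simile"

print(judge("tulip", "rose"))   # literal: "a rose like a tulip"
print(judge("rose", "cheek"))   # simile:  "a cheek like a rose"
```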
  • TETSUZOU UEHARA, MEGUMI KANAZAWA, YASUYUKI USHIO, TOMOKO YAKOO
    2003 Volume 10 Issue 2 Pages 59-78
    Published: April 10, 2003
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
A ‘sou-sakuin’ is a kind of concordance that gives an alphabetical list of all the words used in a book and shows every position where each word can be found. It is useful as a tool for research on Japanese classics. A corpus with part-of-speech tags, which gives a collection of sentences together with their part-of-speech data, is useful as a tool for natural language processing; however, no such corpus exists for Japanese classics. We therefore transform ‘sou-sakuin’ into corpora with part-of-speech tags. Each ‘sou-sakuin’ we used consists of two parts: a text part and an index part. The index part consists of records, each of which has a headword (the Kana string, Kanji string, and part-of-speech data of a word) and an inverted list giving the line numbers of the text part where the word is found. In the transformation program, we use inflection tables only for inflective words. We adopt a kind of longest-match method to resolve the case where two or more words occur in the same text line and one word is a substring of another. We also adopt a kind of look-ahead method for headwords whose Kanji string consists only of Kanji characters even though the corresponding text string contains both Kanji and Kana characters. As a result, we obtained corpora of Japanese classics containing about 150,000 words. (A minimal sketch of the longest-match idea follows this entry.)
    Download PDF (2236K)
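A minimal sketch of the longest-match idea on invented romanized examples: when one indexed headword is a substring of another in the same line, the scanner always prefers the longest headword matching at the current position.

```python
def longest_match_tokens(line, lexicon):
    """Scan left to right, always taking the longest headword that
    matches at the current position; skip one character on no match."""
    tokens, i = [], 0
    while i < len(line):
        best = ""
        for w in lexicon:
            if line.startswith(w, i) and len(w) > len(best):
                best = w
        if best:
            tokens.append(best)
            i += len(best)
        else:
            i += 1  # character not covered by any headword
    return tokens

lexicon = {"yama", "yamazakura", "sakura", "no"}
print(longest_match_tokens("yamazakurano", lexicon))
# ['yamazakura', 'no'] -- 'yamazakura' wins over its substring 'yama'
```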
  • HIROSHI MASUICHI, TOMOKO OHKUMA
    2003 Volume 10 Issue 2 Pages 79-109
    Published: April 10, 2003
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
This paper describes a Japanese computational grammar based on Lexical Functional Grammar, aimed at a practical parser with deep analysis of Japanese sentences. The grammar is characterized by (1) broad coverage, even for colloquial or ungrammatical sentences, (2) linguistic preciseness, outputting f-structures with rich information, and (3) consistency with grammars for other languages. It is difficult to develop a computational grammar systematically or procedurally, and our grammar writing also depends on our own experience. However, a gradual analysis method using Optimality Theory marks can prevent exceptional rules from causing unexpected analysis results, and a fragment analysis method makes it possible to deal with colloquial or ungrammatical sentences. These two methods give a clear perspective on writing the grammar. We conducted experiments to evaluate our parser on grammatical text (manual text) and on colloquial and ungrammatical text (“Voice of Customer” text). The coverage was over 95%, and the predicate dependency accuracy was 84%. (An illustrative sketch of the fragment-analysis idea follows this entry.)
    Download PDF (2998K)
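An illustrative sketch of fragment analysis, not the authors' LFG parser: a sentence that fails to parse as a whole is covered by maximal parsable fragments. `can_parse` below is a placeholder accepting a tiny whitelist; a real implementation would call the grammar.

```python
# Hypothetical stand-in for a real parser's acceptance test.
GRAMMATICAL = {"the cat sleeps", "the dog barks", "it rains"}

def can_parse(words):
    return " ".join(words) in GRAMMATICAL

def fragment_parse(words):
    """Greedy cover: take the longest parsable span starting at the
    current position, else emit one word as unanalyzed, and continue."""
    out, i = [], 0
    while i < len(words):
        for j in range(len(words), i, -1):   # longest span first
            if can_parse(words[i:j]):
                out.append(("FRAGMENT", words[i:j]))
                i = j
                break
        else:
            out.append(("UNKNOWN", words[i:i + 1]))
            i += 1
    return out

sent = "um the cat sleeps and the dog barks".split()
print(fragment_parse(sent))
```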
  • SATORU IKEHARA, JIN'ICHI MURAKAMI, YASUHIRO KIMOTO
    2003 Volume 10 Issue 2 Pages 111-128
    Published: April 10, 2003
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
To reduce the dimension of the VSM (Vector Space Model) for information retrieval and clustering, this paper proposes a new method, Semantic-VSM, which uses the Semantic Attribute System defined by “A-Japanese-Lexicon” instead of the literal words used in the conventional VSM. The attribute system is a tree structure of 2,710 attributes covering 400,000 literal words. Using this attribute system, vector elements can easily be generalized along the upper-lower relationships of semantic attributes, so the dimension can be reduced at very low cost. Synonyms are automatically grouped together through their semantic attributes, which improves the recall performance of retrieval systems. Experimental results on the BMIR-J2 database of 5,079 newspaper articles showed that the dimension can be reduced from 2,710 to 300 or 600 with only a small degradation in performance. High recall was also shown compared with the conventional VSM. (A minimal sketch of the attribute-generalization idea follows this entry.)
    Download PDF (1764K)
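A minimal sketch of attribute generalization with an invented toy tree, not “A-Japanese-Lexicon”: word dimensions are replaced by semantic-attribute dimensions, and truncating each attribute path to a fixed depth collapses synonyms and hyponyms into shared, coarser dimensions.

```python
from collections import Counter

# Hypothetical word -> attribute-path mapping (root ... leaf).
ATTR_PATH = {
    "sparrow": ("concrete", "animal", "bird"),
    "eagle":   ("concrete", "animal", "bird"),
    "dog":     ("concrete", "animal", "mammal"),
    "idea":    ("abstract", "thought"),
}

def semantic_vector(words, depth=2):
    """Count occurrences of each attribute truncated to `depth` levels;
    related words collapse into the same dimension."""
    return Counter(ATTR_PATH[w][:depth] for w in words if w in ATTR_PATH)

doc = ["sparrow", "eagle", "dog", "idea"]
print(semantic_vector(doc, depth=2))
# birds and mammals both fall under ('concrete', 'animal')
```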
  • HIROYUKI SHINNOU, MINORU SASAKI
    2003 Volume 10 Issue 2 Pages 129-149
    Published: April 10, 2003
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
In this paper, we introduce SVDPACKC, a free software package for computing the singular value decomposition (SVD) of large sparse matrices. We first explain how to use it and then apply it to word sense disambiguation problems. In the information retrieval domain, Latent Semantic Indexing (LSI) has been actively researched. LSI maps high-dimensional term vectors to low-dimensional concept vectors to overcome the synonymy and polysemy problems of information retrieval with the vector space model. To build the low-dimensional concept vectors, LSI computes the SVD of a term-document matrix, and SVDPACKC is a software tool for computing the SVD of such large sparse matrices. Because LSI compresses a high-dimensional feature vector into low-dimensional concept vectors, it has many applications besides information retrieval. In this paper, we attack the word sense disambiguation problems of 50 verbs in the Japanese dictionary task of SENSEVAL2. Using cross validation and LSI, we improved the simple nearest neighbor (NN) method, and we showed that the NN-based methods achieve better precision than the decision list and naive Bayes methods for some words. (A minimal sketch of truncated SVD and NN lookup follows this entry.)
    Download PDF (1942K)
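A minimal sketch on a random toy matrix: truncated SVD of a sparse term-document matrix via SciPy (a modern stand-in for SVDPACKC, which the paper uses), followed by nearest-neighbor lookup in the reduced concept space.

```python
import numpy as np
from scipy.sparse import random as sparse_random
from scipy.sparse.linalg import svds

A = sparse_random(1000, 200, density=0.01, random_state=0)  # terms x docs

k = 50                       # target number of concept dimensions
U, s, Vt = svds(A, k=k)      # truncated SVD: A ~ U @ diag(s) @ Vt
docs = (np.diag(s) @ Vt).T   # each row: one document in concept space

def nearest(i):
    """Index of the document most similar (cosine) to document i."""
    v = docs[i] / np.linalg.norm(docs[i])
    sims = docs @ v / np.linalg.norm(docs, axis=1)
    sims[i] = -np.inf        # exclude the query document itself
    return int(np.argmax(sims))

print("nearest doc to 0:", nearest(0))
```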