Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
Volume 9, Issue 2
Displaying 1-6 of 6 articles from this issue
  • [in Japanese]
    2002 Volume 9 Issue 2 Pages 1-2
    Published: April 10, 2002
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    Download PDF (257K)
  • MINORU HARADA, RYO SUZUKI, AKIMIZU MINAMI
    2002 Volume 9 Issue 2 Pages 3-22
    Published: April 10, 2002
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    We propose a technique for information retrieval based on semantic analysis and develop a judicial case retrieval system called JCare. The system accepts a query written in Japanese sentences and retrieves judicial cases containing sentences that describe a situation similar to the one specified by the query. It first transforms both a judicial case and the query into semantic graphs whose nodes represent word meanings and whose arcs represent the relations (deep cases) between the words. Next, it calculates the similarity between the case and the query by searching for the maximum common parts that are topologically equivalent (a minimal matching sketch follows this entry). The graph matching is sped up by separating each semantic graph into sub-graphs based on the “View” point of a judicial case.
    Download PDF (6411K)
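    A small, hypothetical sketch of the matching step described above, not the authors' implementation: it assumes each semantic graph is stored as a set of (head concept, deep-case relation, dependent concept) triples and scores similarity by the overlap of shared triples, which only approximates the topological matching described in the paper.

    # Hypothetical sketch of semantic-graph similarity via shared triples.
    # Each semantic graph is a set of (head_concept, deep_case, dependent_concept) triples.

    def graph_similarity(case_graph, query_graph):
        """Score a judicial-case graph against a query graph by the number of
        shared (head, relation, dependent) triples, normalized by query size."""
        if not query_graph:
            return 0.0
        common = case_graph & query_graph
        return len(common) / len(query_graph)

    # Toy example: "the defendant damaged the car"
    case = {("damage", "AGENT", "defendant"), ("damage", "OBJECT", "car")}
    query = {("damage", "OBJECT", "car")}
    print(graph_similarity(case, query))  # 1.0: the query triple is contained in the case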
  • KOU MUKAINAKA
    2002 Volume 9 Issue 2 Pages 23-43
    Published: April 10, 2002
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    The model proposed in this paper analyzes the meanings of coherence relations in Japanese complex sentences using features of verbs and subjects, and then analyzes the structures of coherence relations using those meanings. Dependency structures of Japanese complex sentences are usually analyzed with a hierarchical classification of conjunctions and conjunctive particles. However, conjunctions and conjunctive particles usually have multiple senses and are therefore ambiguous: if a conjunction or conjunctive particle in a subordinate clause takes a different sense, the subordinate clause may modify a predicate in a different clause. The model therefore analyzes the coherence relations between subordinate clauses and a main clause using the features of verbs and subjects, and defines the meanings of the coherence relations. These meanings are then classified according to the distance of coherence, and the model uses this classification to analyze the structures of coherence relations. Volition, aspect, mood, voice, semantic category, etc. are used as the features of verbs, and animacy (animate or inanimate) is used as the feature of subjects (a small feature-based sketch follows this entry). The model is evaluated on examples from actual documents and shows 98.4% accuracy. Since a model using only the classification of conjunctions and conjunctive particles shows 97.0% accuracy on the same examples, the model proposed in this paper decreases the error rate by about half.
    Download PDF (1668K)
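    As a hedged illustration of the feature-based analysis described above, the sketch below classifies a coherence relation from simple verb and subject features. The feature names, rules, and relation labels are invented for illustration and are not the authors' actual classification.

    # Hypothetical sketch: choose a coherence-relation label from verb/subject features.
    # Feature names, rules, and labels are illustrative only.

    def classify_relation(sub_clause, main_clause):
        """Return a coherence-relation label for a subordinate/main clause pair
        based on verb features (volition, aspect, ...) and subject animacy."""
        if sub_clause["volitional"] and main_clause["volitional"] \
                and sub_clause["subject_animate"]:
            return "PURPOSE"          # volitional agentive pair read as purpose
        if sub_clause["aspect"] == "perfective":
            return "CAUSE"            # completed event often read causally
        return "TEMPORAL"             # fallback relation

    sub = {"volitional": True, "aspect": "imperfective", "subject_animate": True}
    main = {"volitional": True, "aspect": "imperfective", "subject_animate": True}
    print(classify_relation(sub, main))  # PURPOSE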
  • EIKO YAMAMOTO, KYOJI UMEMURA
    2002 Volume 9 Issue 2 Pages 45-75
    Published: April 10, 2002
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    In this paper, we consider the estimation of one-to-many relationships between entities in a corpus. Much work has been done on estimating relationships between entities from corpora. The most common method is based on the co-occurrence of entities in a document, and it implicitly assumes that the relationship is a one-to-one mapping. The real relationship may, however, be one-to-many, and this property needs consideration. We propose using CSM (Complementary Similarity Measure) to detect such relationships. This measure was originally developed for character recognition systems, where it is known to work well for patterns overlapped with a template pattern, but it is rarely used for text processing. We compared CSM with other similarity measures, including three kinds of mutual information, the φ coefficient, cosine, the Dice coefficient, and confidence (a hedged formulation of CSM is sketched after this entry). We chose the names of prefectures and cities as the entities, which have a real one-to-many relationship. For the evaluation, we used three kinds of corpora: the first was synthesized from real relations; the second was also synthesized from relations but contained an element of false relations; the third was compiled from an actual newspaper corpus. We found that CSM was the best similarity measure in this experiment and works well for one-to-many relationships.
    Download PDF (4624K)
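    A commonly cited form of the Complementary Similarity Measure, computed from the 2x2 contingency counts of two binary feature vectors, is sketched below. Treating this as the exact formula used in the paper is an assumption; the counts a, b, c, d are the usual contingency-table quantities and the toy vectors are invented.

    import math

    # Hypothetical sketch of the Complementary Similarity Measure (CSM) for two
    # binary vectors F and T of equal length, with contingency counts
    #   a = #(F=1, T=1), b = #(F=1, T=0), c = #(F=0, T=1), d = #(F=0, T=0),
    # using the commonly cited form CSM = (a*d - b*c) / sqrt((a + c) * (b + d)).

    def csm(F, T):
        a = sum(1 for f, t in zip(F, T) if f and t)
        b = sum(1 for f, t in zip(F, T) if f and not t)
        c = sum(1 for f, t in zip(F, T) if not f and t)
        d = sum(1 for f, t in zip(F, T) if not f and not t)
        denom = math.sqrt((a + c) * (b + d))
        return 0.0 if denom == 0 else (a * d - b * c) / denom

    # Toy example: documents in which a prefecture name and a city name occur.
    prefecture = [1, 1, 1, 0, 0, 1]
    city       = [1, 0, 1, 0, 0, 0]
    print(csm(prefecture, city))

    Unlike cosine or the Dice coefficient, this form is not symmetric in F and T, which plausibly helps it capture a directed, one-to-many relationship such as prefecture-to-city.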
  • Qujiang Peng, Haodong Wu, Teiji Furugori
    2002 Volume 9 Issue 2 Pages 77-89
    Published: April 10, 2002
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    This paper describes a method for word sense disambiguation using a similarity metric. In this method, we first obtain context-similarity vectors for the senses of a polysemous word using a corpus, and we also define the context representation for the same polysemous word appearing in text. We then calculate a distributional matrix between each context-similarity vector and the context representation of the word to be disambiguated. Finally, comparing the values of the distributional matrices, we select the sense with the highest value as the meaning of the polysemous word (a hedged sketch of this selection step follows this entry). An experiment with 682 instances of 10 polysemous words shows that we are able to disambiguate at a rate of almost 92%.
    Download PDF (1124K)
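    The sense-selection step can be illustrated with a small hypothetical sketch. The code below replaces the paper's distributional matrices with a plain cosine comparison between sense vectors and a context vector, so it only approximates the described method; all vectors shown are toy data.

    import math

    # Hypothetical sketch: pick the sense whose context vector is most similar
    # to the context of the word occurrence. Cosine similarity stands in for
    # the paper's distributional-matrix comparison.

    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        nu = math.sqrt(sum(a * a for a in u))
        nv = math.sqrt(sum(b * b for b in v))
        return 0.0 if nu == 0 or nv == 0 else dot / (nu * nv)

    def disambiguate(sense_vectors, context_vector):
        """Return the sense label with the highest similarity to the context."""
        return max(sense_vectors, key=lambda s: cosine(sense_vectors[s], context_vector))

    # Toy context vectors over a tiny feature space.
    senses = {
        "bank/finance": [0.9, 0.1, 0.0],
        "bank/river":   [0.0, 0.2, 0.8],
    }
    context = [0.7, 0.2, 0.1]   # occurrence surrounded by finance-related words
    print(disambiguate(senses, context))  # bank/finance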
  • Use of a convenient tool for detecting differences, MDIFF
    MASAKI MURATA
    2002 Volume 9 Issue 2 Pages 91-110
    Published: April 10, 2002
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    Diff is a software program that detects differences between two data sets and is useful for natural language processing. This paper shows several example applications: using Diff to detect differences, extract rewriting rules, merge two different data sets, and match two different data sets optimally (a small usage sketch follows this entry). Since Diff is available on a normal UNIX system, it is very easy and convenient to use. Our studies showed that Diff is a practical tool for natural language processing research.
    Download PDF (1420K)
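    As a hedged illustration of the kind of use described above, the sketch below aligns two sentence-per-line texts with Python's standard difflib (a stand-in for UNIX diff and the MDIFF tool named in the title, neither of which is reproduced here) and prints replaced-line pairs as candidate rewriting rules. The example texts are invented.

    import difflib

    # Hypothetical sketch: align an original and a rewritten text line by line
    # and print replaced-line pairs as candidate rewriting rules.

    original  = ["the cat sat on the mat", "it was raining", "she left early"]
    rewritten = ["the cat sat on a mat", "it was raining", "she departed early"]

    matcher = difflib.SequenceMatcher(None, original, rewritten)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "replace":
            for before, after in zip(original[i1:i2], rewritten[j1:j2]):
                print(f"rewrite: {before!r} -> {after!r}")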