Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
Volume 2, Issue 2
  • [in Japanese]
    1995 Volume 2 Issue 2 Pages 1
    Published: April 10, 1995
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
  • OSAMU TAKIZAWA
    1995 Volume 2 Issue 2 Pages 3-22
    Published: April 10, 1995
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    The author aims to develop a pun recognition system as pilot research on machine understanding techniques for rhetorical expressions in natural language. This report describes several phonemic features of a type of written pun called the “separately located type”. A pun of this type essentially consists of two separately located words called “actual vehicles”, each phonemically distorted from a “restored vehicle”. For example, the pun “I am angary with the Hangaryan.” consists of the former actual vehicle “angary”, distorted from the restored vehicle “angry”, and the latter actual vehicle “Hangaryan”, distorted from the restored vehicle “Hungarian”. In this report, the following comparisons are performed for each of 203 puns: (1) the lengths of the phoneme sequences of the former actual vehicles are compared with those of the latter ones; (2) the phonemes of the former actual vehicles are compared with those of the latter ones; (3) the phonemes of the actual vehicles are compared with those of the restored vehicles. The results of these comparisons suggest that there are relatively few phonemic distortions and that those which occur are regular. These results are useful in developing a pun recognition system.
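The comparison of actual and restored vehicles can be illustrated with a minimal sketch. It is not the author's procedure: it assumes that spelling can stand in for a phoneme sequence, and it uses plain Levenshtein edit distance as a hypothetical measure of distortion. The function name `edit_distance` is introduced here for illustration only.

```python
def edit_distance(a, b):
    """Levenshtein distance between two symbol sequences (here: letters)."""
    prev = list(range(len(b) + 1))          # distances from "" to prefixes of b
    for i, x in enumerate(a, 1):
        cur = [i]                           # distance from a[:i] to ""
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,             # delete x
                           cur[j - 1] + 1,          # insert y
                           prev[j - 1] + (x != y))) # substitute or match
        prev = cur
    return prev[-1]


# The pun from the abstract, lowercased; letters are an assumed
# stand-in for phonemes.
pairs = [("angary", "angry"), ("hangaryan", "hungarian")]
distortions = [edit_distance(actual, restored) for actual, restored in pairs]
length_gap = abs(len(pairs[0][0]) - len(pairs[1][0]))  # comparison (1)
```

A real system would presumably operate on phoneme transcriptions rather than letters, so the numbers produced here are only indicative.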
  • Hideto Tomabechi
    1995 Volume 2 Issue 2 Pages 23-58
    Published: April 10, 1995
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    Graph unification has become a central processing mechanism of many natural language systems due to the popularity of unification-based theories of computational linguistics. Despite this popularity, it remains the most expensive part of unification-based natural language processing: graph unification alone often takes over 90% of total parsing time. We focus on two criteria in the design of an efficient unification algorithm: 1) elimination of excessive copying and 2) quick detection of unification failures. We propose a scheme, based on the notion of quasi-destructive unification, that meets these criteria without expensive overhead for reversing the changes made to the graph node structures. Our experiments, using an actual large-scale grammar and a simulated grammar producing different unification success rates, show that the quasi-destructive graph unification algorithm runs roughly twice as fast as Wroblewski's non-destructive unification algorithm.
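The abstract does not spell out the mechanism, so the following is only a simplified sketch of one way to get the two properties it names. It assumes a global generation counter: temporary marks written during a unification attempt are stamped with the current generation, and bumping the counter makes them stale, so nothing has to be undone after a failure and nothing is copied until success is known. All names (`Node`, `temp`, `unify_check`, `copy_node`, `unify`) are hypothetical, and the sketch assumes acyclic input graphs.

```python
GENERATION = 0  # bumping this invalidates every temporary mark at once


class Node:
    def __init__(self, atom=None, arcs=None):
        self.atom = atom               # atomic value, or None
        self.arcs = dict(arcs or {})   # permanent arcs; never modified by unify
        self.gen = -1                  # generation in which self.tmp is valid
        self.tmp = None                # temporary marks for one attempt


def temp(n):
    """Return n's temporary marks, discarding any from an older generation."""
    if n.gen != GENERATION:
        n.gen = GENERATION
        n.tmp = {"forward": None, "extra": {}, "copy": None}
    return n.tmp


def deref(n):
    """Follow temporary forward pointers to the representative node."""
    while temp(n)["forward"] is not None:
        n = temp(n)["forward"]
    return n


def all_arcs(n):
    return {**n.arcs, **temp(n)["extra"]}


def unify_check(a, b):
    """Phase 1: test compatibility, writing only temporary marks."""
    a, b = deref(a), deref(b)
    if a is b:
        return True
    if a.atom is not None and b.atom is not None:
        if a.atom != b.atom:
            return False               # fail fast, nothing to undo
        temp(a)["forward"] = b
        return True
    if a.atom is not None:
        a, b = b, a                    # make b the atomic node, if any
    if b.atom is not None:
        if all_arcs(a):
            return False               # atom cannot unify with a node that has arcs
        temp(a)["forward"] = b
        return True
    temp(a)["forward"] = b
    for feat, sub in all_arcs(a).items():
        target = deref(b)
        target_arcs = all_arcs(target)
        if feat in target_arcs:
            if not unify_check(sub, target_arcs[feat]):
                return False
        else:
            temp(target)["extra"][feat] = sub  # temporary arc only
    return True


def copy_node(n):
    """Phase 2: build the result graph, run only after phase 1 succeeded."""
    n = deref(n)
    t = temp(n)
    if t["copy"] is None:
        t["copy"] = Node(atom=n.atom)
        for feat, sub in all_arcs(n).items():
            t["copy"].arcs[feat] = copy_node(sub)
    return t["copy"]


def unify(a, b):
    """Return a new graph, or None on failure; inputs keep their permanent arcs."""
    global GENERATION
    try:
        return copy_node(a) if unify_check(a, b) else None
    finally:
        GENERATION += 1                # invalidate all temporary marks
```

Because permanent arcs are never touched, a failed attempt costs only the nodes visited before the clash, and the copying work is deferred to successful unifications.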
  • Hozumi Tanaka, Takenobu Tokunaga, Michio Aizawa
    1995 Volume 2 Issue 2 Pages 59-74
    Published: April 10, 1995
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    Morphological analysis of Japanese is very different from that of English, because no spaces are placed between words. This is also the case in many Asian languages such as Korean, Chinese and Thai. In the Indo-European family, some languages such as German exhibit the same phenomenon when forming complex noun phrases. Processing such languages first requires identifying the word boundaries, a process often called segmentation. Segmentation is a very important process, since incorrect segmentation causes fatal errors in later stages such as syntactic, semantic and contextual analysis. However, correct segmentation is not always possible with morphological information alone. Syntactic, semantic and contextual information is also necessary to resolve the ambiguities in segmentation. This paper proposes a method to integrate morphological and syntactic analysis based on the LR parsing algorithm. An LR table derived from grammar rules is modified on the basis of connectabilities between two adjacent words. The modified LR table reflects both the morphological and syntactic constraints. Using this LR table and the generalized LR parsing algorithm, efficient morphological and syntactic analysis becomes possible.
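The role of connectability in pruning segmentation ambiguity can be shown with a toy sketch. This is not the LR-table method the paper proposes: it simply enumerates dictionary segmentations of an unspaced string and discards those in which two adjacent word categories are not listed as connectable. The lexicon, the categories, and the `CONNECTABLE` pairs are all invented for illustration.

```python
# Hypothetical toy lexicon: surface form -> possible categories.
LEXICON = {
    "the": ("det",),
    "them": ("pron",),
    "men": ("noun",),
    "mend": ("verb",),
    "end": ("verb",),
}

# Hypothetical connectability relation between adjacent categories.
CONNECTABLE = {("det", "noun"), ("pron", "verb")}


def segmentations(text, lexicon, connectable):
    """Yield every list of (word, category) pairs that covers `text` and
    whose adjacent categories are all in `connectable`."""
    def extend(pos, prev_cat):
        if pos == len(text):
            yield []
            return
        for end in range(pos + 1, len(text) + 1):
            word = text[pos:end]
            for cat in lexicon.get(word, ()):
                if prev_cat is None or (prev_cat, cat) in connectable:
                    for rest in extend(end, cat):
                        yield [(word, cat)] + rest
    yield from extend(0, None)


results = list(segmentations("themend", LEXICON, CONNECTABLE))
```

With the constraint in place, the reading "the" + "mend" is rejected because the assumed relation does not allow a determiner directly before a verb. The paper's approach folds such constraints into the LR table itself rather than filtering candidates one by one.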
  • Satoshi Sekine
    1995 Volume 2 Issue 2 Pages 75-87
    Published: April 10, 1995
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    There have been a number of theoretical studies devoted to the notion of sublanguage. Furthermore, there are some successful natural language processing systems which have explicitly or implicitly utilized sublanguage restrictions. However, two major problems must still be solved before the sublanguage notion can be fully utilized: 1) automatic definition of sublanguages and dynamic identification of the sublanguage of a text, and 2) automatic acquisition of linguistic knowledge for a sublanguage. There are now new opportunities to address these problems owing to the appearance of large machine-readable corpora. Although there have been several experiments attempting to solve the second problem, the first has not received as much attention. In previous sublanguage NLP systems, the domain the system deals with was defined by a human. This is indeed one method of defining the sublanguage of a text, and, in a sense, it seems to work well. However, it is not always possible and may sometimes be wrong. In order to maximize the benefit of the sublanguage notion, we need automatic definition and dynamic identification of sublanguages. We report preliminary experiments on sublanguage definition and identification based on lexical appearance. The results of the experiments show that the proposed methods can be useful in processing a new text. In particular, the fact that the first two sentences can reliably identify a text's sublanguage encourages further investigation along this line of research. In conclusion, it appears that the inductive definition of sublanguage and sublanguage identification would be beneficial for natural language processing.
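Identification "based on lexical appearance" can be pictured with a small sketch. The abstract does not give the actual measure, so this example assumes word-frequency profiles compared by cosine similarity. The two sublanguage samples, their labels, and the function names are invented for illustration.

```python
import math
import re
from collections import Counter


def profile(text):
    """Word-frequency profile of a text (lowercased ASCII words only)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(p, q):
    """Cosine similarity between two frequency profiles."""
    dot = sum(p[w] * q[w] for w in p.keys() & q.keys())
    norm = (math.sqrt(sum(v * v for v in p.values()))
            * math.sqrt(sum(v * v for v in q.values())))
    return dot / norm if norm else 0.0


def identify(text, profiles):
    """Return the name of the sublanguage profile most similar to `text`."""
    query = profile(text)
    return max(profiles, key=lambda name: cosine(query, profiles[name]))


# Invented toy samples standing in for sublanguage corpora.
PROFILES = {
    "finance": profile("stock market shares fell as investors sold bonds"),
    "sports": profile("the team won the match after a late goal"),
}

# Only the opening sentences of a new text are used, echoing the
# abstract's observation about the first two sentences.
label = identify("Shares rose. Investors bought stock.", PROFILES)
```

In practice the profiles would be induced from a large corpus rather than from one sentence each, which is the automatic definition step the abstract calls for.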