Journal of Natural Language Processing

[title in Japanese]

[in Japanese]

1995 Volume 2 Issue 2 Pages 1
Published: April 10, 1995
Released on J-STAGE: March 01, 2011

DOI: https://doi.org/10.5715/jnlp.2.2_1

JOURNAL FREE ACCESS

Several Phonemic Features of Written Puns

OSAMU TAKIZAWA

1995 Volume 2 Issue 2 Pages 3-22
Published: April 10, 1995
Released on J-STAGE: March 01, 2011

DOI: https://doi.org/10.5715/jnlp.2.2_3

JOURNAL FREE ACCESS

The author aims to develop a pun recognition system as pilot research on machine understanding techniques for rhetorical expressions in natural language. This report describes several phonemic features of a type of written pun called the “separately located type”. A pun of this type essentially consists of two separately located words called “actual vehicles”, each of which is phonemically distorted from a “restored vehicle”. For example, the pun “I am angary with the Hangaryan.” consists of the former actual vehicle “angary”, which is distorted from the restored vehicle “angry”, and the latter actual vehicle “Hangaryan”, which is distorted from the restored vehicle “Hungarian”. In this report, the following comparisons are performed for each of 203 puns: (1) The lengths of the phoneme sequences of the former actual vehicles are compared with those of the latter ones. (2) The phonemes of the former actual vehicles are compared with those of the latter ones. (3) The phonemes of the actual vehicles are compared with those of the restored vehicles. The results of these comparisons suggest that there are relatively few phonemic distortions, but that they are regular. These results are useful in developing a pun recognition system.
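The three comparisons listed in the abstract can be illustrated with a minimal sketch. It assumes that spelling stands in for a phoneme sequence (the paper works on phonemes, not letters) and uses Levenshtein edit distance as the measure of distortion, which the abstract does not specify. The names `edit_distance` and `compare_pun` are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two symbol sequences."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def compare_pun(former_actual, former_restored, latter_actual, latter_restored):
    """Run the three comparisons on one pun (letters used as stand-in phonemes)."""
    return {
        # (1) lengths of the former vs. the latter actual vehicle
        "length_difference": len(latter_actual) - len(former_actual),
        # (2) symbols of the former vs. the latter actual vehicle
        "former_vs_latter": edit_distance(former_actual, latter_actual),
        # (3) each actual vehicle vs. its restored vehicle
        "former_distortion": edit_distance(former_actual, former_restored),
        "latter_distortion": edit_distance(latter_actual, latter_restored),
    }


# The example pun from the abstract, lower-cased.
report = compare_pun("angary", "angry", "hangaryan", "hungarian")
```

For the example pun, each actual vehicle differs from its restored vehicle by only one or two symbols, which is the kind of small distortion the abstract reports.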

Design of Efficient Unification for Natural Language

Hideto Tomabechi

1995 Volume 2 Issue 2 Pages 23-58
Published: April 10, 1995
Released on J-STAGE: March 01, 2011

DOI: https://doi.org/10.5715/jnlp.2.2_23

JOURNAL FREE ACCESS

Graph unification has become a central processing mechanism of many natural language systems due to the popularity of unification-based theories of computational linguistics. Despite this popularity, it remains the most expensive part of unification-based natural language processing: graph unification alone often takes over 90% of total parsing time. We focus on two criteria in the design of an efficient unification algorithm: 1) elimination of excessive copying and 2) quick detection of unification failures. Based on the notion of quasi-destructive unification, we propose a scheme that attains these criteria without expensive overhead for reversing the changes made to the graph node structures. Our experiments, using an actual large-scale grammar and a simulated grammar producing different unification success rates, show that the quasi-destructive graph unification algorithm runs roughly twice as fast as Wroblewski's non-destructive unification algorithm.
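The idea of making changes that need no explicit undoing can be sketched as follows. This is a simplified illustration written for this listing, not the algorithm from the paper: temporary forwarding pointers and extra arcs are stamped with a generation number, are honoured only while that number is current, and are all invalidated at once by incrementing the counter. A result graph is copied only after unification has succeeded. The `Node`, `Unifier`, and `to_dict` names are hypothetical, and cyclic structures are not handled.

```python
class Node:
    """Feature-structure node: an atom, or a set of labelled arcs."""
    def __init__(self, atom=None, arcs=None):
        self.atom = atom
        self.arcs = dict(arcs or {})   # permanent structure, never modified
        self.forward = None            # temporary: valid only if mark == current generation
        self.comp_arcs = {}            # temporary: arcs gained during this unification
        self.mark = -1
        self.copy = None               # temporary: copy made in the current generation
        self.copy_mark = -1


class Unifier:
    def __init__(self):
        self.gen = 0

    def _deref(self, n):
        while n.mark == self.gen and n.forward is not None:
            n = n.forward
        return n

    def _touch(self, n):
        # Discard stale temporary data before writing new temporary data.
        if n.mark != self.gen:
            n.mark, n.forward, n.comp_arcs = self.gen, None, {}

    def _arcs(self, n):
        arcs = dict(n.arcs)
        if n.mark == self.gen:
            arcs.update(n.comp_arcs)
        return arcs

    def _unify1(self, a, b):
        a, b = self._deref(a), self._deref(b)
        if a is b:
            return True
        if a.atom is not None or b.atom is not None:
            if a.atom is not None and b.atom is not None:
                if a.atom != b.atom:
                    return False        # early failure, nothing copied so far
            elif self._arcs(b if a.atom is not None else a):
                return False            # an atom cannot unify with a node that has arcs
            src, dst = (b, a) if a.atom is not None else (a, b)
            self._touch(src)
            src.forward = dst
            return True
        b_arcs = self._arcs(b)
        self._touch(b)
        b.forward = a
        for label, b_child in b_arcs.items():
            target = self._deref(a)
            target_arcs = self._arcs(target)
            if label in target_arcs:
                if not self._unify1(target_arcs[label], b_child):
                    return False
            else:
                self._touch(target)
                target.comp_arcs[label] = b_child
        return True

    def _copy(self, n):
        n = self._deref(n)
        if n.copy_mark == self.gen:
            return n.copy               # preserve shared substructure
        new = Node(atom=n.atom)
        n.copy, n.copy_mark = new, self.gen
        for label, child in self._arcs(n).items():
            new.arcs[label] = self._copy(child)
        return new

    def unify(self, a, b):
        """Return a new graph, or None on failure; the inputs stay intact."""
        result = self._copy(a) if self._unify1(a, b) else None
        self.gen += 1                   # invalidates every temporary change at once
        return result


def to_dict(n):
    """Render the permanent structure of a graph as nested dicts."""
    return n.atom if n.atom is not None else {k: to_dict(v) for k, v in n.arcs.items()}


u = Unifier()
a = Node(arcs={"agr": Node(arcs={"num": Node("sg")})})
b = Node(arcs={"agr": Node(arcs={"per": Node("3")}), "cat": Node("np")})
c = Node(arcs={"agr": Node(arcs={"num": Node("pl")})})
merged = u.unify(a, b)
failed = u.unify(a, c)
```

In this sketch a failed unification costs no copying at all, and neither outcome requires walking the graphs again to restore them.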

Integration of Morphological and Syntactic Analysis based on LR Parsing Algorithm

Hozumi Tanaka, Takenobu Tokunaga, Michio Aizawa

1995 Volume 2 Issue 2 Pages 59-74
Published: April 10, 1995
Released on J-STAGE: March 01, 2011

DOI: https://doi.org/10.5715/jnlp.2.2_59

JOURNAL FREE ACCESS

Morphological analysis of Japanese is very different from that of English, because no spaces are placed between words. This is also the case in many Asian languages such as Korean, Chinese, and Thai. In the Indo-European family, some languages such as German show the same phenomenon when forming complex noun phrases. Processing such languages requires identifying the word boundaries first. This process is often called segmentation. Segmentation is a very important process, since a wrong segmentation causes fatal errors in later stages such as syntactic, semantic, and contextual analysis. However, correct segmentation is not always possible with morphological information alone. Syntactic, semantic, and contextual information is also necessary to resolve the ambiguities in segmentation. This paper proposes a method to integrate morphological and syntactic analysis based on the LR parsing algorithm. An LR table derived from grammar rules is modified on the basis of the connectabilities between two adjacent words. The modified LR table reflects both the morphological and the syntactic constraints. Using this LR table and the generalized LR parsing algorithm, efficient morphological and syntactic analysis becomes possible.
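The role of connectability in resolving segmentation ambiguity can be shown with a small sketch. It is not the LR-table method the paper proposes: it enumerates dictionary segmentations by plain backtracking and discards any candidate in which two adjacent words have a category pair that is not listed as connectable. The lexicon, the categories, the `segment` function, and the unspaced English input are all hypothetical toy data.

```python
def segment(text, lexicon, connectable):
    """Return every segmentation of text whose adjacent categories may connect.

    lexicon maps a word to a tuple of categories; connectable is a set of
    (left_category, right_category) pairs that may be adjacent.
    """
    results = []

    def extend(pos, path):
        if pos == len(text):
            results.append(list(path))
            return
        for end in range(pos + 1, len(text) + 1):
            word = text[pos:end]
            for category in lexicon.get(word, ()):
                # Morphological constraint: check the pair formed with the previous word.
                if path and (path[-1][1], category) not in connectable:
                    continue
                path.append((word, category))
                extend(end, path)
                path.pop()

    extend(0, [])
    return results


# Toy data: unspaced English stands in for a language written without spaces.
LEXICON = {
    "the": ("DET",),
    "cat": ("N",),
    "cats": ("N",),
    "sat": ("V",),
    "at": ("P",),
}
STRICT = {("DET", "N"), ("N", "V")}
LOOSE = STRICT | {("N", "P")}

strict_result = segment("thecatsat", LEXICON, STRICT)
loose_result = segment("thecatsat", LEXICON, LOOSE)
```

With the looser connectability set, both "the cat sat" and "the cats at" survive, so a later syntactic stage would still have to choose between them. The abstract's point is that folding both kinds of constraint into one LR table avoids running them as separate passes.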

A New Direction for Sublanguage N. L. P.

Satoshi Sekine

1995 Volume 2 Issue 2 Pages 75-87
Published: April 10, 1995
Released on J-STAGE: March 01, 2011

DOI: https://doi.org/10.5715/jnlp.2.2_75

JOURNAL FREE ACCESS

There have been a number of theoretical studies devoted to the notion of sublanguage. Furthermore, there are some successful natural language processing systems which have explicitly or implicitly utilized sublanguage restrictions. However, two big problems must still be solved before the sublanguage notion can be fully utilized: 1) automatic definition of sublanguages and dynamic identification of the sublanguage of a text, and 2) automatic linguistic knowledge acquisition for a sublanguage. There are now new opportunities to address these problems owing to the appearance of large machine-readable corpora. Although there have been several experiments that try to solve the second problem listed above, the first problem has not received as much attention. In previous sublanguage N. L. P. systems, the domain the system deals with was defined by a human. This is actually one method of defining the sublanguage of a text, and, in a sense, it seems to work well. However, it is not always possible, and sometimes it may be wrong. In order to maximize the benefit of the sublanguage notion, we need automatic definition and dynamic identification of sublanguages. We report preliminary experiments on sublanguage definition and identification based on lexical appearance. The results of the experiments show that the proposed methods can be useful in processing a new text. In particular, the fact that the first two sentences can reliably identify a text's sublanguage encourages further investigation along this line of research. In conclusion, it appears that the inductive definition of sublanguage and sublanguage identification would be beneficial for natural language processing.
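Identification based on lexical appearance can be sketched roughly as below. The abstract does not describe the actual method, so this sketch assumes one simple possibility: each sublanguage is represented by word counts from sample texts, and a new text is assigned to the profile with the highest cosine similarity. The sample sentences, the labels, and the `profile`, `cosine`, and `identify` functions are hypothetical.

```python
import math
from collections import Counter


def profile(texts):
    """Word-count profile of a collection of texts (whitespace tokenization)."""
    counts = Counter()
    for text in texts:
        counts.update(text.lower().split())
    return counts


def cosine(a, b):
    """Cosine similarity between two word-count profiles."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def identify(text, profiles):
    """Return the name of the sublanguage profile most similar to text."""
    query = profile([text])
    return max(profiles, key=lambda name: cosine(query, profiles[name]))


# Toy sample texts, invented for illustration only.
PROFILES = {
    "weather": profile(["rain and wind expected tonight",
                        "sunny with light wind"]),
    "finance": profile(["shares fell as the bank raised rates",
                        "the market closed higher"]),
}

first = identify("heavy rain and strong wind", PROFILES)
second = identify("the bank shares closed lower", PROFILES)
```

Feeding only the opening sentences of a document to `identify` mirrors the abstract's observation that the first two sentences can be enough to identify a text's sublanguage.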
