Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
Volume 8, Issue 4
Displaying 1-6 of 6 articles from this issue
  • [in Japanese]
    2001 Volume 8 Issue 4 Pages 1-2
    Published: October 10, 2001
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    Download PDF (224K)
  • The Degree and the Process of the Uptake through Chiming-in “Hai”
    KOUICHI DOI, AKIRA OHMORI
    2001 Volume 8 Issue 4 Pages 3-17
    Published: October 10, 2001
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    Austin pointed out that the linguistic phenomenon of the “uptake” is important for the analysis of speech acts in conversation. The “uptake”, however, has not been analyzed sufficiently. This paper analyzes the “uptake” in terms of pragmatics. We have extended the speech act theory of Austin and Searle for this purpose and propose a framework for the extended speech act theory. It has the following features:
    · It newly incorporates two conceptual elements (a hidden propositional act and an intention) into the existing speech act theory.
    · The perlocutionary act and the perlocutionary effect in the existing speech act theory are divided into two kinds of act and into four kinds of effect, respectively.
    As a result, the framework of the extended speech act theory has 13 conceptual elements in total. Based on the proposed framework, the diverse meanings of the Japanese “hai”, a typical expression of the “uptake”, are investigated in terms of pragmatics with respect to the degree and the process of the “uptake”. We found eight levels in the degree of the “uptake” and seven stages in its process.
    Download PDF (3650K)
  • MASAO UTIYAMA, HITOSHI ISAHARA
    2001 Volume 8 Issue 4 Pages 19-36
    Published: October 10, 2001
    Released on J-STAGE: June 07, 2011
    JOURNAL FREE ACCESS
    A text is usually composed of multiple topics. Segmenting such a text into topically coherent units is useful for both information retrieval and automatic text summarization. This paper proposes a statistical method that selects the most probable segmentation among all possible segmentations as the best segmentation of a given text. Since the method estimates segmentation probabilities from the given text itself, it needs no training data and can therefore be applied to any text in any domain. The effectiveness of the method was confirmed through two experiments. The first experiment evaluated the accuracy of the method on publicly available data; the results showed that its accuracy is at least as good as that of a state-of-the-art text segmentation system. The second experiment compared the segmentations produced by our method with the original segment boundaries in relatively long documents. When our system's segmentations were compared with chapter boundaries, the accuracy was 0.37 when only exact matches were counted as correct, and 0.49 when ±1-line differences were also accepted. When compared with section boundaries, the accuracies were 0.34 and 0.51, respectively. These results show that our method is effective for domain-independent text segmentation.
    Download PDF (1792K)
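A minimal sketch of the dynamic-programming idea behind the segmentation method described in the abstract above: choose the split points that minimize a cost derived from the probability of each candidate segmentation. The cost function below (a Laplace-smoothed in-segment unigram likelihood plus a per-segment penalty) and the toy data are illustrative assumptions, not the paper's exact model.

# Sketch: maximum-probability (minimum-cost) text segmentation by dynamic programming.
# The cost model is an illustrative assumption, not the paper's exact formulation.
import math
from collections import Counter

def segment_cost(words, vocab_size, penalty):
    """Negative log-likelihood of a candidate segment under its own smoothed unigram model."""
    counts = Counter(words)
    n = len(words)
    nll = -sum(c * math.log((c + 1) / (n + vocab_size)) for c in counts.values())
    return nll + penalty          # penalty discourages overly fine segmentations

def segment(sentences, penalty=2.0):
    """Return the sentence indices at which segments start (always includes 0)."""
    vocab = {w for s in sentences for w in s}
    v, m = len(vocab), len(sentences)
    best = [math.inf] * (m + 1)   # best[j]: minimal cost of segmenting sentences[:j]
    best[0] = 0.0
    back = [0] * (m + 1)
    for j in range(1, m + 1):
        for i in range(j):        # candidate last segment sentences[i:j]
            words = [w for s in sentences[i:j] for w in s]
            cost = best[i] + segment_cost(words, v, penalty)
            if cost < best[j]:
                best[j], back[j] = cost, i
    # Recover the boundaries by following the back pointers.
    bounds, j = [], m
    while j > 0:
        bounds.append(back[j])
        j = back[j]
    return sorted(bounds)

# Toy example with two short "topics"; a boundary is expected before sentence 2.
docs = [["cats", "purr"], ["cats", "sleep"], ["stocks", "fell"], ["markets", "fell"]]
print(segment(docs))              # -> [0, 2]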
  • JUN OKAMOTO, SHUN ISHIZAKI
    2001 Volume 8 Issue 4 Pages 37-54
    Published: October 10, 2001
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    In addition to syntactic and semantic information, background knowledge about the input text is necessary when a computer tries to understand it. This paper presents a method for constructing an associative concept dictionary using large-scale association experiments. The dictionary includes semantic and contextual information about the stimulus words. In the association experiments, 100 stimulus words taken from a Japanese-language textbook used in elementary schools are presented to subjects, who are asked to produce associations for each word under seven tasks, such as higher-level concepts, lower-level concepts, actions, and situations. Conventional concept dictionaries use tree structures to express hierarchical relations, and distances between concepts are calculated from the number of links between them. This paper shows how to formulate the distance between concepts using a linear programming method; its parameters, especially the frequency of an associated word and the order in which it was associated, are found to be significant for the distance calculation. A comparison of the associative concept dictionary with the EDR concept dictionary and WordNet using this distance information shows that the dictionary is more similar to WordNet than to EDR.
    Download PDF (2680K)
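The abstract contrasts the proposed associative distance with the conventional measure used by tree-structured concept dictionaries, namely the number of links between two concepts. Below is a minimal sketch of that conventional link-count distance, assuming a toy hypernym tree; the vocabulary and edges are illustrative, not taken from any of the dictionaries compared in the paper.

# Sketch: conventional concept distance as the number of links on the shortest
# path between two concepts in a tree-structured (hypernym) dictionary.
from collections import deque

# child -> parent links of a hypothetical hypernym tree (illustrative only)
parent = {"dog": "mammal", "cat": "mammal", "mammal": "animal",
          "sparrow": "bird", "bird": "animal"}

def link_distance(a, b):
    """Shortest-path length in links between concepts a and b (BFS on the undirected tree)."""
    adj = {}
    for child, par in parent.items():
        adj.setdefault(child, set()).add(par)
        adj.setdefault(par, set()).add(child)
    seen, queue = {a}, deque([(a, 0)])
    while queue:
        node, d = queue.popleft()
        if node == b:
            return d
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, d + 1))
    return None  # concepts not connected

print(link_distance("dog", "cat"))      # 2 links, via "mammal"
print(link_distance("dog", "sparrow"))  # 4 links, via "animal"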
  • TAKEHIKO YOSHIMI
    2001 Volume 8 Issue 4 Pages 55-70
    Published: October 10, 2001
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    One of the major factors causing unnatural translation in English-to-Japanese MT systems is the literal translation of verb-derived nominal constructions. This paper presents a method for automatically rewriting such a nominalization (the packed form) into a less packed one, leading to the generation of more natural and appropriate Japanese. We carried out an experiment centered mainly on nominal constructions in which the head deverbal noun is pre-modified by a genitive noun and post-modified by an “of” prepositional phrase. Combining the proposed method with our system Power E/J, we found that it improved the translation quality of 33 of the 49 tested sentences (67.3%) containing the rewritten constructions. In previously proposed methods, this unnaturalness has typically been addressed at the transfer stage inside each system. The advantage of our method over such approaches is that it is applicable not only to our MT system but also to other systems. An experiment with commercial MT systems other than ours shows that incorporating the pre-editing module satisfactorily improved translation quality in two of the three systems.
    Download PDF (1783K)
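A minimal, purely illustrative sketch of the pre-editing idea described above: detect a nominal construction of the form "<genitive noun phrase>'s <deverbal noun> of <noun phrase>" and rewrite it into a less packed clausal form before handing the sentence to an MT system. The tiny deverbal-noun lexicon, the regular expression, and the rewriting template below are assumptions made for illustration, not the paper's actual rules or the Power E/J implementation.

# Sketch: rule-based pre-editing of packed nominal constructions.
import re

# deverbal noun -> finite verb form; illustrative entries only
DEVERBAL = {"destruction": "destroyed", "approval": "approved", "rejection": "rejected"}

# "<the X>'s <noun> of <the Y>" with deliberately simple noun phrases
PATTERN = re.compile(r"(the \w+)'s (\w+) of (the \w+)", re.IGNORECASE)

def unpack(sentence):
    """Rewrite "X's N of Y" as "the fact that X V-ed Y" when N is a known deverbal noun."""
    def repl(m):
        agent, noun, obj = m.groups()
        verb = DEVERBAL.get(noun.lower())
        if verb is None:
            return m.group(0)      # leave unknown nominalizations untouched
        return f"the fact that {agent} {verb} {obj}"
    return PATTERN.sub(repl, sentence)

print(unpack("We were surprised at the government's approval of the plan."))
# -> "We were surprised at the fact that the government approved the plan."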
  • YUKIYOSHI HIROSE, KAZUHIKO OZEKI, KAZUYUKI TAKAGI
    2001 Volume 8 Issue 4 Pages 71-89
    Published: October 10, 2001
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    Prosody contains information that is lost when utterances are transcribed into letters or characters, and such information may be useful for the syntactic analysis of spoken sentences. In our previous work, we selected 12 prosodic features and built a statistical model to represent the relationship between those features and dependency distances; using a dependency analyzer that incorporates the model, we showed that prosodic information is in fact effective for dependency analysis of read Japanese sentences. In the present work, we employed 24 features, including new ones, and conducted an extensive search for effective ones. The statistical model was also modified to better fit the actual distributions of the feature values. As a result, in open experiments on the ATR 503-sentence database, the correct parsing rate was improved by 21.2% with the use of the prosodic features, which is 4.0 points higher than the improvement in our group's previous experiment. Among the features, pause duration was clearly effective in both the open and the closed experiments, whereas the effectiveness of the other features related to pitch, power, and speaking rate, when used together with pause duration, was not clear in the open experiments.
    Download PDF (1887K)
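A minimal sketch of the kind of statistical model the abstract above describes: relating a prosodic feature to dependency distance. Here a single feature (pause duration after a phrase) is modeled with one Gaussian per dependency distance; the feature choice, the Gaussian form, and the toy data are illustrative assumptions, whereas the paper models 24 features and embeds the model in a dependency analyzer.

# Sketch: per-dependency-distance Gaussian over one prosodic feature (pause duration).
import math
from collections import defaultdict

def fit(samples):
    """samples: list of (pause_duration_sec, dependency_distance). Returns {distance: (mean, var)}."""
    grouped = defaultdict(list)
    for pause, dist in samples:
        grouped[dist].append(pause)
    model = {}
    for dist, xs in grouped.items():
        mean = sum(xs) / len(xs)
        var = sum((x - mean) ** 2 for x in xs) / len(xs) + 1e-6   # avoid zero variance
        model[dist] = (mean, var)
    return model

def log_likelihood(model, pause, dist):
    """Gaussian log-likelihood of observing this pause for the given dependency distance."""
    mean, var = model[dist]
    return -0.5 * math.log(2 * math.pi * var) - (pause - mean) ** 2 / (2 * var)

def most_likely_distance(model, pause):
    return max(model, key=lambda d: log_likelihood(model, pause, d))

# Toy training data: longer pauses tend to precede longer dependency distances.
train = [(0.05, 1), (0.08, 1), (0.10, 1), (0.30, 2), (0.35, 2), (0.70, 3), (0.80, 3)]
m = fit(train)
print(most_likely_distance(m, 0.06))   # -> 1
print(most_likely_distance(m, 0.75))   # -> 3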