Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
Volume 9, Issue 1
  • [in Japanese]
    2002 Volume 9 Issue 1 Pages 1-2
    Published: January 10, 2002
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
  • DAISUKE KAWAHARA, SADAO KUROHASHI
    2002 Volume 9 Issue 1 Pages 3-19
    Published: January 10, 2002
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    This paper describes a method for automatically constructing a case frame dictionary from a raw corpus. The main problem is how to handle the diversity of verb usages. To deal with verb usages, we collect predicate-argument examples from the parsed results of a corpus and distinguish them by the verb and its closest case component. Furthermore, we cluster and merge predicate-argument examples that share the same usage but fall into different case frames because their closest case components differ. We also report the results of an experiment on case structure analysis using the constructed case frame dictionary.
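The first step described in this abstract, keying predicate-argument examples on the verb and its closest case component, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function name `group_examples`, the tuple layout, the convention that the last argument is the one closest to the verb, and the toy data are all hypothetical. The clustering and merging step is not shown.

```python
from collections import defaultdict

def group_examples(examples):
    """Group predicate-argument examples by (verb, closest case component).

    Each example is (verb, [(case_marker, noun), ...]).
    Assumption: the LAST (case_marker, noun) pair is the closest to the verb.
    """
    frames = defaultdict(lambda: defaultdict(set))
    for verb, args in examples:
        if not args:
            continue  # nothing to key on
        key = (verb, args[-1])
        # Collect every observed noun under its case marker for this key.
        for case, noun in args:
            frames[key][case].add(noun)
    return frames

# Hypothetical toy examples (romanized verb, English glosses for nouns).
examples = [
    ("tsumu", [("ga", "driver"), ("ni", "truck"), ("wo", "luggage")]),
    ("tsumu", [("ga", "worker"), ("wo", "luggage")]),
    ("tsumu", [("ga", "player"), ("wo", "experience")]),
]
frames = group_examples(examples)
```

Here the two "luggage" examples fall under one key and the "experience" example under another, so the two usages of the same verb are kept apart before any merging is attempted.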
  • HISAHIRO ADACHI
    2002 Volume 9 Issue 1 Pages 21-41
    Published: January 10, 2002
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    Sign language is an important means of communicating with hearing-impaired people, and the number of learners has been increasing in recent years. Several studies have addressed learning aid systems and electronic dictionaries for sign language. In particular, when users want to look up the Japanese word labels corresponding to manual motion properties, most previous retrieval methods require them to set many detailed retrieval conditions. This poses a serious problem: beginners who set the wrong conditions have difficulty finding the most appropriate sign. To overcome this problem, this paper proposes a method that takes a different approach from previous methods. The method is based on the similarity between the manual motion descriptions (MMDs) that appear in ordinary sign dictionaries. It computes the similarity between an MMD entered as a query and the MMDs in the database, and outputs the retrieval results in order of similarity. The results ranked by similarity can be regarded as a set of signs that are similar to each other. Notably, sign retrieval can thus be treated as a document retrieval task. The results of evaluation experiments show the applicability and usability of the proposed method. We also discuss the problem of ambiguous MMDs, with examples.
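The idea of ranking dictionary entries by the similarity of their MMDs to a query MMD can be illustrated with a small sketch. The abstract does not state which similarity measure is used, so Jaccard similarity over word sets is swapped in here purely as a stand-in. The function names, sign labels, and English description strings are all invented for illustration and are not real dictionary entries.

```python
def jaccard(a, b):
    """Jaccard similarity between the word sets of two description strings."""
    sa, sb = set(a.split()), set(b.split())
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)

def retrieve(query, database):
    """Return (score, label) pairs sorted by descending similarity to the query."""
    scored = [(jaccard(query, mmd), label) for label, mmd in database.items()]
    scored.sort(key=lambda t: (-t[0], t[1]))  # ties broken by label
    return scored

# Hypothetical MMD database: label -> description text.
database = {
    "sign_a": "right hand strikes back of left hand then rises",
    "sign_b": "index fingers of both hands face each other and bend",
    "sign_c": "right hand pinches brow then lowers forward",
}
results = retrieve("right hand rises from back of left hand", database)
ranking = [label for _, label in results]
```

Because the query is itself a free-text description, the user does not have to fill in separate retrieval conditions; every entry receives a score and the list is simply read from the top.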
  • KOHJI DOHSAKA, NORIHOTO YASUDA, KIYOAKI AIKAWA
    2002 Volume 9 Issue 1 Pages 43-63
    Published: January 10, 2002
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    We present a dialogue control method called the “dual-cost method”, by which a spoken dialogue system conveys information relevant to a user request through a concise dialogue, within the confines of the knowledge stored in its database. Because of speech recognition errors, a system has to carry out a “confirmation dialogue” to clarify the user request. A confirmation dialogue should be concise, since a lengthy one destroys the flow of the overall dialogue. There are cases where the user request is beyond the system's knowledge, since a user does not know what knowledge the system has. In such cases, conventional methods invoke unnecessary confirmations because they attempt to confirm the entire content of the request. To resolve this problem, we introduce the notions of confirmation cost and information transfer cost. The confirmation cost is the length of a confirmation dialogue and depends on the speech recognition rate. The information transfer cost is the length of a system response and depends on the system's knowledge. The dual-cost method controls a dialogue by minimizing these two costs and can avoid unnecessary exchanges, which are inevitable in conventional methods.
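A toy version of choosing a dialogue plan by minimizing the two costs might look like the sketch below. It is not the dual-cost method itself: the abstract gives no formulas, so the cost model (one confirmation item takes on average 1 / recognition_rate exchanges), the equal-weight sum of the two costs, and the names `choose_plan`, `plans`, and the plan contents are all assumptions made for illustration.

```python
def expected_confirmation_cost(items, recognition_rate):
    """Assumed model: each item needs on average 1 / recognition_rate exchanges."""
    return len(items) / recognition_rate

def choose_plan(plans, recognition_rate):
    """Pick the plan with the smallest confirmation + information transfer cost.

    plans: list of (name, items_to_confirm, response_length).
    """
    def total_cost(plan):
        _, items, response_length = plan
        return expected_confirmation_cost(items, recognition_rate) + response_length
    return min(plans, key=total_cost)

# Hypothetical plans: confirm the whole request, or only the part the
# database can answer, at the price of a slightly longer system response.
plans = [
    ("confirm_all", ["date", "place", "time"], 1),
    ("confirm_place_only", ["place"], 2),
]
best = choose_plan(plans, recognition_rate=0.8)
```

With a recognition rate of 0.8, confirming everything costs 3 / 0.8 + 1 = 4.75, while confirming only the place costs 1 / 0.8 + 2 = 3.25, so the shorter confirmation is selected even though its response is longer.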
  • TAKEHITO UTSURO, MANABU SASSANO, KIYOTAKA UCHIMOTO
    2002 Volume 9 Issue 1 Pages 65-100
    Published: January 10, 2002
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    In this paper, we propose a method for learning a classifier that combines the outputs of multiple Japanese named entity extractors. The proposed combination method belongs to the family of stacked generalizers, which in principle combine the outputs of several first-stage classifiers by learning a second-stage classifier over those outputs. The individual models to be combined are based on maximum entropy models: one always considers surrounding contexts of a fixed length, while the other considers contexts of variable length according to the number of constituent morphemes of the named entity. To learn the second-stage classifier, we employ a decision list learning method. Experimental evaluation shows that the proposed method improves on the best known results for Japanese named entity extractors based on maximum entropy models.
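The stacking idea, a second-stage classifier learned over the outputs of first-stage extractors, can be sketched with a deliberately simplified decision list. This is an assumption-laden illustration rather than the paper's learner: the first-stage tags are hard-coded instead of coming from maximum entropy models, each rule tests a single extractor's tag, rules are ordered by plain relative frequency without smoothing, and all names and data are hypothetical.

```python
from collections import Counter, defaultdict

def learn_decision_list(rows):
    """Learn an ordered rule list from (first_stage_outputs, gold_label) rows.

    A rule is ((extractor_index, tag), label); rules are sorted by
    confidence, then by number of supporting examples.
    """
    counts = defaultdict(Counter)
    for outputs, gold in rows:
        for i, tag in enumerate(outputs):
            counts[(i, tag)][gold] += 1
    rules = []
    for feature, c in counts.items():
        label, hits = c.most_common(1)[0]
        rules.append((hits / sum(c.values()), hits, feature, label))
    rules.sort(key=lambda r: (-r[0], -r[1], r[2]))
    default = Counter(gold for _, gold in rows).most_common(1)[0][0]
    return [(feature, label) for _, _, feature, label in rules], default

def classify(outputs, rules, default):
    """Apply the first matching rule; fall back to the default label."""
    for (i, tag), label in rules:
        if outputs[i] == tag:
            return label
    return default

# Hypothetical training data: (tag from extractor 0, tag from extractor 1), gold.
rows = [
    (("PER", "PER"), "PER"), (("PER", "O"), "O"), (("O", "ORG"), "ORG"),
    (("O", "O"), "O"), (("PER", "PER"), "PER"), (("ORG", "ORG"), "ORG"),
]
rules, default = learn_decision_list(rows)
prediction = classify(("PER", "O"), rules, default)
```

In this toy data the second extractor is always right, so its rules rise to the top of the list and override the first extractor when the two disagree.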
  • KENGO SATO, HIROAKI SAITO
    2002 Volume 9 Issue 1 Pages 101-115
    Published: January 10, 2002
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    Translation dictionaries used in multilingual natural language processing, such as machine translation, have been built manually, but this work requires a great deal of labor and it is difficult to keep the dictionary descriptions consistent. Research on automatically extracting bilingual word pairs from parallel corpora has therefore become active in recent years. In this paper, we propose a method for learning and extracting bilingual word pairs from aligned parallel corpora using maximum entropy modeling. We define a probabilistic model of bilingual word pairs and four types of feature functions that express statistical and linguistic properties such as co-occurrence information and morphological information. Co-occurrence information restricts the sense of words, and morphological information restricts their part of speech. Experimental results on Japanese and English parallel corpora show that our method performs better than previous methods and, owing to the properties of maximum entropy modeling, can extract bilingual word pairs that do not appear in the training corpus with almost the same accuracy as pairs that do.
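A maximum entropy model scores each candidate with an exponentiated weighted sum of feature functions and normalizes over the candidates. The sketch below shows only that general form. The abstract does not give the model's exact definition, so treating it as a distribution over English candidates for one Japanese word, using just two features (a co-occurrence score and a part-of-speech match flag) instead of the paper's four types, and the weights, words, and values are all assumptions.

```python
import math

def maxent_prob(candidates, weights):
    """Return p(candidate) proportional to exp(sum_i weights[i] * features[i]).

    candidates: dict mapping candidate -> list of feature values.
    """
    scores = {
        cand: math.exp(sum(w * f for w, f in zip(weights, feats)))
        for cand, feats in candidates.items()
    }
    z = sum(scores.values())  # normalization constant
    return {cand: s / z for cand, s in scores.items()}

# Hypothetical English candidates for one Japanese word, with
# [co-occurrence score, part-of-speech match] as feature values.
candidates = {
    "dog": [0.9, 1.0],
    "run": [0.2, 0.0],
}
weights = [2.0, 1.0]  # would normally be estimated from training data
probs = maxent_prob(candidates, weights)
```

Because the score depends on feature values rather than on having seen the specific pair before, a pair absent from the training data can still receive a high probability when its features are favorable.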
  • SATORU IKEHARA, JIN'ICHI MURAKAMI, NOBORU KURUMAI
    2002 Volume 9 Issue 1 Pages 117-134
    Published: January 10, 2002
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    Translation of highly abstract nouns has been one of the most difficult problems in Japanese-to-English machine translation. In order to develop translation rules, the meanings and usage of the typical Japanese abstract nouns “no”, “koto”, “mono”, “tokoro”, “toki”, and “wake” were studied, and their syntactic and semantic usage was classified. First, noting the semantic equivalence of “no” and the other abstract nouns, exchange rules for the noun “no” were studied. Next, the semantic and syntactic usage of these abstract nouns was analyzed, and translations for them were summarized.
    The results were applied to 741 expressions extracted from newspapers, and the translation quality was evaluated. The quality of the exchange rules was 96.9%, and the classification accuracy for the 5 abstract nouns was 73% on average. These results are useful for improving the translation quality of abstract nouns.