Journal of Natural Language Processing

[title in Japanese]

[in Japanese]

2002 Volume 9 Issue 1 Pages 1-2
Published: January 10, 2002
Released on J-STAGE: March 01, 2011

DOI https://doi.org/10.5715/jnlp.9.1

JOURNAL FREE ACCESS

Case Frame Construction by Coupling the Predicate and its Closest Case Component

DAISUKE KAWAHARA, SADAO KUROHASHI

2002 Volume 9 Issue 1 Pages 3-19
Published: January 10, 2002
Released on J-STAGE: March 01, 2011

DOI https://doi.org/10.5715/jnlp.9.3

JOURNAL FREE ACCESS

This paper describes a method to construct a case frame dictionary automatically from a raw corpus. The main problem is how to handle the diversity of verb usages. To deal with this, we collect predicate-argument examples from the parsed results of a corpus and distinguish them by the verb and its closest case component. Furthermore, we cluster and merge predicate-argument examples that share the same usage but fall into different case frames because their closest case components differ. We also report an experimental result of case structure analysis using the constructed case frame dictionary.

A Retrieval Method of Signs Based on Similarity between Manual Motion Descriptions

HISAHIRO ADACHI

2002 Volume 9 Issue 1 Pages 21-41
Published: January 10, 2002
Released on J-STAGE: March 01, 2011

DOI https://doi.org/10.5715/jnlp.9.21

JOURNAL FREE ACCESS

Sign language is an important means of communicating with hearing-impaired people, and the number of learners has been increasing in recent years. There have been several studies on learning aid systems and electronic dictionaries for sign language. In particular, when users want to look up the Japanese word label corresponding to given manual motion properties, most previous retrieval methods require them to set many varied retrieval conditions in detail. This poses a serious problem: beginners who set the wrong conditions find it hard to look up the most appropriate sign. To overcome this problem, this paper proposes a method that takes a different approach from previous methods. The method is based on the similarity between the manual motion descriptions (MMDs) that appear in ordinary sign dictionaries. By computing the similarity between an MMD entered as a query and the MMDs in the database, it outputs retrieval results in order of similarity. The retrieval results formed in this way can be regarded as a set of signs that are similar to each other. Interestingly, sign retrieval can thus be treated as document retrieval. The results of evaluation experiments show the applicability and usability of the proposed method. We also discuss the problem of ambiguous MMDs, with examples.

Efficient Spoken Dialogue Control under System's Limited Knowledge

KOHJI DOHSAKA, NORIHOTO YASUDA, KIYOAKI AIKAWA

2002 Volume 9 Issue 1 Pages 43-63
Published: January 10, 2002
Released on J-STAGE: March 01, 2011

DOI https://doi.org/10.5715/jnlp.9.43

JOURNAL FREE ACCESS

We present a dialogue control method called the “dual-cost method”, by which a spoken dialogue system conveys information relevant to a user request through a concise dialogue, within the confines of the knowledge stored in its database. Because of speech recognition errors, a system has to carry out a “confirmation dialogue” to clarify the user request. A confirmation dialogue should be concise, since a lengthy one destroys the flow of the overall dialogue. There are cases where the user request is beyond the system's knowledge, since a user does not know what knowledge the system has. In such cases, conventional methods invoke unnecessary confirmations because they attempt to confirm the entire content of the request. To resolve this problem, we introduce the notions of confirmation cost and information transfer cost. The confirmation cost is the length of a confirmation dialogue and depends on the speech recognition rate. The information transfer cost is the length of a system response and depends on the system's knowledge. The dual-cost method controls a dialogue by minimizing these two costs and can avoid the unnecessary exchanges that are inevitable in conventional methods.

Learning to Combine Outputs of Multiple Japanese Named Entity Extractors

TAKEHITO UTSURO, MANABU SASSANO, KIYOTAKA UCHIMOTO

2002 Volume 9 Issue 1 Pages 65-100
Published: January 10, 2002
Released on J-STAGE: March 01, 2011

DOI https://doi.org/10.5715/jnlp.9.65

JOURNAL FREE ACCESS

In this paper, we propose a method for learning a classifier that combines the outputs of multiple Japanese named entity extractors. The proposed combination method belongs to the family of stacked generalizers, which in principle combine the outputs of several first-stage classifiers by learning a second-stage classifier over those outputs. The individual models to be combined are based on maximum entropy models, one of which always considers surrounding contexts of a fixed length, while the other considers contexts of variable length according to the number of constituent morphemes of named entities. As an algorithm for learning the second-stage classifier, we employ a decision list learning method. Experimental evaluation shows that the proposed method improves on the best known results for Japanese named entity extractors based on maximum entropy models.

Extracting Bilingual Word Pairs with Maximum Entropy Modeling

KENGO SATO, HIROAKI SAITO

2002 Volume 9 Issue 1 Pages 101-115
Published: January 10, 2002
Released on J-STAGE: March 01, 2011

DOI https://doi.org/10.5715/jnlp.9.101

JOURNAL FREE ACCESS

Translation dictionaries used in multilingual natural language processing, such as machine translation, have been built manually, but this work requires a great deal of labor and it is difficult to keep the dictionary descriptions consistent. Research on automatically extracting bilingual word pairs from parallel corpora has therefore become active recently. In this paper, we propose a method for learning and extracting bilingual word pairs from aligned parallel corpora with maximum entropy modeling. We define a probabilistic model of bilingual word pairs and four types of feature functions that express statistical and linguistic properties such as co-occurrence information and morphological information. Co-occurrence information restricts the sense of words. Morphological information restricts the part of speech of words. Experimental results on Japanese and English parallel corpora show that our method performs better than previous methods and that, owing to the properties of maximum entropy modeling, it can extract bilingual word pairs that do not appear in the training corpus with almost the same accuracy as pairs that do appear.

Classification of Syntactic and Semantic Usage of Japanese Abstract Nouns and Their Translations

SATORU IKEHARA, JIN'ICHI MURAKAMI, NOBORU KURUMAI

2002 Volume 9 Issue 1 Pages 117-134
Published: January 10, 2002
Released on J-STAGE: March 01, 2011

DOI https://doi.org/10.5715/jnlp.9.117

JOURNAL FREE ACCESS

Translation of highly abstract nouns has been one of the most difficult problems in Japanese-to-English machine translation. In order to develop translation rules, the meanings and usage of the typical Japanese abstract nouns “no”, “koto”, “mono”, “tokoro”, “toki”, and “wake” were studied, and their syntactic and semantic usage was classified. First, noting the semantic equivalence of “no” and the other abstract nouns, exchange rules for the noun “no” were studied. Next, the semantic and syntactic usage of these abstract nouns was analyzed and their translations were summarized.
The results were applied to 741 expressions extracted from newspapers, and translation quality was evaluated. The quality for the exchange rules was 96.9%, and the classification accuracy for 5 abstract nouns was 73% on average. The results are very useful for improving the translation quality of abstract nouns.
