Journal of Natural Language Processing

[title in Japanese]

[in Japanese]

2003 Volume 10 Issue 5 Pages 1-2
Published: October 10, 2003
Released on J-STAGE: March 01, 2011

DOI https://doi.org/10.5715/jnlp.10.5_1

JOURNAL FREE ACCESS

A Method of LR Table Compaction for Natural Language Processing

TOMOYOSI AKIBA, KATUNOBU ITOU

2003 Volume 10 Issue 5 Pages 3-21
Published: October 10, 2003
Released on J-STAGE: March 01, 2011

DOI https://doi.org/10.5715/jnlp.10.5_3

This paper presents a method to reduce the size of the parsing table used in the LR parsing algorithm. The proposed method has three significant characteristics: (1) it can be applied together with any other known method for parsing table reduction, (2) the parsing tables it constructs can be used by an existing LR parser without modification, and (3) it affects neither the parsing results nor the parsing efficiency. We applied the method to construct reduced LR tables from several existing grammars used for NLP and compared them with the tables constructed by the ordinary method. Depending on the grammar, the tables produced by our method were between 25% and 60% of the original size.

A Spoken Dialogue Interface through Natural and Efficient Responses

KUMIKO OHMORI, HIROAKI SAITO

2003 Volume 10 Issue 5 Pages 23-40
Published: October 10, 2003
Released on J-STAGE: March 01, 2011

DOI https://doi.org/10.5715/jnlp.10.5_23

This paper proposes a new dialogue control method that uses “presuppositional responses” to handle a large number of target words in an efficient spoken dialogue interface. The strategy draws on the human tendency to presuppose that an utterance is a familiar or frequently spoken word. It is verified on a large data set of human recognition of 160,000 surnames. We introduce heuristics for deciding which words should be presuppositional: the presuppositional words should cover as many frequently used words as possible, while their number should stay small enough for highly accurate speech recognition. We report a successful implementation of a dialogue interface using a conventional speech recognition device. The interface handles cases where speech recognition fails or where the correct answer is not among the presuppositional words, so as not to irritate the user with unnecessary or roundabout questions. Real-time and natural responses are attained by searching infrequent words in parallel with the presuppositional ones.

Efficient Multidimensional Indexing Using One-dimensional Self-Organizing Maps

KENJI KITA, MASAMI SHISHIBORI

2003 Volume 10 Issue 5 Pages 41-54
Published: October 10, 2003
Released on J-STAGE: March 01, 2011

DOI https://doi.org/10.5715/jnlp.10.5_41

Nearest neighbor search in high dimensional spaces is an interesting and important problem which is relevant for a wide variety of applications, including multimedia information retrieval, data mining, and pattern recognition. For such applications, the curse of high dimensionality tends to be a major obstacle in the development of efficient indexing methods. This paper addresses the problem of designing an efficient multidimensional indexing structure for high dimensional nearest neighbor search. More specifically, using self-organizing maps (SOM), high-dimensional vector data are first transformed into one-dimensional units while preserving the higher order topology by mapping similar data items to the same or the neighboring unit. Then, given a query vector, only data items whose location is close to the unit location of the query are considered as candidates. Experimental results indicate that our scheme scales well even for a very large number of dimensions.
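
The indexing idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the training schedule, the linearly decaying neighborhood, the squared Euclidean distance, and all names (`train_som_1d`, `build_index`, `search`, `window`) are assumptions made for illustration. Vectors are assigned to units of a one-dimensional SOM, and a query examines only the items stored at its own unit and at neighboring units, so the result is an approximate nearest neighbor.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def best_unit(units, x):
    """Position (on the 1-D map) of the unit whose weights are closest to x."""
    return min(range(len(units)), key=lambda j: sq_dist(units[j], x))


def train_som_1d(data, n_units, epochs=20, seed=0):
    """Train a one-dimensional SOM; returns a list of weight vectors."""
    rng = random.Random(seed)
    units = [list(rng.choice(data)) for _ in range(n_units)]
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs                # decays from 1 towards 0
        lr = 0.5 * frac                            # assumed learning-rate schedule
        radius = max(1, int(n_units / 2 * frac))   # assumed neighborhood schedule
        for x in rng.sample(data, len(data)):
            b = best_unit(units, x)
            lo, hi = max(0, b - radius), min(n_units, b + radius + 1)
            for j in range(lo, hi):
                # Units nearer to the winner on the map move more.
                influence = lr * (1.0 - abs(j - b) / (radius + 1))
                units[j] = [w + influence * (xi - w) for w, xi in zip(units[j], x)]
    return units


def build_index(units, data):
    """Map each unit position to the indices of the data items assigned to it."""
    index = {j: [] for j in range(len(units))}
    for i, x in enumerate(data):
        index[best_unit(units, x)].append(i)
    return index


def search(units, index, data, query, window=1):
    """Approximate nearest neighbor: scan only the query's unit and its neighbors."""
    b = best_unit(units, query)
    lo, hi = max(0, b - window), min(len(units), b + window + 1)
    candidates = [i for j in range(lo, hi) for i in index[j]]
    if not candidates:
        return None
    return min(candidates, key=lambda i: sq_dist(data[i], query))
```

A larger `window` inspects more units and trades speed for recall; with `window` covering the whole map the search degenerates to a linear scan.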

Quantitative Analysis of Morpholexical Difference between Human-Translated and Machine-Translated Sentences

TAKEHIKO YOSHIMI

2003 Volume 10 Issue 5 Pages 55-74
Published: October 10, 2003
Released on J-STAGE: March 01, 2011

DOI https://doi.org/10.5715/jnlp.10.5_55

This paper carries out a quantitative analysis of the morpholexical differences between machine-translated and human-translated Japanese sentences, both obtained from English sentences selected randomly from news articles. The analysis gives the following results. (1) Machine translation is less likely than human translation to render one English sentence as multiple Japanese sentences. (2) The distribution of sentence length differs significantly between machine- and human-translated sentences. (3) A significant difference in the distribution of the adverbial and attributive forms of verbs and adjectives suggests that machine-translated sentences have more complex syntactic structure than human-translated ones. (4) No significant difference exists in the distribution of verbs, adjectives, and nouns.
A further investigation of verbs and nouns reveals which technical challenges must be solved to raise the quality of machine translation to the level of human translation.

Phrase Alignment for Example-Based Machine Translation

EIJI ARAMAKI, SADAO KUROHASHI, SATOSHI SATO, HIDEO WATANABE

2003 Volume 10 Issue 5 Pages 75-92
Published: October 10, 2003
Released on J-STAGE: March 01, 2011

DOI https://doi.org/10.5715/jnlp.10.5_75

Example-based machine translation requires a large set of translation patterns. In this paper, we propose a phrase alignment method that aims to acquire translation patterns from bilingual sentence pairs. Most previous methods rely on word alignment to obtain phrase alignment. Our method instead uses the basic phrase as the unit of phrase alignment and estimates the alignment between basic phrases. The experimental results show that this method performs well.

Analysis of Evaluation Results and Features of Sentence Extraction on Different Corpora

CHIKASHI NOBATA, SATOSHI SEKINE, HITOSHI ISAHARA

2003 Volume 10 Issue 5 Pages 93-120
Published: October 10, 2003
Released on J-STAGE: March 01, 2011

DOI https://doi.org/10.5715/jnlp.10.5_93

In this paper, we report evaluation results of our summarization system and an analysis of summarization data in three different types of corpora. We built a summarization system based on sentence extraction, applied it to the summarization of Japanese and English newspaper articles, and obtained excellent results. We also created sentence extraction data from Japanese lecture speeches and evaluated our system on them. Besides the evaluation results, we show an analysis of the relationships between key sentences and the features used in sentence extraction, and of the distributions of key sentences under combinations of features across these three types of corpora.

An Effective Matching to the Stored Q & A data using Initial Questions

KUNIO MATSUI, HOZUMI TANAKA

2003 Volume 10 Issue 5 Pages 121-138
Published: October 10, 2003
Released on J-STAGE: March 01, 2011

DOI https://doi.org/10.5715/jnlp.10.5_121

As part of customer service, there has been growing demand for call centers, whose function is to answer customer questions such as usage inquiries for purchased products. Providing precise answers to customer questions requires frequently updated knowledge about newly developed products. The stressful nature of the work tends to discourage call center operators from staying long with the center, and as a result companies incur increased personnel and training expenses to maintain a group of highly skilled operators.
This paper describes the basic technology employed in our interactive navigation system, which is designed to let users solve their own problems without operator intervention and thus to minimize operator work at call centers. The system guides users to the required Q & A data, expressed in natural language and stored in the call center database; the natural language expressions or questions entered by the users are analyzed and used as the retrieval input. To evaluate how important each term of the initial input question is to query construction, we have developed a new method that alters the key terms extracted from the set of initial questions matching the stored questions. We call this method the “success factor analysis method”. Our experiments show that terms limiting or decomposing the heads of the sentence greatly influence the search accuracy, and hence that the actual matching accuracy is substantially improved by empirically determining the priority of the terms and supplementing the query with these terms.

Evaluating a Method of Extracting Important Sentences using Distance between Entries in an Associative Concept Dictionary

JUN OKAMOTO, SHUN ISHIZAKI

2003 Volume 10 Issue 5 Pages 139-151
Published: October 10, 2003
Released on J-STAGE: March 01, 2011

DOI https://doi.org/10.5715/jnlp.10.5_139

In this paper, we propose a method for calculating importance scores for sentences for text summarization. In this method, sentence scores are calculated from quantitative distance information in an associative concept dictionary, which includes about 160,000 associated concepts. Eight articles are used to evaluate our method. The articles are chosen from Japanese elementary school textbooks because the dictionary was constructed using basic nouns in the textbooks. To evaluate the quality of the importance score ranking produced by our method and by conventional methods using term frequency (tf-idf), we carried out experiments in which 40 human subjects chose the five most important sentences from each of the eight articles. The evaluation results show that the sentences chosen by our method using association relationships agree more closely with those chosen by the human subjects. The results show that summarization accuracy can be improved by applying our method.
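
For orientation, the kind of term-frequency baseline mentioned in this abstract might be sketched as follows. This is an assumption-laden illustration, not the configuration used in the paper: each sentence is treated as its own document, tokens are split on whitespace (Japanese text would first need a morphological analyzer), the sentence score is the sum of tf × log(N/df) over its terms, and the name `rank_sentences` is hypothetical. The associative concept dictionary itself is not modeled here.

```python
import math
from collections import Counter


def rank_sentences(sentences, top_k=5):
    """Return indices of the top_k sentences ranked by summed tf-idf weight."""
    tokenized = [s.lower().split() for s in sentences]
    n = len(tokenized)
    # Document frequency: number of sentences containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Terms that occur in every sentence contribute log(1) = 0.
        scores.append(sum(c * math.log(n / df[t]) for t, c in tf.items()))
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    return order[:top_k]
```

With `top_k=5` this mirrors the experimental setup in which five sentences are selected per article, so the returned indices could be compared against the sentences chosen by human subjects.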
