Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
Volume 10, Issue 5
Displaying 1-9 of 9 articles from this issue
  • [in Japanese]
    2003 Volume 10 Issue 5 Pages 1-2
    Published: October 10, 2003
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    Download PDF (214K)
  • TOMOYOSI AKIBA, KATUNOBU ITOU
    2003 Volume 10 Issue 5 Pages 3-21
    Published: October 10, 2003
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    This paper presents a method for reducing the size of the parsing table used in the LR parsing algorithm. The proposed method has three significant characteristics: (1) it can be applied together with any other known method for parsing table reduction, (2) the parsing tables it constructs can be used by existing LR parsers without modification, and (3) it affects neither the parsing results nor the parsing efficiency. We applied the method to construct reduced LR tables from several existing grammars used for NLP, and compared them with the tables constructed by the ordinary method. The reduced tables were between 25% and 60% of their original sizes, depending on the grammar.
    Download PDF (1767K)
  • KUMIKO OHMORI, HIROAKI SAITO
    2003 Volume 10 Issue 5 Pages 23-40
    Published: October 10, 2003
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    This paper proposes a new dialogue control method that uses “presuppositional responses” to handle a large number of target words in an efficient spoken dialogue interface. The strategy is based on the human tendency to presuppose that an utterance is a familiar or frequently spoken one. It is verified on a large data set of human recognition of 160,000 surnames. We introduce heuristics to determine which words should be presuppositional: the presuppositional words should cover as many frequently used words as possible, while the set should remain small enough for highly accurate speech recognition. We report a successful implementation of a dialogue interface using a conventional speech recognition device. We handle the cases in which speech recognition fails or the correct answer is not among the presuppositional words, so that the user is not irritated by unnecessary or roundabout questions. Real-time, natural responses are attained by searching non-frequent words in parallel with presuppositional ones.
    Download PDF (1976K)
  • KENJI KITA, MASAMI SHISHIBORI
    2003 Volume 10 Issue 5 Pages 41-54
    Published: October 10, 2003
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    Nearest neighbor search in high-dimensional spaces is an interesting and important problem that is relevant to a wide variety of applications, including multimedia information retrieval, data mining, and pattern recognition. For such applications, the curse of high dimensionality tends to be a major obstacle in the development of efficient indexing methods. This paper addresses the problem of designing an efficient multidimensional indexing structure for high-dimensional nearest neighbor search. More specifically, using self-organizing maps (SOM), high-dimensional vector data are first transformed into one-dimensional units while preserving the higher-order topology by mapping similar data items to the same or a neighboring unit. Then, given a query vector, only data items whose location is close to the unit location of the query are considered as candidates. Experimental results indicate that our scheme scales well even for a very large number of dimensions.
    Download PDF (2404K)
  • TAKEHIKO YOSHIMI
    2003 Volume 10 Issue 5 Pages 55-74
    Published: October 10, 2003
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    This paper carries out a quantitative analysis of the morpholexical differences between machine-translated and human-translated Japanese sentences, both obtained from English sentences selected randomly from news articles. The analysis gives the following results. (1) A tendency to translate one English sentence into multiple Japanese sentences is observed less in machine translation than in human translation. (2) A significant difference exists in the distribution of sentence length between machine- and human-translated sentences. (3) A significant difference in the distribution of the adverbial and attributive forms of verbs and adjectives suggests that machine-translated sentences have more complex syntactic structure than human-translated ones. (4) No significant difference exists in the distribution of verbs, adjectives, and nouns.
    A further investigation of verbs and nouns reveals which technical challenges must be solved to raise the quality of machine translation to the level of human translation.
    Download PDF (1916K)
  • EIJI ARAMAKI, SADAO KUROHASHI, SATOSHI SATO, HIDEO WATANABE
    2003 Volume 10 Issue 5 Pages 75-92
    Published: October 10, 2003
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    Example-based machine translation requires a large set of translation patterns. In this paper, we propose a phrase alignment method that aims to acquire translation patterns from bilingual sentence pairs. Most previous methods employ word alignment for phrase alignment. The proposed method instead uses the basic phrase as the unit of phrase alignment and estimates the alignment between basic phrases. The experimental results show that this method performs well.
    Download PDF (6954K)
  • CHIKASHI NOBATA, SATOSHI SEKINE, HITOSHI ISAHARA
    2003 Volume 10 Issue 5 Pages 93-120
    Published: October 10, 2003
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    In this paper, we report evaluation results for our summarization system and an analysis of summarization data in three different types of corpora. We created a summarization system based on sentence extraction, applied it to the summarization of Japanese and English newspaper articles, and obtained excellent results. We also created sentence extraction data from Japanese lecture speeches and evaluated our system on them. Besides the evaluation results of our system, we present an analysis of the relationships between key sentences and the features used in sentence extraction, and of the distributions of key sentences with a combination of features across these three types of corpora.
    Download PDF (2682K)
  • KUNIO MATSUI, HOZUMI TANAKA
    2003 Volume 10 Issue 5 Pages 121-138
    Published: October 10, 2003
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    As a part of customer service, there has been growing demand for call centers, whose function is to answer customer questions, such as usage inquiries for purchased products. To provide precise answers to customer questions, frequently updated knowledge is required for newly developed products. The stressful nature of the work tends to discourage call center operators from staying long with the center, and as a result, companies are incurring increased personnel and training expenses to maintain a group of highly skilled operators.
    This paper describes the basic technology employed in our interactive navigation system, which is designed to allow users to solve their own problems without operator intervention, thus minimizing operator work at call centers. The system is intended to guide users to the required Q & A data, expressed in natural language and stored in the call center database; the natural language expressions or questions entered by the users are analyzed and used as the retrieval input. To evaluate the importance, for query construction, of each term in the initial input question, we have developed a new method that alters the key terms extracted from the set of initial questions matching the stored questions. We call this method the “success factor analysis method”. Our experiments show that terms limiting or decomposing the heads of the sentence greatly influence search accuracy, and hence that the actual matching accuracy is substantially improved by empirically determining the priority of the terms and supplementing the query with these terms.
    Download PDF (6080K)
  • JUN OKAMOTO, SHUN ISHIZAKI
    2003 Volume 10 Issue 5 Pages 139-151
    Published: October 10, 2003
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    In this paper, we propose a method for calculating importance scores for sentences for text summarization purposes. In this method, sentence scores are calculated from quantitative distance information in an associative concept dictionary, which includes about 160,000 associated concepts. Eight articles are used to evaluate our method. The articles are chosen from Japanese elementary school textbooks because the dictionary was constructed using basic nouns in the textbooks. To evaluate the quality of the importance score ranking produced by our method and by conventional methods using term frequency (tf-idf), we carried out experiments in which 40 human subjects chose the five most important sentences from each of the eight articles. The evaluation results show that the sentences chosen by our method using association relationships are more similar to those chosen by human subjects. The results show that summarization accuracy can be improved by applying our method.
    Download PDF (1353K)
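
The last abstract above compares its approach against conventional term-frequency (tf-idf) sentence scoring. As a rough illustration of that kind of baseline only, and not of the authors' dictionary-based method, the following minimal sketch scores sentences by summed tf-idf weights. The function names `tfidf_sentence_scores` and `top_sentences`, the whitespace tokenization, and the choice to treat each sentence as one "document" for idf are all assumptions made for this example.

```python
import math
from collections import Counter


def tfidf_sentence_scores(sentences):
    """Return one tf-idf score per sentence (hypothetical baseline).

    Assumptions: tokens are lowercase whitespace-separated words, tf is the
    term count divided by sentence length, and idf is log(N / df) where each
    sentence counts as one document.
    """
    tokenized = [s.lower().split() for s in sentences]
    n = len(tokenized)
    # Document frequency: number of sentences that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        if not tokens:
            scores.append(0.0)  # an empty sentence carries no weight
            continue
        tf = Counter(tokens)
        scores.append(sum((count / len(tokens)) * math.log(n / df[term])
                          for term, count in tf.items()))
    return scores


def top_sentences(sentences, k=5):
    """Return the indices of the k highest-scoring sentences, in text order."""
    scores = tfidf_sentence_scores(sentences)
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])
```

Calling `top_sentences(article_sentences, k=5)` mirrors the setup in which five sentences are selected per article; a real evaluation on Japanese text would need a morphological analyzer in place of whitespace splitting.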