Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
Volume 11, Issue 2
  • [in Japanese]
    2004 Volume 11 Issue 2 Pages 1-2
    Published: April 10, 2004
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
  • Jianmin Yao, Hao Yu, Tiejun Zhao, Sheng Li
    2004 Volume 11 Issue 2 Pages 3-20
    Published: April 10, 2004
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    To help develop a localization-oriented example-based machine translation (EBMT) system, an automatic machine translation evaluation method is implemented that adopts edit similarity, cosine correlation and the Dice coefficient as criteria. Experiments show that the evaluation method distinguishes well between translations of different intelligibility and fluency. The similarity between the Dice coefficient and cosine is analyzed mathematically and observed in the experiments. To verify the consistency between automatic and human evaluation methods, six machine translation systems are scored using both human and automatic methods. The evaluation results are compared and show consistency between the different evaluation methods. Statistical analysis is performed to validate the experimental results: correlation coefficients and significance tests at the 99% level are used to ensure the reliability of the results. Linear regression equations are built to map the automatic scores to human scores, and the regression equation is used to predict human scoring of machine translation systems. The prediction result is promising. Experimental results show that the proposed MT evaluation method is applicable to general MT systems as well as EBMT.
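    The three criteria named in the abstract above can be illustrated with a minimal sketch. The abstract does not give the paper's exact definitions, so everything here is an assumption: whitespace tokenization, edit similarity as one minus the Levenshtein distance divided by the longer length, cosine over token counts, and Dice over token sets. The function names and the example sentences are hypothetical.

```python
import math
from collections import Counter


def edit_distance(a, b):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def edit_similarity(a, b):
    """Assumed normalization: 1 - distance / length of the longer sequence."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


def cosine(a, b):
    """Cosine of the angle between the two token-count vectors."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = (math.sqrt(sum(v * v for v in ca.values()))
            * math.sqrt(sum(v * v for v in cb.values())))
    return dot / norm if norm else 0.0


def dice(a, b):
    """Dice coefficient over the two token sets."""
    sa, sb = set(a), set(b)
    total = len(sa) + len(sb)
    return 2 * len(sa & sb) / total if total else 0.0


# Hypothetical reference translation and MT output.
reference = "the file was saved successfully".split()
candidate = "the file saved successfully".split()

for name, metric in [("edit", edit_similarity), ("cosine", cosine), ("dice", dice)]:
    print(name, round(metric(reference, candidate), 4))
```

    Under these assumed definitions, when no token repeats within a sentence the two overlap measures reduce to Dice = 2|A∩B| / (|A| + |B|) and cosine = |A∩B| / sqrt(|A||B|). They coincide when the two sets have the same size, and Dice is otherwise the smaller of the two, which is one way to see why the two scores tend to track each other closely.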
  • HIROKO OHTSUKA, MASAO UTIYAMA, HITOSHI ISAHARA
    2004 Volume 11 Issue 2 Pages 21-66
    Published: April 10, 2004
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    We aim to extract intentions from open-ended questionnaires. Intentions include request, complaint, resignation and so forth. In this work we focus on extracting request intentions. To extract intentions, we first have to judge reliably whether a given response contains a request intention or not. Therefore, as a first step, we have developed a criterion for judging the existence of request intentions in responses. The criterion, which is based on paraphrasing, is described in detail in this paper. Our assumption is that a response with a request intention can be paraphrased into a typical request expression, e.g., “I would like to…,” while responses without a request cannot be paraphrased in this way. The criterion is evaluated in terms of objectivity, reproducibility and effectiveness. Objectivity is demonstrated by showing that machine learning methods can learn the criterion from a set of intention-tagged data; reproducibility, by showing that the judgments of three annotators are reasonably consistent; and effectiveness, by showing that judgments based not on the criterion but on intuition do not agree. This means the criterion is necessary to achieve reproducibility. These experiments indicate that the criterion can be used to judge the existence of request intentions in responses reliably.
  • KAZUYA SHITAOKA, HIROAKI NANJO, TATSUYA KAWAHARA
    2004 Volume 11 Issue 2 Pages 67-83
    Published: April 10, 2004
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    Transcriptions and speech recognition results of lectures include many expressions peculiar to spoken language. Thus, they must be transformed into document style for practical use. We apply the statistical approach used in machine translation to the automatic transformation of spoken language into document-style sentences. We deal with deletion of fillers, insertion of periods, insertion of particles, conversion to written expressions and unification of the end-of-sentence style. A beam search is introduced to apply these processes in an integrated manner. Experimental evaluation using real lecture transcriptions confirms that the statistical transformation framework works well, and we achieved high recall and precision rates for period and particle insertion.
  • KENJI IMAMURA, EIICHIRO SUMITA, YUJI MATSUMOTO
    2004 Volume 11 Issue 2 Pages 85-99
    Published: April 10, 2004
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    When machine translation (MT) knowledge is automatically constructed from bilingual corpora, redundant rules are acquired due to translation variety. These rules increase ambiguity or cause incorrect MT results. To overcome this problem, we constrain the sentences used for knowledge extraction to “the appropriate bilingual sentences for the MT.” In this paper, we propose a method using translation literalness to select appropriate sentences or phrases. The translation correspondence rate (TCR) is defined as the literalness measure.
    Based on the TCR, two automatic construction methods are tested. One is to filter the corpus before rule acquisition. The other is to split the acquisition process into two phases, where a bilingual sentence is divided into literal parts and the other parts before different generalizations are applied. The effects are evaluated by the MT quality, and about 8.6% of MT results were improved by the latter method.
  • TAKEHIKO YOSHIMI
    2004 Volume 11 Issue 2 Pages 101-113
    Published: April 10, 2004
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    As part of an attempt to reveal what technical challenges must be solved to raise the quality of machine translation to the level of human translation, this paper carries out a quantitative analysis of the distribution of familiarity ratings of verbs in machine-translated and human-translated Japanese sentences, both of which are obtained from English sentences selected randomly from news articles. The familiarity rating is measured using the database of familiarity ratings developed at NTT Communication Science Laboratories. The analysis found no significant difference in the distribution of familiarity ratings of verbs between machine- and human-translated sentences. This suggests that, as far as the translation of verbs alone is concerned, the quality of the investigated MT system has reached a certain standard.