Journal of Natural Language Processing

[title in Japanese]

[in Japanese]

2004 Volume 11 Issue 5 Pages 1-2
Published: October 10, 2004
Released on J-STAGE: March 01, 2011

DOI: https://doi.org/10.5715/jnlp.11.5_1

JOURNAL FREE ACCESS

Synonymous Sentences Grouping with Multilingual Parallel Corpus

HIDEKI KASHIOKA

2004 Volume 11 Issue 5 Pages 3-18
Published: October 10, 2004
Released on J-STAGE: March 01, 2011

DOI: https://doi.org/10.5715/jnlp.11.5_3

JOURNAL FREE ACCESS

Recently, natural language processing research has been paying attention to data and processing techniques for paraphrasing. Unfortunately, little paraphrase data is available. Some studies have reported collecting synonymous expressions from parallel corpora. However, no corpus suitable for collecting paraphrase sets is available, so this approach yields only a few variations of expression in each paraphrase set. In this paper, we propose a grouping method based on the idea of recursively grouping synonymous sentences that are related through translation, and we decompose wrong groups using the DM-decomposition algorithm. Wrong groups contain expressions that cannot be paraphrased into one another, because some words or expressions have different meanings in different situations. We discuss our method and experimental results with BTEC, a multilingual parallel corpus.
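
The recursive grouping idea can be illustrated with a small sketch. The abstract does not specify the algorithm, so this assumes that "related through translation" means sharing a translation, directly or through a chain of sentences, and it implements that closure with a union-find structure. The function name `group_by_translation` and the toy sentence pairs are hypothetical, and the DM-decomposition step for splitting wrong groups is not implemented here.

```python
from collections import defaultdict

def group_by_translation(pairs):
    """Group source sentences that are linked through shared translations.

    `pairs` is an iterable of (source_sentence, target_sentence) tuples.
    Assumption: two source sentences belong to one group if a chain of
    shared translations connects them (transitive closure).
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Tag each sentence with its side so identical strings in the two
    # languages are never confused with each other.
    for src, tgt in pairs:
        union(("src", src), ("tgt", tgt))

    groups = defaultdict(set)
    for side, sentence in list(parent):
        if side == "src":
            groups[find((side, sentence))].add(sentence)
    return sorted(sorted(group) for group in groups.values())

# Hypothetical toy data, not taken from BTEC.
pairs = [
    ("thanks", "arigatou"),
    ("thank you", "arigatou"),
    ("thank you", "doumo"),
    ("excuse me", "sumimasen"),
    ("sorry", "sumimasen"),
]
print(group_by_translation(pairs))
```

In this toy data the second group joins "excuse me" and "sorry" only because both share one ambiguous translation. That is the kind of wrong group the abstract says must be decomposed afterwards.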

Paraphrasing Predicates from Written Language Specific Vocabulary into Spoken Language Vocabulary Using the World Wide Web

NOBUHIRO KAJI, MASASHI OKAMOTO, SADAO KUROHASHI

2004 Volume 11 Issue 5 Pages 19-37
Published: October 10, 2004
Released on J-STAGE: March 01, 2011

DOI: https://doi.org/10.5715/jnlp.11.5_19

JOURNAL FREE ACCESS

There are many differences between the expressions used in written language and those used in spoken language. This paper presents a method of paraphrasing vocabulary specific to written language into spoken language vocabulary. The two kinds of vocabulary can be distinguished based on their occurrence probabilities in written and spoken language corpora, which are automatically collected from the WWW. Experimental results indicated the effectiveness of our method. The precision of the collected corpora was 94%, and the accuracy of learning paraphrases was 79%.
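
One way to read the occurrence-probability criterion is as a comparison of relative frequencies across the two corpora. The sketch below assumes that reading. The function names, the `ratio` threshold, the `floor` smoothing value, and the English toy tokens are all hypothetical and are not taken from the paper, which works on Japanese predicates.

```python
from collections import Counter

def occurrence_prob(tokens):
    """Relative frequency of each token in a corpus given as a token list."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}

def written_specific(written_tokens, spoken_tokens, ratio=2.0, floor=1e-6):
    """Return words whose probability in the written corpus is at least
    `ratio` times their probability in the spoken corpus.

    `floor` stands in for the spoken probability of words that never occur
    in the spoken corpus; both parameters are illustrative assumptions.
    """
    p_written = occurrence_prob(written_tokens)
    p_spoken = occurrence_prob(spoken_tokens)
    return sorted(
        word
        for word, p in p_written.items()
        if p / max(p_spoken.get(word, 0.0), floor) >= ratio
    )

# Hypothetical toy corpora.
written = ["commence", "the", "meeting", "commence", "the"]
spoken = ["start", "the", "meeting", "the", "start", "the"]
print(written_specific(written, spoken))
```

Words flagged this way would be the candidates for paraphrasing into spoken language vocabulary. How the replacement word is chosen is a separate step that this sketch does not cover.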

Expansion of a Japanese-Uighur Bilingual Dictionary by Paraphrasing

YASUHIRO OGAWA, SATOSHI KAMATANI, MUHTAR MAHSUT, YASUYOSHI INAGAKI

2004 Volume 11 Issue 5 Pages 39-61
Published: October 10, 2004
Released on J-STAGE: March 01, 2011

DOI: https://doi.org/10.5715/jnlp.11.5_39

JOURNAL FREE ACCESS

In machine translation, the number of words in a bilingual dictionary has an important influence on translation quality. However, developing such a dictionary is very expensive. In this paper, we address this problem by paraphrasing a non-entry word into entry words. We divide the paraphrasing process into two steps: collecting and screening. In the collecting step, we generate paraphrasing expressions for an original word by using its lexical descriptions in a Japanese monolingual dictionary. In the subsequent screening step, we calculate the similarity between the original word and each of its paraphrasing expressions, and choose the best one. We applied this method to our Japanese-Uighur bilingual dictionary. As a result, appropriate Uighur words were given for 68.3% of non-entry words.

Interaction between Paraphraser and Transfer for Spoken Language Translation

Kazuhide Yamamoto

2004 Volume 11 Issue 5 Pages 63-86
Published: October 10, 2004
Released on J-STAGE: March 01, 2011

DOI: https://doi.org/10.5715/jnlp.11.5_63

JOURNAL FREE ACCESS

One of the problems in spoken language translation is the enormous variety of expressions not found in text translation. This sheer volume can lead to sparse translation coverage. To tackle this problem, we propose a machine translation model in which an input is translated through both source-language and target-language paraphrasing processes. In this paper, we discuss the source paraphrasing and language transfer processes, and the design of our translation model. In source language paraphrasing, we take the practical approach of untangling slight variations in the source language before transferring a source expression to its target. We discuss how effective our paraphrasing process is at reducing the variety found in spoken language, with a focus on how many source language patterns are reduced by paraphrasing. In the translation model, we propose an interaction model between the source language paraphraser and the transfer, unlike the conventional assembly-line process flow. Our evaluation shows that over 70% of the input utterances are expected to be changed in some way. As a result, one-fifth of all skeleton expressions can be merged into other skeletons, which increases the chances of obtaining correct translations. Furthermore, we observe that our interaction model with the paraphraser increases translation capability by 20-40 percentage points, regardless of the transfer knowledge size.

Paraphrasing as Machine Translation

Andrew Finch, Taro Watanabe, Yasuhiro Akiba, Eiichiro Sumita

2004 Volume 11 Issue 5 Pages 87-111
Published: October 10, 2004
Released on J-STAGE: March 01, 2011

DOI: https://doi.org/10.5715/jnlp.11.5_87

JOURNAL FREE ACCESS

This article presents two statistically-based methods of automatically generating paraphrases for sentences; one based on direct statistical machine translation, the other based on data-oriented techniques. These paraphrasers are evaluated by human judges, and compared to both human paraphrases and those generated by a simple baseline model. The data-oriented approach proved to be the most successful in this evaluation and a second experiment was conducted to determine the usefulness of machine-generated paraphrases when used to expand the reference set used for machine translation evaluation. Varying numbers of synthetic paraphrases were mixed with varying numbers of real references to determine the circumstances under which the addition of synthetic paraphrases might be useful. Nine different machine translation systems were evaluated in this study using scores from nine human judges. Three machine translation evaluation schemes were used to perform the machine translation evaluation: BLEU, NIST and mWER. The results show that the usefulness of the synthetic paraphrases depends on which of the machine translation evaluation methods is used. The paraphrases degraded the NIST performance, but improved the evaluation performance of both BLEU and mWER.

Universal Model for Paraphrasing: Using Transformation Based on a Defined Criteria

MASAKI MURATA, HITOSHI ISAHARA

2004 Volume 11 Issue 5 Pages 113-133
Published: October 10, 2004
Released on J-STAGE: March 01, 2011

DOI: https://doi.org/10.5715/jnlp.11.5_113

JOURNAL FREE ACCESS

Studies on paraphrasing are important in various research topics such as sentence generation, summarization, and question-answering. We describe a universal model for paraphrasing that transforms text according to defined criteria. We show that by using different criteria, we can construct different kinds of paraphrasing systems, including one for compressing sentences, one for polishing sentences, one for transforming written language into spoken language, one for transforming English words into synonyms with the same meaning that contain fewer “l” and “r” letters, and one for answering questions. Our model efficiently constructs systems and produces dynamic paraphrasing systems. It should prompt the creation of new paraphrasing systems in the future.

Automatic Paraphrase Acquisition Based on Matching of Definition Sentences in Plural Dictionaries

MASAKI MURATA, TOSHIYUKI KANAMARU, HITOSHI ISAHARA

2004 Volume 11 Issue 5 Pages 135-149
Published: October 10, 2004
Released on J-STAGE: March 01, 2011

DOI: https://doi.org/10.5715/jnlp.11.5_135

JOURNAL FREE ACCESS

Studies on paraphrasing are important in various research topics such as sentence generation, summarization, and question-answering. We describe the automatic extraction of paraphrases by matching definitions of the same word in two dictionaries, together with a new method for extracting these paraphrases. The new method obtained higher precision than the conventional frequency-based method, and it can be applied to other studies on paraphrase extraction. When only synonyms were judged correct, the method obtained a precision of 0.748 on the top 500 items and 0.222 on 500 randomly extracted items. When hypernyms and similar expressions were also judged correct, it obtained a precision of 0.954 on the top 500 items and 0.722 on 500 randomly extracted items.
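
The precision figures reported here are fractions of extracted items judged correct within a sample of 500. A minimal sketch of that arithmetic follows, assuming the extracted items are already ranked and each has a boolean human judgment. The function name `precision_at_k` and the toy judgments are hypothetical.

```python
def precision_at_k(judgments, k):
    """Fraction of the first k ranked items that were judged correct.

    `judgments` is a list of booleans in ranked order (best first).
    """
    top = judgments[:k]
    if not top:
        raise ValueError("need at least one judged item")
    return sum(top) / len(top)

# Hypothetical judgments for four ranked paraphrase candidates.
ranked = [True, True, False, True]
print(precision_at_k(ranked, 4))
```

The same function applied to a random sample instead of the top-ranked items gives the comparison figure, which is how a ranking method's benefit over unranked extraction can be read off.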

A Survey on Paraphrase Generation and Recognition

KENTARO INUI, ATSUSHI FUJITA

2004 Volume 11 Issue 5 Pages 151-198
Published: October 10, 2004
Released on J-STAGE: March 01, 2011

DOI: https://doi.org/10.5715/jnlp.11.5_151

JOURNAL FREE ACCESS

Paraphrases are alternative ways of conveying the same content. The language technology for processing paraphrases, namely, paraphrase generation and paraphrase recognition, has drawn the attention of an increasing number of researchers because of its potential contribution to a wide variety of natural language applications. This survey paper overviews recent research trends in paraphrase generation and recognition, and discusses future prospects, addressing the issues of the definition of paraphrases, transformation-based paraphrase generation, paraphrase recognition in question answering and multi-document summarization, and finally corpus-based knowledge acquisition.
