Journal of Natural Language Processing

[title in Japanese]

[in Japanese]

2005 Volume 12 Issue 3 Pages 1-2
Published: July 10, 2005
Released on J-STAGE: March 01, 2011

DOI: https://doi.org/10.5715/jnlp.12.3_1

JOURNAL FREE ACCESS

Interaction between Dependency Structure Analysis and Sentence Boundary Detection in Spontaneous Japanese

KAZUYA SHITAOKA, KIYOTAKA UCHIMOTO, TATSUYA KAWAHARA, HITOSHI ISAHARA

2005 Volume 12 Issue 3 Pages 3-17
Published: July 10, 2005
Released on J-STAGE: March 01, 2011

DOI: https://doi.org/10.5715/jnlp.12.3_3

JOURNAL FREE ACCESS

This paper describes methods for detecting dependencies between Japanese phrasal units called bunsetsus, and for detecting sentence boundaries, in a spontaneous speech corpus. In spontaneous monologues, the biggest problem for dependency structure analysis is that sentence boundaries are ambiguous. We propose two methods for improving the accuracy of sentence boundary detection in spontaneous Japanese speech: one is based on statistical machine translation using dependency information, and the other is based on text chunking using SVMs. The proposed methods achieved an F-measure of 84.9 for sentence boundary detection. Using automatically detected sentence boundaries also improved the accuracy of dependency structure analysis from 75.2% to 77.2%. Furthermore, the accuracy of both dependency structure analysis and sentence boundary detection improved when each used the results of the other interactively.

Paraphrasing Verbal Noun Phrases into Compound Nouns

KAZUHIDE YAMAMOTO, KAZUTERU OHASHI

2005 Volume 12 Issue 3 Pages 19-42
Published: July 10, 2005
Released on J-STAGE: March 01, 2011

DOI: https://doi.org/10.5715/jnlp.12.3_19

JOURNAL FREE ACCESS

We propose and discuss a method for paraphrasing Japanese verbal noun phrases into the corresponding Japanese compound nouns. This work is basic research toward understanding the mechanism of Japanese paraphrasing phenomena, and it also contributes to summarization for mobile devices, proofreading of official documents, and approximate string matching in information retrieval. The paraphrasing process involves judging the naturalness of each compound noun generated as a paraphrase candidate. We first discuss this issue and describe a criterion for determining whether a candidate is a proper output. We then need to transform a case element of a verb into a modifier of a noun, since the verbness of the verbal noun is reduced by paraphrasing. We illustrate this transformation process in the paper and evaluate its correctness with the results of our experiments. Finally, we discuss the two roles of a verbal noun, i.e., verb and noun, and the change of verbness in context.

Product Specification Extraction Using SVM and Transductive SVM

KAZUTAKA SHIMADA, KOJI HAYASHI, TSUTOMU ENDO

2005 Volume 12 Issue 3 Pages 43-66
Published: July 10, 2005
Released on J-STAGE: March 01, 2011

DOI: https://doi.org/10.5715/jnlp.12.3_43

JOURNAL FREE ACCESS

Tables are an efficient way to express relational information, and most information about products is written in tabular form. Table (specification) extraction is therefore an important task for handling product information such as specifications. We are developing a multi-specification summarization system. Specifications are written in ‹TABLE› tags, but the presence of ‹TABLE› tags in an HTML document does not necessarily indicate the presence of specifications: in one particular domain, fewer than 30% of HTML ‹TABLE› tags are real tables. In this paper, we propose a method for specification extraction using SVMs. To reduce the amount of training data, we also evaluate this task using transductive SVMs. We evaluate the performance of SVMs and transductive SVMs on PC, digital still camera, and printer specifications. Experimental results show the effectiveness of our methods.

An Investigation into the Nature of Verbal Alternations and their Use in the Creation of Bilingual Valency Entries

SANAE FUJITA, FRANCIS BOND

2005 Volume 12 Issue 3 Pages 67-89
Published: July 10, 2005
Released on J-STAGE: March 01, 2011

DOI: https://doi.org/10.5715/jnlp.12.3_67

JOURNAL FREE ACCESS

In this paper we investigate the properties of Japanese and English transitive-intransitive alternations. For Japanese alternations, we show that the selectional restrictions of alternating arguments are more similar than those of non-alternating arguments. Across languages, we show that there are four major strategies for translating alternating verbs. Finally, we present a method that uses alternation data to add new entries to an existing bilingual valency lexicon: if the existing lexicon has only one half of an alternation, our method constructs the other half. The new entries have detailed information about argument structure and selectional restrictions. We focus on one class of alternations, but our method is applicable to any alternation. We were able to increase the coverage of the causative alternation to 85.4%, and the new entries gave an overall improvement in translation quality of 32%.

Dissolution of Centering Theory Based on Game Theory and Its Empirical Verification

SHUN SHIRAMATSU, TAKASHI MIYAMA, HIROSHI G. OKUNO, KÔITI HASIDA

2005 Volume 12 Issue 3 Pages 91-109
Published: July 10, 2005
Released on J-STAGE: March 01, 2011

DOI: https://doi.org/10.5715/jnlp.12.3_91

JOURNAL FREE ACCESS

Centering theory aims to explain the relations among focus, anaphora, and cohesion. However, it fails to address any general principle behind anaphora. Moreover, although the salience of discourse entities plays a critical role in centering theory, it is not defined as an objectively measurable quantity. On the other hand, Hasida et al. (1995, 1996) propose the meaning game as a model of intentional communication and claim that it derives centering theory, but this claim has not yet been verified on the basis of substantial linguistic data. In this paper, we formulate salience in terms of reference probability, a measurable quantity. Under this formulation, the meaning game derives preferences that subsume two rules of centering theory. These preferences, which entail stronger predictions than centering theory, are verified on a Japanese corpus. The meaning game is hence a better working hypothesis than centering theory in terms of both theoretical clarity and predictive power. Domain-specific accounts such as centering theory are probably not necessary to explain anaphora, focus, and so on.

Aligning Patent Claims with the “Detailed Description” for Readability

AKIHIRO SHINMORI, MANABU OKUMURA

2005 Volume 12 Issue 3 Pages 111-128
Published: July 10, 2005
Released on J-STAGE: March 01, 2011

DOI: https://doi.org/10.5715/jnlp.12.3_111

JOURNAL FREE ACCESS

Patent specifications consist of patent claims and detailed descriptions. While patent claims are the most important part of a patent specification, they are described compositionally or combinationally and are difficult to read. By aligning patent claims with the “detailed description”, (1) the functions and effects of the claims can be clarified, (2) the important elements in the claims can be identified, or (3) paraphrases for expressions in the claims can be obtained. In this paper, we propose a method for aligning patent claims with the “detailed description” by analyzing the structure of the claims to obtain their core elements and by performing local alignments starting from word blocks that include declinable words. The effectiveness of the method is demonstrated using 88 of 100 patent specifications randomly selected from the NTCIR3 patent data collection.

Automatic Construction of Nominal Case Frames and its Application to Indirect Anaphora Resolution

RYOHEI SASANO, DAISUKE KAWAHARA, SADAO KUROHASHI

2005 Volume 12 Issue 3 Pages 129-144
Published: July 10, 2005
Released on J-STAGE: March 01, 2011

DOI: https://doi.org/10.5715/jnlp.12.3_129

JOURNAL FREE ACCESS

This paper proposes a method for automatically constructing Japanese nominal case frames. The key point of our method is the integrated use of a dictionary and example phrases from large corpora. To examine the practical usefulness of the constructed nominal case frames, we built an indirect anaphora resolution system based on them. The case frames were evaluated by hand and were confirmed to be of good quality. Experimental results on indirect anaphora resolution also indicated the effectiveness of our approach.

Translation of Adnominal Modification Structures in Japanese-Vietnamese Machine Translation

NGUYEN MY CHAU, TAKASHI IKEDA

2005 Volume 12 Issue 3 Pages 145-182
Published: July 10, 2005
Released on J-STAGE: March 01, 2011

DOI: https://doi.org/10.5715/jnlp.12.3_145

JOURNAL FREE ACCESS

This paper concerns machine translation from Japanese to Vietnamese. So far, there has been neither a Japanese-Vietnamese machine translation system on the MT software market nor any research on Japanese-to-Vietnamese machine translation. This paper aims to be a first step toward overcoming this situation. Japanese is an agglutinative language with SOV structure, and Vietnamese is an isolating language with SVO structure. This produces large differences between Japanese and Vietnamese expression structures. In this paper we focus on the difference between the Japanese adnominal embedding structure and its corresponding expressions in Vietnamese. We analyzed the lexical and syntactic relationships between the two languages and proposed machine translation rules for Japanese adnominal embedding structures. We evaluated our rules manually on 714 Japanese embedding sentences. The accuracy was around 87% (however, when applying the rules, we assumed that all the necessary information had been properly analyzed, although some of the rules are difficult to implement automatically at present). The proposed rules are going to be implemented in the machine translation system jaw/Vietnamese, which is now being developed in our laboratory.

Language-Dependency in User's Adaptation for MT Systems in MT-mediated Communication

KENTARO OGURA, YOSHIHIKO HAYASHI, SAEKO NOMURA, TORU ISHIDA

2005 Volume 12 Issue 3 Pages 183-201
Published: July 10, 2005
Released on J-STAGE: March 01, 2011

DOI: https://doi.org/10.5715/jnlp.12.3_183

JOURNAL FREE ACCESS

This paper analyzes the impact of user adaptation in MT-mediated communication. It clarifies how the user adapts to machine translation and how effective the adaptation is in terms of communication when the purpose of communication is clear. The most common alterations and their effectiveness strongly depend on the translation language pairs. In the case of Japanese-to-English translation, we observed two main alterations: replacing words or phrases to offset the difference in concepts between Japanese and English, and supplementing subjects to offset the difference in modes of expression between Japanese and English. Since Korean and Japanese are similar languages, Korean users exhibited similar adaptation tendencies. The adaptation performed by Japanese users when referring to the English translation was very effective in improving the quality of the English translations. However, it was not so effective for Chinese and even less effective for Korean translations.

Collecting Evaluative Expressions for Opinion Extraction

NOZOMI KOBAYASHI, KENTARO INUI, YUJI MATSUMOTO, KENJI TATEISHI, TOSHIK ...

2005 Volume 12 Issue 3 Pages 203-222
Published: July 10, 2005
Released on J-STAGE: March 01, 2011

DOI: https://doi.org/10.5715/jnlp.12.3_203

JOURNAL FREE ACCESS

Interest has recently been growing in methods for extracting human opinions from large-scale heterogeneous text data such as Web documents. To automate the process of opinion extraction, a collection of evaluative expressions such as “the seats are comfortable” would be useful. However, manually creating an exhaustive list of such expressions for many domains can be prohibitively costly, because they tend to be domain-dependent. Motivated by this background, we have been exploring ways to accelerate the collection of evaluative expressions by applying a text mining technique. This paper proposes a semi-automatic method that uses particular co-occurrence patterns of evaluated subjects, focused attributes, and values. Experimental results show its efficiency compared with manual collection of those expressions.
