Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
Volume 12, Issue 3
Displaying 1-11 of 11 articles from this issue
  • [in Japanese]
    2005 Volume 12 Issue 3 Pages 1-2
    Published: July 10, 2005
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
  • KAZUYA SHITAOKA, KIYOTAKA UCHIMOTO, TATSUYA KAWAHARA, HITOSHI ISAHARA
    2005 Volume 12 Issue 3 Pages 3-17
    Published: July 10, 2005
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    This paper describes methods for detecting sentence boundaries and dependencies between Japanese phrasal units, called bunsetsus, in a spontaneous speech corpus. In spontaneous monologues, the biggest problem for dependency structure analysis is that sentence boundaries are ambiguous. We propose two methods for improving the accuracy of sentence boundary detection in spontaneous Japanese speech: one is based on statistical machine translation using dependency information, and the other is based on text chunking using an SVM. The proposed methods achieved an F-measure of 84.9 for sentence boundary detection. Using automatically detected sentence boundaries also improved the accuracy of dependency structure analysis from 75.2% to 77.2%. Furthermore, the accuracy of both dependency structure analysis and sentence boundary detection improved when each task made use of the other's results.
  • KAZUHIDE YAMAMOTO, KAZUTERU OHASHI
    2005 Volume 12 Issue 3 Pages 19-42
    Published: July 10, 2005
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    We propose a method for paraphrasing Japanese verbal noun phrases into the corresponding Japanese compound nouns. This work is basic research toward understanding the mechanism of Japanese paraphrasing phenomena, and it also contributes to summarization for mobile devices, proofreading of official documents, and approximate string matching in information retrieval. The paraphrasing process involves judging the naturalness of each compound noun generated as a paraphrase candidate. We first discuss this issue and describe a criterion for determining whether a candidate is a proper output. A case element of the verb must then be transformed into a modifier of the noun, since paraphrasing reduces the verbness of the verbal noun. We illustrate this transformation and evaluate its correctness experimentally. Finally, we discuss the two roles of a verbal noun, i.e., verb and noun, and how its verbness changes with context.
  • KAZUTAKA SHIMADA, KOJI HAYASHI, TSUTOMU ENDO
    2005 Volume 12 Issue 3 Pages 43-66
    Published: July 10, 2005
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    Tables are an efficient way to express relational information, and most product information is written in tabular form. Extracting tables (specifications) is therefore an important task for handling product information such as specifications. We are developing a multi-specification summarization system. The specifications are written in ‹TABLE› tags, but the presence of ‹TABLE› tags in an HTML document does not necessarily indicate the presence of specifications: in one particular domain, fewer than 30% of HTML ‹TABLE› tags are real tables. In this paper, we propose a method for specification extraction using SVMs. To reduce the amount of training data required, we also evaluate transductive SVMs on this task. We evaluate the performance of SVMs and transductive SVMs on PC, digital still camera, and printer specifications. Experimental results show the effectiveness of our methods.
  • SANAE FUJITA, FRANCIS BOND
    2005 Volume 12 Issue 3 Pages 67-89
    Published: July 10, 2005
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    In this paper we investigate the properties of Japanese and English transitive-intransitive alternations. For Japanese alternations, we show that the selectional restrictions of alternating arguments are more similar than those of non-alternating arguments. Across languages, we show that there are four major strategies for translating alternating verbs. Finally, we present a method that uses alternation data to add new entries to an existing bilingual valency lexicon. If the existing lexicon has only one half of the alternation, then our method constructs the other half. The new entries have detailed information about argument structure and selectional restrictions. In this paper we focus on one class of alternations, but our method is applicable to any alternation. We were able to increase the coverage of the causative alternation to 85.4%, and the new entries gave an overall improvement in translation quality of 32%.
  • SHUN SHIRAMATSU, TAKASHI MIYAMA, HIROSHI G. OKUNO, KÔITI HASIDA
    2005 Volume 12 Issue 3 Pages 91-109
    Published: July 10, 2005
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    Centering theory aims to explain the relations among focus, anaphora, and cohesion. However, it fails to address any general principle behind anaphora. Moreover, although the salience of discourse entities plays a critical role in centering theory, it is not defined as an objectively measurable quantity. On the other hand, Hasida et al. (1995, 1996) propose the meaning game as a model of intentional communication and claim that it derives centering theory, but this claim has not yet been verified on the basis of substantial linguistic data. In this paper, we formulate salience in terms of reference probability, a measurable quantity. Under this formulation, the meaning game derives preferences that subsume two rules of centering theory. These preferences, which entail stronger predictions than centering theory, are verified on a Japanese corpus. The meaning game is hence a better working hypothesis than centering theory in terms of both theoretical clarity and predictive power. Domain-specific accounts such as centering theory are probably not necessary to explain anaphora, focus, and so on.
  • AKIHIRO SHINMORI, MANABU OKUMURA
    2005 Volume 12 Issue 3 Pages 111-128
    Published: July 10, 2005
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    Patent specifications consist of patent claims and detailed descriptions. While patent claims are the most important part of a patent specification, they are described compositionally or combinationally and are difficult to read. By aligning patent claims with the “detailed description”, (1) the functions and effects of the claims can be clarified, (2) the important elements in the claims can be identified, or (3) paraphrases for expressions in the claims can be obtained. In this paper, we propose a method for aligning patent claims with the “detailed description” by analyzing the structure of the claims to obtain their core elements and by performing local alignments starting from word blocks that include declinable words. The effectiveness of the method is demonstrated on 88 of 100 patent specifications randomly selected from the NTCIR3 patent data collection.
  • RYOHEI SASANO, DAISUKE KAWAHARA, SADAO KUROHASHI
    2005 Volume 12 Issue 3 Pages 129-144
    Published: July 10, 2005
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    This paper proposes a method for automatically constructing Japanese nominal case frames. The key point of our method is the integrated use of a dictionary and example phrases from large corpora. To examine the practical usefulness of the constructed nominal case frames, we built an indirect anaphora resolution system based on them. The case frames were evaluated by hand and confirmed to be of good quality. Experimental results on indirect anaphora resolution also indicated the effectiveness of our approach.
  • NGUYEN MY CHAU, TAKASHI IKEDA
    2005 Volume 12 Issue 3 Pages 145-182
    Published: July 10, 2005
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    This paper concerns machine translation from Japanese to Vietnamese. So far, there has been neither a Japanese-Vietnamese machine translation system on the MT software market nor any research on Japanese-to-Vietnamese machine translation. This paper aims to be a first step toward overcoming this situation. Japanese is an agglutinative language with SOV structure, whereas Vietnamese is an isolating language with SVO structure. This produces large differences between Japanese and Vietnamese expression structures. In this paper we focus on the difference between the Japanese adnominal embedding structure and its corresponding expressions in Vietnamese. We analyzed the lexical and syntactic relationships between the two languages and proposed machine translation rules for Japanese adnominal embedding structures. We evaluated our rules manually on 714 Japanese embedding sentences. The accuracy was around 87% (however, when applying the rules, we assumed that all the necessary information had been properly analyzed, although some of the rules are difficult to implement automatically at present). The proposed rules will be implemented in the machine translation system jaw/Vietnamese, which is now being developed in our laboratory.
  • KENTARO OGURA, YOSHIHIKO HAYASHI, SAEKO NOMURA, TORU ISHIDA
    2005 Volume 12 Issue 3 Pages 183-201
    Published: July 10, 2005
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    This paper analyzes the impact of user adaptation in MT-mediated communication. It clarifies how the user adapts to machine translation and how effective the adaptation is in terms of communication when the purpose of communication is clear. The most common alterations and their effectiveness strongly depend on the translation language pairs. In the case of Japanese-to-English translation, we observed two main alterations: replacing words or phrases to offset the difference in concepts between Japanese and English, and supplementing subjects to offset the difference in modes of expression between Japanese and English. Since Korean and Japanese are similar languages, Korean users exhibited similar adaptation tendencies. The adaptation performed by Japanese users when referring to the English translation was very effective in improving the quality of the English translations. However, it was not so effective for Chinese and even less effective for Korean translations.
  • NOZOMI KOBAYASHI, KENTARO INUI, YUJI MATSUMOTO, KENJI TATEISHI, TOSHIK ...
    2005 Volume 12 Issue 3 Pages 203-222
    Published: July 10, 2005
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    Interest has recently been growing in methods for extracting human opinions from large-scale heterogeneous text data such as Web documents. To automate the process of opinion extraction, it would be useful to have a collection of evaluative expressions such as “the seats are comfortable”. However, manually creating an exhaustive list of such expressions for many domains can be prohibitively costly, because they tend to be domain-dependent. Motivated by this background, we have been exploring ways to accelerate the collection of evaluative expressions by applying a text mining technique. This paper proposes a semi-automatic method that uses particular co-occurrence patterns of evaluated subjects, focused attributes, and values. Experimental results show its efficiency compared with manual collection of these expressions.