Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
Volume 4, Issue 3
Displaying 1-6 of 6 articles from this issue
  • [in Japanese]
    1997 Volume 4 Issue 3 Pages 1-2
    Published: July 10, 1997
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    Download PDF (209K)
  • the right place to choose between translation equivalents
    John D. Phillips
    1997 Volume 4 Issue 3 Pages 3-25
    Published: July 10, 1997
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    This paper looks at the problem of choosing between alternative lexical translation equivalents in machine translation. It argues that the knowledge base used for making choices between alternative translations is part of the target language grammar and that it should be applied and the translation chosen as part of generation of the target language text. This contrasts with previous work on the problem, which has assumed that these choices should be made in analysis or transfer. Various types of knowledge base could be used to make the choice between alternatives. The method outlined here uses information about stereotypical contexts of use of words, stored as part of an ontological network.
    Download PDF (4536K)
  • MASAO UTIYAMA, SHUICHI ITAHASHI
    1997 Volume 4 Issue 3 Pages 27-50
    Published: July 10, 1997
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    This paper proposes a method which disambiguates verb senses using co-occurrence-based likelihood parameters whose sample spaces are extended according to a thesaurus. The method selects the most plausible sense if its likelihood is significantly greater than that of the second most plausible one. If not, the sample space is extended and the significance test is tried again. If it cannot be extended any further, the system gives up disambiguation. The method was applied to 74 polysemous verbs (about 89,000 instances) extracted from the EDR Japanese Corpus. When the most frequent sense was selected, the precision was 0.65 and the applicability, i.e. the ratio of the disambiguated verbs to the treated verbs, was 1.00. The proposed method was compared with a class-based method. With Bunruigoihyou, the precisions of both methods were 0.71, but the applicabilities of the proposed method and the class-based method were 0.73 and 0.68, respectively. With the EDR Concept Classification Dictionary, the precisions of both methods were 0.70, but the applicabilities of the proposed method and the class-based method were 0.87 and 0.76, respectively. The applicability of the proposed method is significantly higher than that of the class-based method, which shows the plausibility of the proposed method.
    Download PDF (2444K)
  • KOZO OI, EIICHIRO SUMITA, HITOSHI IIDA
    1997 Volume 4 Issue 3 Pages 51-70
    Published: July 10, 1997
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    Retrieval based on semantic similarity between words (hereafter, similarity-based retrieval) is one of the important problems in document retrieval technologies. In previous research on semantic similarity, measures of word similarity using a thesaurus whose hierarchical structure is balanced were used, and their effectiveness was shown in applications such as language translation and document retrieval. This paper proposes a general measure of similarity which is applicable to both balanced and unbalanced thesauri. In this proposed measure, the fewer the concepts under the most specific common abstraction between the concepts of two words, the larger the similarity between the words. The authors have implemented a similarity-based retrieval system using this semantic similarity and a large-scale thesaurus, the EDR thesaurus. Moreover, in order to improve its accuracy, they have incorporated a word sense disambiguation method into the retrieval system. This retrieval system is based on a conventional system, an extended boolean retrieval system using the physical nearness between words and the weight of words. Through contrastive experiments with the extended boolean system, the authors have shown an improvement in both recall and precision by the proposed similarity-based method.
    Download PDF (1730K)
  • KENJI KITA
    1997 Volume 4 Issue 3 Pages 71-82
    Published: July 10, 1997
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    This paper proposes a new method for automatically clustering languages. The basic idea of this method involves developing a probabilistic model for each language from the given linguistic data, and then computing the distances between languages according to a distance measure defined on the language models. Clustering is performed based on this distance measure. The paper realizes this idea for the case of the N-gram language model. The effectiveness of the proposed method has been confirmed by evaluation experiments using multilingual texts of nineteen different languages from the ECI Corpus (European Corpus Initiative Multilingual Corpus). The results were very encouraging: they were very close to the family tree of languages established in linguistics.
    Download PDF (1020K)
  • MASAHIRO OKU, KOJI MATSUOKA
    1997 Volume 4 Issue 3 Pages 83-99
    Published: July 10, 1997
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    Most Japanese texts are produced with Japanese word processors. As Japanese texts consist of phonograms, KANA, and ideograms, KANJI, Japanese word processors always use KANA-KANJI conversion, in which KANA sequences input through the keyboard are converted into KANA-KANJI sequences. Therefore, Japanese texts suffer from homophone errors caused by erroneous KANA-KANJI conversion. A homophone error occurs when a KANA sequence is converted into the wrong word which has the same reading. Detecting homophone errors is an important topic in Japanese text revision support systems. We have already proposed a high-performance method for handling Japanese homophone errors in compound nouns, used in REVISE. The method, however, has some drawbacks. To compensate for these drawbacks, this paper describes a method for detecting Japanese homophone errors in compound nouns that uses character cooccurrence. Character cooccurrence can be easily collected from existing texts without any analysis. Therefore, this method can be used, in a Japanese revision support system, as a complementary method for handling Japanese homophone errors in compound nouns. Moreover, as this method depends only on character cooccurrence, it can be applied not only to homophone errors but also to other types of errors such as character deletion.
    Download PDF (1704K)