Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
Volume 12, Issue 1
Displaying 1-7 of 7 articles from this issue
  • [in Japanese]
    2005 Volume 12 Issue 1 Pages 1-2
    Published: January 10, 2005
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    Download PDF (276K)
  • TOMOYA NORO, TAIICHI HASHIMOTO, TAKENOBU TOKUNAGA, HOZUMI TANAKA
    2005 Volume 12 Issue 1 Pages 3-32
    Published: January 10, 2005
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    Although large-scale grammars are a prerequisite for parsing a wide variety of sentences, it is difficult to build such grammars by hand. However, a context-free grammar (CFG) can be derived automatically from an existing large-scale, syntactically annotated corpus. Although this seems a simple task, CFGs derived in this fashion have seldom been applied in existing systems, probably because they produce a great number of possible parse results (i.e., high ambiguity). In this paper, we analyze some causes of this high ambiguity and propose a policy for building a large-scale Japanese CFG for syntactic parsing that decreases it. We also provide an experimental evaluation of the obtained CFG, showing a reduction in the number of parse results (reduced ambiguity) and improved parsing accuracy.
    Download PDF (4348K)
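The abstract's core step, reading CFG rules off a syntactically annotated corpus, can be sketched as follows. The tree encoding and the toy sentence are illustrative assumptions, not the paper's actual treebank format.

```python
from collections import Counter

def extract_rules(tree, rules):
    """Collect CFG productions (LHS, RHS-label tuple) with counts from a
    parse tree given as (label, children); leaves are plain word strings."""
    label, children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    rules[(label, rhs)] += 1
    for child in children:
        if not isinstance(child, str):
            extract_rules(child, rules)

# Toy annotated sentence (hypothetical): (S (NP the dog) (VP barks))
tree = ("S", [("NP", ["the", "dog"]), ("VP", ["barks"])])
rules = Counter()
extract_rules(tree, rules)
```

Aggregating such counts over a whole treebank yields the grammar; the ambiguity problem the paper addresses arises because many distinct right-hand sides end up sharing the same left-hand side.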
  • Mohammad Teduh Uliniansyaht, Shun Ishizaki
    2005 Volume 12 Issue 1 Pages 33-50
    Published: January 10, 2005
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    It is common for a word in any natural language to have more than one meaning or sense. A word sense disambiguation (WSD) system is designed to determine which sense of a polysemous word is invoked in a particular context around the word. We propose methods to disambiguate the senses of polysemous words using the Naive Bayes classifier. Several sets of experimental data were taken from the Kompas daily newspaper homepage and used to construct the system. We modified the original Naive Bayes algorithm to apply it to the analysis of Indonesian. The experiments showed that our system achieved good accuracies (73-99%).
    Download PDF (1791K)
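A minimal sketch of the Naive Bayes WSD setup the abstract describes, with add-one smoothing. The toy senses and context words below are invented for illustration and are unrelated to the paper's Kompas data or its modified algorithm.

```python
from collections import Counter, defaultdict
import math

class NaiveBayesWSD:
    """Pick the sense of a polysemous word from its context words with
    add-one (Laplace) smoothed Naive Bayes."""

    def fit(self, examples):
        # examples: list of (context_words, sense) pairs
        self.sense_counts = Counter(sense for _, sense in examples)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for words, sense in examples:
            for w in words:
                self.word_counts[sense][w] += 1
                self.vocab.add(w)
        self.total = sum(self.sense_counts.values())
        return self

    def predict(self, words):
        best_sense, best_logp = None, float("-inf")
        for sense, count in self.sense_counts.items():
            logp = math.log(count / self.total)  # prior P(sense)
            denom = sum(self.word_counts[sense].values()) + len(self.vocab)
            for w in words:  # smoothed likelihood P(word | sense)
                logp += math.log((self.word_counts[sense][w] + 1) / denom)
            if logp > best_logp:
                best_sense, best_logp = sense, logp
        return best_sense

# Toy training data (invented): two senses of an ambiguous word.
clf = NaiveBayesWSD().fit([
    (["river", "water"], "shore"),
    (["money", "deposit"], "finance"),
])
```

In practice the context window size and the smoothing scheme are the main tuning knobs; the paper's own adaptation for Indonesian is not reproduced here.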
  • KAZUHIDE YAMAMOTO, YASUAKI ADACHI
    2005 Volume 12 Issue 1 Pages 51-78
    Published: January 10, 2005
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    We present a method of summarizing the minutes of the national Diet. The minutes have some peculiar traits: for example, honorifics appear frequently, and the text exhibits characteristics of both speech and written documents. In this paper, we focus on those traits and paraphrase or delete specific expressions. We paraphrased honorifics that appear frequently in the minutes. Similarly, we identified redundant parts using frequently appearing expressions and several clue words, and deleted those parts. Applying these processes to minutes that include spontaneous speech, we attained a summarization rate of about 80%. We also applied our system to the CSJ spoken language corpus and obtained a summarization rate of about 84%. These results indicate that the proposed approach works well not only for the minutes but also for other spoken language material.
    Download PDF (3003K)
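The paraphrase-then-delete pipeline can be caricatured as below. The substitution table and redundancy patterns are English stand-ins (the paper's actual Japanese honorifics and clue words are not reproduced), and "summarization rate" is taken here simply as the length ratio of summary to original.

```python
import re

def summarize(text, paraphrases, redundant_patterns):
    """Apply paraphrase substitutions, delete spans matching redundancy
    patterns, and return (summary, summary_len / original_len)."""
    original_len = len(text)
    for src, dst in paraphrases.items():
        text = text.replace(src, dst)
    for pattern in redundant_patterns:
        text = re.sub(pattern, "", text)
    text = re.sub(r"\s+", " ", text).strip()
    return text, len(text) / original_len

# Illustrative stand-ins, not the paper's Japanese expressions.
summary, rate = summarize(
    "um, the meeting, um, will start",
    paraphrases={},
    redundant_patterns=[r"um, "],
)
```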
  • YASUHIRO TOKUNAGA, KENTARO INUI, YUJI MATSUMOTO
    2005 Volume 12 Issue 1 Pages 79-105
    Published: January 10, 2005
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    This paper proposes a computational model for analyzing the communicative structure of computer-mediated chat dialogues and reports the present results of our empirical evaluation. We first formalize the communicative structure underlying chat dialogues by decomposing it into continuation relations and response relations. A continuation relation holds between utterances of the same speaker that together constitute a complete chunk functioning as a question, response, etc. (e.g., the relation between the separate utterances "Are" and "you a student?", which together constitute a question). A response relation, on the other hand, holds between utterances made by different speakers, e.g., a question and its response. Our model analyzes communicative structure by grouping utterances together according to these types of relations in a bottom-up fashion, using corpus-based supervised machine learning. We manually annotated a chat dialogue corpus with communicative structure (two-person and three-person dialogues: 69 dialogues in total, containing 11,905 utterance tokens). The automatic analyses matched the manual analyses for 87.4% of two-person dialogues and 84.6% of three-person dialogues.
    Download PDF (2987K)
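The bottom-up grouping of same-speaker utterances into chunks can be sketched as follows. The paper learns the continuation decision from an annotated corpus; the stand-in predicate below is a crude heuristic used only to make the sketch runnable.

```python
def group_continuations(utterances, is_continuation):
    """Merge adjacent utterances by the same speaker into one chunk
    whenever is_continuation(previous_text, next_text) holds.
    utterances: list of (speaker, text) pairs."""
    chunks = []
    for speaker, text in utterances:
        if chunks and chunks[-1][0] == speaker and is_continuation(chunks[-1][1], text):
            chunks[-1] = (speaker, chunks[-1][1] + " " + text)
        else:
            chunks.append((speaker, text))
    return chunks

# The abstract's own example: "Are" + "you a student?" form one question.
# Stand-in heuristic: continue unless the previous utterance already ends a sentence.
def heuristic(prev, nxt):
    return not prev.endswith(("?", ".", "!"))

chunks = group_continuations([("A", "Are"), ("A", "you a student?")], heuristic)
```

Response relations between the resulting chunks of different speakers would be identified in a second, analogous pass.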
  • TAKESHI ABEKAWA, MANABU OKUMURA
    2005 Volume 12 Issue 1 Pages 107-123
    Published: January 10, 2005
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    In this paper, we propose a new method of analyzing Japanese relative clauses. Japanese relative clause modification should be classified into at least two major semantic categories: case-slot gapping and head restrictive. Previous methods take into account only the information for judging a clause to be case-slot gapping, together with co-occurrence information between nouns and verbs. Our proposed method also takes into account the information for judging a clause to be head restrictive. Experimental results show that it yields higher accuracy than previous methods.
    Download PDF (1864K)
  • KEIJI SHINZATO, KENTARO TORISAWA
    2005 Volume 12 Issue 1 Pages 125-150
    Published: January 10, 2005
    Released on J-STAGE: March 01, 2011
    JOURNAL FREE ACCESS
    This paper describes an automatic acquisition method for hyponymy relations. Hyponymy relations play a crucial role in various natural language processing systems, and there have been many attempts to acquire these relations automatically from large-scale corpora. Most existing acquisition methods rely on particular linguistic patterns, such as juxtapositions, that signal hyponymy relations. Our method, however, does not use such linguistic patterns; instead, we acquire hyponymy relations from four different types of clues. The first is repetition of HTML tags found in ordinary HTML documents on the WWW. The second is statistical measures such as df and idf, which are popular in the IR literature. The third is verb-noun co-occurrences found in normal corpora. The fourth is heuristic rules obtained through our experiments on a development set.
    Download PDF (2995K)
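The second type of clue, the df and idf measures from IR, is standard and can be sketched over documents represented as word sets. The toy document collection is invented for illustration.

```python
import math

def df(term, docs):
    """Document frequency: number of documents containing the term."""
    return sum(1 for doc in docs if term in doc)

def idf(term, docs):
    """Inverse document frequency: log(N / df), a common IR weighting."""
    return math.log(len(docs) / df(term, docs))

# Toy document collection as word sets (invented for illustration).
docs = [{"fruit", "apple"}, {"fruit", "pear"}, {"car", "engine"}]
```

Terms with low idf are too common to be informative hypernym candidates, which is presumably why such weighting is useful as an acquisition clue.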