Journal of Natural Language Processing (自然言語処理)
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
Volume 2, Issue 2
  • 榑松 明
    1995, Volume 2, Issue 2, p. 1
    Published: 1995/04/10
    Released online: 2011/03/01
    Journal, free access
  • 滝澤 修
    1995, Volume 2, Issue 2, p. 3-22
    Published: 1995/04/10
    Released online: 2011/03/01
    Journal, free access
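    The pun (駄洒落, dajare), a kind of figurative expression, ranks as a sophisticated rhetorical device in that the "ground" that makes the figure work (the relation linking the tenor and the vehicle of the figure) lies both in the linguistic sign (its sound) and in the meaning of the concept that the sign denotes. We asked 54 university students majoring in foreign languages to compose, in writing, puns of a type we call "juxtaposed" (for example, 「トイレに行っといれ」), and collected 203 of them. To obtain the insights needed to build a pun understanding system, we examined these data with attention to the relation between the "preceding vehicle" (「トイレ」 in the example) and the "following vehicle" (「…といれ」 in the example), and to the relation between the "surface vehicle" (「…といれ」 in the example) and the "restored vehicle" (「…ておいで」 in the example). We carried out three analyses: (1) how long a match is observed between the phoneme sequences of the preceding and following surface vehicles; (2) what characterizes the phoneme differences between the preceding and following surface vehicles; (3) what characterizes the phoneme differences between the surface and restored vehicles. The results show, among other things, that the number of syllables in the surface vehicle is often the same for the preceding and following vehicles, and that phoneme differences between the preceding and following surface vehicles, and between the surface and restored vehicles, are relatively few and, where they occur, highly regular. These findings suggest that a method for computational pun understanding, namely an algorithm that identifies the surface and restored vehicles, can be constructed.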
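Analyses (1) to (3) in the abstract above compare phoneme sequences. As a rough illustration only, the sketch below measures the longest shared run and lists the differing spans between two sequences. It is not the authors' procedure: the function names are hypothetical, and romanized letters stand in for phonemes, which is a simplifying assumption.

```python
from difflib import SequenceMatcher


def longest_shared_run(preceding, following):
    """Return the longest contiguous run of symbols found in both sequences."""
    matcher = SequenceMatcher(None, preceding, following, autojunk=False)
    match = matcher.find_longest_match(0, len(preceding), 0, len(following))
    return preceding[match.a:match.a + match.size]


def phoneme_edits(a, b):
    """List the spans where sequence a and sequence b differ.

    Each item is (operation, span_in_a, span_in_b), where operation is
    'replace', 'delete', or 'insert' as reported by difflib.
    """
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    return [
        (tag, a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Assumed romanization of the abstract's example: preceding vehicle "toire",
# following surface vehicle embedded in "ittoire".
shared = longest_shared_run(list("toire"), list("ittoire"))
```

With this romanization the shared run is the five symbols t, o, i, r, e, which matches the abstract's observation that preceding and following vehicles often agree in length.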
  • Hideto Tomabechi
    1995, Volume 2, Issue 2, p. 23-58
    Published: 1995/04/10
    Released online: 2011/03/01
    Journal, free access
    Graph unification has become a central processing mechanism of many natural language systems due to the popularity of unification-based theories of computational linguistics. Despite this popularity, it remains the most expensive part of unification-based natural language processing: graph unification alone often takes over 90% of total parsing time. We focus on two criteria in the design of an efficient unification algorithm: 1) elimination of excessive copying and 2) quick detection of unification failures. We propose a scheme, based on the notion of quasi-destructive unification, that meets these criteria without expensive overhead for reversing the changes made to the graph node structures. Our experiments, using an actual large-scale grammar and also a simulated grammar producing different unification success rates, show that the quasi-destructive graph unification algorithm runs roughly twice as fast as Wroblewski's non-destructive unification algorithm.
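The idea in the abstract above, making changes that need no explicit undo and copying only on success, can be sketched with generation stamps: temporary fields on a node count only while their stamp equals a global counter, so incrementing the counter discards them all at once. This is a simplified illustration, not the paper's algorithm. The class and field names are hypothetical, and the sketch assumes acyclic graphs whose two inputs share no nodes.

```python
class Node:
    def __init__(self, atom=None, arcs=None):
        self.atom = atom                # atomic value; None for complex/variable nodes
        self.arcs = dict(arcs or {})    # permanent arcs: feature label -> Node
        # Temporary fields; each is valid only if its stamp equals the current generation.
        self.forward, self.forward_gen = None, -1
        self.comp_arcs, self.comp_gen = {}, -1
        self.copy, self.copy_gen = None, -1


class Unifier:
    def __init__(self):
        self.gen = 0  # global generation counter

    def _deref(self, n):
        # Follow forwarding pointers set during the current generation only.
        while n.forward is not None and n.forward_gen == self.gen:
            n = n.forward
        return n

    def _temp_arcs(self, n):
        return n.comp_arcs if n.comp_gen == self.gen else {}

    def _forward(self, src, dst):
        src.forward, src.forward_gen = dst, self.gen

    def _unify1(self, a, b):
        a, b = self._deref(a), self._deref(b)
        if a is b:
            return True
        if a.atom is not None and b.atom is not None:
            if a.atom != b.atom:
                return False            # early failure, nothing was copied
            self._forward(b, a)
            return True
        if a.atom is not None or b.atom is not None:
            atomic, other = (a, b) if a.atom is not None else (b, a)
            if other.arcs or self._temp_arcs(other):
                return False            # atom cannot unify with a complex node
            self._forward(other, atomic)
            return True
        self._forward(b, a)
        a_arcs = {**a.arcs, **self._temp_arcs(a)}
        for label, b_child in {**b.arcs, **self._temp_arcs(b)}.items():
            if label in a_arcs:
                if not self._unify1(a_arcs[label], b_child):
                    return False
            else:
                if a.comp_gen != self.gen:   # stale temporary arcs: start afresh
                    a.comp_arcs, a.comp_gen = {}, self.gen
                a.comp_arcs[label] = b_child
        return True

    def _copy(self, n):
        n = self._deref(n)
        if n.copy_gen == self.gen:
            return n.copy               # preserve reentrancy in the result
        new = Node(atom=n.atom)
        n.copy, n.copy_gen = new, self.gen
        for label, child in {**n.arcs, **self._temp_arcs(n)}.items():
            new.arcs[label] = self._copy(child)
        return new

    def unify(self, a, b):
        """Return a new graph on success, or None on failure; inputs stay intact."""
        result = self._copy(a) if self._unify1(a, b) else None
        self.gen += 1                   # invalidates all temporary fields at once
        return result
```

A failed attempt therefore costs no copying and no rollback pass: the stale forwarding pointers and temporary arcs are simply ignored by later calls.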
  • Hozumi Tanaka, Takenobu Tokunaga, Michio Aizawa
    1995, Volume 2, Issue 2, p. 59-74
    Published: 1995/04/10
    Released online: 2011/03/01
    Journal, free access
    Morphological analysis of Japanese is very different from that of English, because no spaces are placed between words. This is also the case in many Asian languages such as Korean, Chinese, and Thai. In the Indo-European family, some languages such as German show the same phenomenon when forming complex noun phrases. Processing such languages first requires identifying the word boundaries, a process often called segmentation. Segmentation is very important, since a wrong segmentation causes fatal errors in later stages such as syntactic, semantic, and contextual analysis. However, correct segmentation is not always possible with morphological information alone; syntactic, semantic, and contextual information is also necessary to resolve segmentation ambiguities. This paper proposes a method that integrates morphological and syntactic analysis based on the LR parsing algorithm. An LR table derived from grammar rules is modified on the basis of connectabilities between two adjacent words, so that the modified table reflects both morphological and syntactic constraints. Using this LR table with the generalized LR parsing algorithm makes efficient morphological and syntactic analysis possible.
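The role of connectability between adjacent words in the abstract above can be illustrated with a small dictionary-based segmenter that rejects any segmentation containing a non-connectable pair of categories. This is plain exhaustive search with an adjacency filter, not the LR-table method of the paper. The lexicon, the category names, and the connectability table are toy assumptions.

```python
def segment(text, lexicon, connectable):
    """Enumerate segmentations of unspaced text.

    lexicon maps a word to its possible categories; connectable is a set of
    (left_category, right_category) pairs allowed to be adjacent.
    """
    results = []

    def extend(pos, prev_cat, words):
        if pos == len(text):
            results.append(words)
            return
        for end in range(pos + 1, len(text) + 1):
            word = text[pos:end]
            for cat in lexicon.get(word, ()):
                # The first word has no left neighbour to check against.
                if prev_cat is None or (prev_cat, cat) in connectable:
                    extend(end, cat, words + [(word, cat)])

    extend(0, None, [])
    return results


# Toy data (hypothetical): "thecatsat" is ambiguous between
# the|cat|sat and the|cats|at on dictionary evidence alone.
LEXICON = {"the": ["DET"], "cat": ["N"], "cats": ["N"], "sat": ["V"], "at": ["P"]}
CONNECTABLE = {("DET", "N"), ("N", "V")}

analyses = segment("thecatsat", LEXICON, CONNECTABLE)
```

With this table the reading the|cats|at is discarded because the pair (N, P) is not listed as connectable, leaving a single analysis.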
  • Satoshi Sekine
    1995, Volume 2, Issue 2, p. 75-87
    Published: 1995/04/10
    Released online: 2011/03/01
    Journal, free access
    There have been a number of theoretical studies devoted to the notion of sublanguage. Furthermore, some successful natural language processing systems have explicitly or implicitly exploited sublanguage restrictions. However, two major problems must still be solved before the sublanguage notion can be put to use: 1) automatic definition of sublanguages and dynamic identification of the sublanguage of a text, and 2) automatic acquisition of linguistic knowledge for a sublanguage. The appearance of large machine-readable corpora now offers new opportunities to address these problems. Although several experiments have attempted to solve the second problem, the first has received less attention. In previous sublanguage NLP systems, the domain the system deals with was defined by a human. This is in fact one way to define the sublanguage of a text, and in a sense it seems to work well. However, it is not always possible, and it may sometimes be wrong. To maximize the benefit of the sublanguage notion, we need automatic definition and dynamic identification of sublanguages. We report preliminary experiments on sublanguage definition and identification based on lexical appearance. The results show that the proposed methods can be useful in processing a new text. In particular, the fact that the first two sentences can reliably identify a text's sublanguage encourages further investigation along this line. In conclusion, it appears that inductive definition of sublanguages and sublanguage identification would be beneficial for natural language processing.
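Identification "based on lexical appearance", as in the abstract above, can be pictured as comparing the words of a new text with word-frequency profiles of candidate sublanguages. The abstract does not state which measure the experiments use, so cosine similarity over word counts is swapped in here purely for illustration. The function names and the two toy profiles are hypothetical.

```python
import math
import re
from collections import Counter


def bag(text):
    """Count lowercase alphabetic tokens in a text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(p, q):
    """Cosine similarity between two word-count profiles."""
    dot = sum(p[w] * q[w] for w in p.keys() & q.keys())
    norm = math.sqrt(sum(v * v for v in p.values())) * math.sqrt(
        sum(v * v for v in q.values())
    )
    return dot / norm if norm else 0.0


def identify(text, profiles):
    """Return the name of the profile most similar to the text."""
    words = bag(text)
    return max(profiles, key=lambda name: cosine(words, profiles[name]))


# Toy profiles standing in for sublanguages induced from a corpus.
PROFILES = {
    "weather": bag("rain wind cloud temperature rain forecast"),
    "finance": bag("stock market shares price market"),
}

label = identify("Heavy rain and wind are in the forecast.", PROFILES)
```

Passing only the opening sentences of a document to `identify` mirrors the abstract's observation that the first two sentences can already be enough to pick the sublanguage.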