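Journal of Natural Language Processing (自然言語処理)

Preface

The Potential of Natural Language Processing

Junichi Fukumoto (福本淳一)

2013, Volume 20, Issue 4, p. 537
Issued: 2013/09/13
Released online: 2013/12/12

DOI: https://doi.org/10.5715/jnlp.20.537

Journal, free access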

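Papers

Synonymy Judgment of Japanese Predicates Using Multiple Linguistic Features

Tomoko Izumi (泉朋子), Tomohide Shibata (柴田知秀), Kuniko Saito (齋藤邦子), Yoshihiro Matsuo (松尾義博), Sadao Kurohashi (黒橋禎夫)

2013, Volume 20, Issue 4, p. 539-561
Issued: 2013/09/13
Released online: 2013/12/12

DOI: https://doi.org/10.5715/jnlp.20.539

Journal, free access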

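In text mining, which extracts useful information from large volumes of text, the diversity of predicate expressions that convey user complaints and requests is a major problem. In this paper, to group predicate expressions that describe the same event, we perform synonymy judgment, that is, recognizing whether two different predicates are synonymous. The targets are predicate expressions such as 「消費している」 ("is consuming") and 「食っている」 (literally "is eating") in 「メモリを消費している」 and 「メモリを食っている」, both of which mean "is using up memory". Based on an analysis of the linguistic structure of predicates, we use multiple kinds of linguistic knowledge, namely dictionary definition sentences, predicate attributes, distributional similarity, and functional expressions, as features for discriminative learning. Experiments show that the method judges predicate synonymy with higher accuracy than existing methods.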

How to Translate Dialects: A Segmentation-Centric Pivot Translation Approach

Michael Paul, Andrew Finch, Eiichiro Sumita

2013, Volume 20, Issue 4, p. 563-583
Issued: 2013/09/13
Released online: 2013/12/12

DOI: https://doi.org/10.5715/jnlp.20.563

Journal, free access

Recent research on multilingual statistical machine translation (SMT) focuses on the use of pivot languages to overcome resource limitations for certain language pairs. This paper proposes a new method to translate a dialect language into a foreign language by integrating transliteration approaches based on Bayesian alignment (BA) models with pivot-based SMT approaches. The advantages of the proposed method with respect to standard SMT approaches are threefold: (1) it uses a standard language as the pivot language and acquires knowledge about the relation between dialects and a standard language automatically, (2) it avoids segmentation mismatches between the input and the translation model by mapping the character sequences of the dialect language to the word segmentation of the standard language, and (3) it reduces the translation task complexity by using monotone decoding techniques. Experimental results for translating five Japanese dialects (Kumamoto, Kyoto, Nagoya, Okinawa, Osaka) into four Indo-European languages (English, German, Russian, Hindi) and two Asian languages (Chinese, Korean) revealed that the proposed method improves the translation quality of dialect translation tasks and outperforms standard pivot translation approaches that concatenate SMT engines for the majority of the investigated language pairs.

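A Multi-Document Summarization Model Based on the Redundancy-Constrained Knapsack Problem

Hitoshi Nishikawa (西川仁), Tsutomu Hirao (平尾努), Toshiro Makino (牧野俊朗), Yoshihiro Matsuo (松尾義博), Yuji Matsumoto (松本裕治)

2013, Volume 20, Issue 4, p. 585-612
Issued: 2013/09/13
Released online: 2013/12/12

DOI: https://doi.org/10.5715/jnlp.20.585

Journal, free access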

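In this paper, we formulate multi-document summarization as a redundancy-constrained knapsack problem. A summarization model based on this problem is obtained by adding a constraint that reduces redundancy to a summarization model based on the knapsack problem. Because the problem is NP-hard and computationally expensive, we propose a decoding algorithm based on a Lagrangian heuristic as an approximate method for solving it quickly. In a ROUGE-based evaluation, the proposed summarization model, at its optimal solution, outperforms a summarization model based on the maximum coverage problem. We also evaluate summarization speed and find that the proposed decoding algorithm finds approximate solutions on par with the optimal solutions of the maximum-coverage-based model more than 100 times faster than an integer programming solver.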
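
As a rough illustration of the kind of model this abstract describes, the sketch below selects sentences to maximize a total score under a length budget (the knapsack part) while capping how many selected sentences may share a concept (an assumed form of the redundancy constraint; the abstract does not give the exact formulation). It uses exhaustive search, which is only feasible for tiny inputs, and is not the Lagrangian-heuristic decoder proposed in the paper. The names `Sentence`, `best_summary`, and `max_repeat`, and the example data, are hypothetical.

```python
from collections import Counter
from itertools import combinations
from typing import FrozenSet, List, NamedTuple, Tuple


class Sentence(NamedTuple):
    length: int               # cost of the sentence, e.g. its word count
    score: float              # importance score (assumed to be given)
    concepts: FrozenSet[str]  # content units used to measure redundancy


def best_summary(
    sentences: List[Sentence], budget: int, max_repeat: int
) -> Tuple[float, Tuple[int, ...]]:
    """Return (best score, indices of the chosen sentences) by brute force."""
    best_score: float = 0.0
    best_subset: Tuple[int, ...] = ()
    for size in range(1, len(sentences) + 1):
        for subset in combinations(range(len(sentences)), size):
            # Knapsack constraint: total length must fit within the budget.
            if sum(sentences[i].length for i in subset) > budget:
                continue
            # Assumed redundancy constraint: each concept may appear in
            # at most max_repeat of the selected sentences.
            counts = Counter(c for i in subset for c in sentences[i].concepts)
            if any(n > max_repeat for n in counts.values()):
                continue
            score = sum(sentences[i].score for i in subset)
            if score > best_score:
                best_score, best_subset = score, subset
    return best_score, best_subset


# Hypothetical toy input: four sentences with hand-assigned scores.
example = [
    Sentence(5, 3.0, frozenset({"earthquake", "tokyo"})),
    Sentence(4, 2.5, frozenset({"earthquake", "damage"})),
    Sentence(3, 2.0, frozenset({"rescue"})),
    Sentence(6, 4.0, frozenset({"damage", "rescue"})),
]

print(best_summary(example, budget=9, max_repeat=1))
print(best_summary(example, budget=9, max_repeat=2))
```

With `max_repeat=1` the search has to avoid sentence pairs that share a concept, so it settles for a lower-scoring but less redundant selection than with `max_repeat=2`. Because the number of subsets grows exponentially, a practical system would need an approximate decoder, which is the motivation the abstract gives for its Lagrangian heuristic.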
