Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
Volume 21, Issue 5
Preface
Paper
  • Chenchen Ding, Mikio Yamamoto
    2014 Volume 21 Issue 5 Pages 981-1009
    Published: September 16, 2014
    Released on J-STAGE: December 16, 2014
    JOURNAL FREE ACCESS
    We design a language model based on a generative dependency structure for sentences. The parameters of the model are the probabilities of dependency N-grams, which are composed of lexical words with four types of extra tags used to model the dependency relation and valence. We further propose an unsupervised expectation-maximization algorithm for parameter estimation, in which all possible dependency structures of a sentence are considered. As the algorithm is language-independent, it can be used on a raw corpus from any language, without part-of-speech annotation, a treebank, or a trained parser. We conducted experiments on four languages, namely English, German, Spanish, and Japanese, to illustrate the applicability and the properties of the proposed approach. We further apply the approach to a Chinese microblog data set to extract and investigate Internet-based, non-standard lexical dependency features of user-generated content.
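    As a rough illustration of the dependency N-gram idea, the sketch below scores one fixed dependency tree with head-dependent probabilities; the L/R direction tags and the toy probability table are assumptions for illustration only and do not reproduce the paper's four-tag parameterization or its EM training over all possible structures.

```python
# Hypothetical sketch: scoring a sentence under a toy dependency bigram model.
# Tag names (L/R attachment direction) and the probability table are
# illustrative assumptions, not the paper's actual parameterization.
import math

# P(dependent, direction | head), from made-up counts.
DEP_BIGRAM_PROB = {
    ("<root>", ("saw", "R")): 0.6,
    ("saw", ("she", "L")): 0.4,    # "she" attaches to "saw" from the left
    ("saw", ("dog", "R")): 0.3,
    ("dog", ("a", "L")): 0.5,
}

def tree_log_prob(arcs):
    """arcs: list of (head, (dependent, direction)) pairs covering one tree."""
    logp = 0.0
    for head, dep in arcs:
        logp += math.log(DEP_BIGRAM_PROB.get((head, dep), 1e-6))  # smoothing floor
    return logp

arcs = [("<root>", ("saw", "R")), ("saw", ("she", "L")),
        ("saw", ("dog", "R")), ("dog", ("a", "L"))]
print(tree_log_prob(arcs))
```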
  • Hiroyuki Shinnou, Minoru Sasaki
    2014 Volume 21 Issue 5 Pages 1011-1035
    Published: September 16, 2014
    Released on J-STAGE: December 16, 2014
    JOURNAL FREE ACCESS
    In this paper, we apply learning under covariate shift to the problem of unsupervised domain adaptation for word sense disambiguation (WSD). This is a type of weighted learning in which the probability density ratio w(x) = PT(x)/PS(x) is used as the weight of each instance. However, w(x) tends to be small in WSD tasks. To address this problem, we calculate w(x) by estimating PT(x) and PS(x), where PS(x) is estimated by regarding the corpus that combines the source domain corpus and the target domain corpus as the source domain corpus. In the experiments, we use three domains in the BCCWJ corpus, OC (Yahoo! Chiebukuro), PB (books), and PN (newspapers), and 16 target words provided by the Japanese WSD task of SemEval-2. To calculate w(x), we also use uLSIF, which estimates w(x) directly without estimating PT(x) or PS(x). Moreover, we use the "p-power" method and the "relative probability density ratio" method to boost the obtained probability density ratio. The experiments show that our method is effective.
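    A rough sketch of this density-ratio weighting follows, assuming one-dimensional features and Gaussian kernel density estimates in place of the paper's estimators (uLSIF is not reproduced); the alpha and p arguments stand in for the relative-ratio and p-power adjustments mentioned above.

```python
# Illustrative sketch of covariate-shift instance weighting via a probability
# density ratio. The 1-D Gaussian features are stand-ins for real WSD features.
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
source = rng.normal(0.0, 1.0, 500)     # source-domain feature values
target = rng.normal(0.5, 1.2, 500)     # target-domain feature values

p_t = gaussian_kde(target)
# Following the abstract, P_S is estimated from the combined source+target data.
p_s = gaussian_kde(np.concatenate([source, target]))

def weights(x, alpha=0.0, p=1.0):
    """w(x) = P_T(x) / P_S(x), optionally relative (alpha) and p-powered."""
    pt, ps = p_t(x), p_s(x)
    w = pt / (alpha * pt + (1.0 - alpha) * ps)   # alpha = 0 gives the plain ratio
    return w ** p

print(weights(source[:5]))                    # plain density ratio
print(weights(source[:5], alpha=0.5, p=0.5))  # relative ratio, damped by p-power
```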
  • Katsuhiko Hayashi, Katsuhito Sudoh, Hajime Tsukada, Jun Suzuki, Masaak ...
    2014 Volume 21 Issue 5 Pages 1037-1057
    Published: September 16, 2014
    Released on J-STAGE: December 16, 2014
    JOURNAL FREE ACCESS
    This paper introduces a novel word reordering model for statistical machine translation that employs a shift-reduce parser for inversion transduction grammars. The proposed model also solves article generation problems simultaneously with word reordering. We applied it to the post-ordering of phrase-based machine translation (PBMT) for Japanese-to-English patent translation tasks. Our experimental results show that our method achieves a significant improvement of +3.15 BLEU points over the baseline PBMT system's 29.99 BLEU.
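    A toy sketch of the shift-reduce reordering idea with inversion transduction grammar actions is given below; the hand-written action sequence and token example are assumptions for illustration, the actions would in practice be predicted by a trained parser, and the joint article generation of the proposed model is omitted.

```python
# Minimal sketch of ITG-style reordering with shift-reduce actions.

def itg_reorder(tokens, actions):
    """Apply SHIFT / REDUCE-STRAIGHT / REDUCE-INVERTED actions to reorder tokens."""
    stack, queue = [], list(tokens)
    for act in actions:
        if act == "SHIFT":
            stack.append([queue.pop(0)])
        elif act == "REDUCE-STRAIGHT":          # concatenate in original order
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif act == "REDUCE-INVERTED":          # concatenate in swapped order
            right, left = stack.pop(), stack.pop()
            stack.append(right + left)
    return stack[-1]

# Toy head-final (Japanese-like) order moved toward an English-like order.
tokens = ["kare", "wa", "hon", "o", "yonda"]
actions = ["SHIFT", "SHIFT", "REDUCE-STRAIGHT",   # [kare wa]
           "SHIFT", "SHIFT", "REDUCE-STRAIGHT",   # [hon o]
           "SHIFT", "REDUCE-INVERTED",            # [yonda hon o]
           "REDUCE-STRAIGHT"]                     # [kare wa yonda hon o]
print(itg_reorder(tokens, actions))
```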
Survey paper
  • Shunji Umetani
    2014 Volume 21 Issue 5 Pages 1059-1090
    Published: September 16, 2014
    Released on J-STAGE: December 16, 2014
    JOURNAL FREE ACCESS
    The integer programming (IP) model is a general-purpose optimization model that can formulate a surprisingly wide class of real applications by using integer variables in linear programming (LP) models. Recent developments in IP software systems have significantly improved our ability to solve large-scale instances. However, it is still difficult for most non-expert users to formulate real applications as IP models, because all conditions need to be written as linear inequalities. This paper demonstrates how to use IP software systems and how to formulate real applications as IP models.
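    As a small illustration of writing conditions as linear inequalities over integer variables, the sketch below formulates a toy 0-1 knapsack problem with the open-source PuLP modeller; the choice of software and the data are assumptions for illustration, not ones made by the survey.

```python
# A 0-1 knapsack written as an integer program with PuLP (CBC as default solver).
from pulp import LpProblem, LpMaximize, LpVariable, lpSum, value

items = {"a": (6, 5), "b": (5, 4), "c": (4, 3), "d": (3, 3)}  # name: (value, weight)
capacity = 8

prob = LpProblem("knapsack", LpMaximize)
x = {i: LpVariable(f"x_{i}", cat="Binary") for i in items}    # 0/1 integer variables

prob += lpSum(v * x[i] for i, (v, w) in items.items())              # objective: total value
prob += lpSum(w * x[i] for i, (v, w) in items.items()) <= capacity  # linear capacity constraint

prob.solve()
print([i for i in items if x[i].value() == 1], value(prob.objective))
```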