The classification of dialog acts in users' utterances is an important fundamental technique in open-domain conversational systems. Most previous studies on dialog act classification were based on supervised machine learning; however, the characteristics of individual dialog acts were not considered. Some features for machine learning may increase the classification accuracy for a particular dialog act while decreasing the accuracy for other dialog acts. In this study, an appropriate feature set is defined for each dialog act to improve the performance of dialog act classification. First, 28 features are proposed as an initial set. Second, for each dialog act, an optimal feature set is identified by removing ineffective features from the initial set. Third, binary classifiers that judge whether a dialog act is suitable for a given utterance are trained using the optimized feature sets. Finally, one dialog act is chosen based on the results provided by the binary classifiers. The reliability of the binary classifiers' judgments is also considered. Experimental results showed that our proposed method significantly outperformed a baseline trained with a single feature set.
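Below is a minimal sketch of the overall classification scheme: one binary classifier per dialog act, each trained on its own reduced feature subset, with the final act chosen from the most confident classifier. The dialog-act labels, feature indices, and toy data are hypothetical stand-ins, not the paper's 28 features or corpus.

import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Toy data: 200 utterances represented by 6 candidate features.
X = rng.random((200, 6))
acts = ["question", "statement", "greeting"]
y = rng.integers(0, len(acts), size=200)

# Hypothetical optimized feature subsets, one per dialog act
# (in the paper, ineffective features are removed from the initial set).
feature_sets = {
    "question": [0, 1, 3],
    "statement": [1, 2, 4, 5],
    "greeting": [0, 5],
}

# One binary (one-vs-rest) classifier per dialog act, restricted
# to that act's feature subset.
classifiers = {}
for i, act in enumerate(acts):
    cols = feature_sets[act]
    clf = SVC(probability=True).fit(X[:, cols], (y == i).astype(int))
    classifiers[act] = (clf, cols)

def classify(x):
    # Choose the dialog act whose binary classifier is most confident.
    scores = {
        act: clf.predict_proba(x[cols].reshape(1, -1))[0, 1]
        for act, (clf, cols) in classifiers.items()
    }
    return max(scores, key=scores.get)

print(classify(rng.random(6)))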
This paper proposes a method for building a sentiment dictionary for textual analysis in finance using only news and stock price data. To obtain word polarity from stock price fluctuations, we calculate stock price returns following the announcements of news articles. We construct learners with support vector regression, using the stock price returns as supervised labels for the news articles, and build a sentiment dictionary by extracting word polarity from the learners. Furthermore, we examine whether our sentiment dictionary is effective in classifying news articles as negative or positive. We found that the dictionary is also effective in classifying news articles provided by news media other than the one we used to construct it. In addition, we found that it is difficult to classify news articles on dates two trading days away from the news announcement date.
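The following sketch shows one way such a dictionary can be derived, assuming a bag-of-words representation and a linear support vector regressor whose learned weights are read off as word polarities; the articles and returns below are hypothetical, whereas the paper uses post-announcement stock price returns as the supervised labels.

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.svm import LinearSVR

# Hypothetical news headlines with hypothetical post-announcement returns.
articles = [
    "profit rises on strong demand",
    "company reports record loss",
    "sales fall amid weak outlook",
    "earnings beat analyst forecasts",
]
returns = [0.02, -0.03, -0.015, 0.025]

vec = CountVectorizer()
X = vec.fit_transform(articles)

# Regress returns on word counts with support vector regression.
svr = LinearSVR(C=1.0).fit(X, returns)

# Word polarity = the weight the regressor learned for that word.
dictionary = dict(zip(vec.get_feature_names_out(), svr.coef_))
for word, polarity in sorted(dictionary.items(), key=lambda kv: kv[1]):
    print(f"{word:10s} {polarity:+.3f}")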
We developed a cross-lingual recommender system that uses collaborative filtering with English-Japanese translation pairs of product names to help non-Japanese, English-speaking buyers visiting Japanese shopping websites. Customer purchase histories from an English shopping site and from a Japanese shopping site were used in the experiments. Two experiments were conducted to evaluate the system: (1) two-fold cross-validation in which half of the translation pairs were masked, and (2) experiments in which all of the translation pairs were used. In the first set of experiments, the precision, recall, and mean reciprocal rank (MRR) of the system were evaluated to assess its general performance. We also investigated the effect of formatting the translation pairs and the performance according to the type of feature value in the vectors (binary versus rating values). The second experiment, in contrast, showed what kinds of items were recommended in a more realistic scenario. The results reveal that masked items were found more efficiently than with a bestseller recommender system and, further, that in the more realistic scenario the system could find items available only on the Japanese site that appeared to be related to the buyers' interests.
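A minimal sketch of the cross-lingual setting is given below: translation pairs map Japanese product names into the English item space so that purchase histories from both sites share one user-item matrix, and item-based collaborative filtering is run on top. The product names, translation pairs, and purchase histories are invented for illustration, and the binary feature values correspond to one of the two settings compared in the paper.

import numpy as np

# Hypothetical English-Japanese translation pairs of product names.
translation_pairs = {"緑茶": "green tea", "炊飯器": "rice cooker"}

def canonical(item):
    # Map a Japanese product name to its English counterpart if paired.
    return translation_pairs.get(item, item)

# Purchase histories from both sites, merged into one item space.
purchases = {
    "en_user1": {"green tea", "matcha bowl"},
    "en_user2": {"rice cooker", "green tea"},
    "jp_user1": {canonical("緑茶"), canonical("炊飯器"), "bento box"},
}

items = sorted({i for basket in purchases.values() for i in basket})
index = {item: k for k, item in enumerate(items)}

# Binary user-item matrix (the paper also compares rating values).
M = np.zeros((len(purchases), len(items)))
for u, basket in enumerate(purchases.values()):
    for item in basket:
        M[u, index[item]] = 1.0

# Item-item cosine similarity.
norm = np.linalg.norm(M, axis=0, keepdims=True)
sim = (M.T @ M) / np.clip(norm.T * norm, 1e-9, None)

def recommend(user, topn=2):
    owned = purchases[user]
    scores = sim[:, [index[i] for i in owned]].sum(axis=1)
    ranked = [items[k] for k in np.argsort(-scores) if items[k] not in owned]
    return ranked[:topn]

print(recommend("en_user1"))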
Domain adaptation is a major challenge when machine translation is applied to practical tasks. In this study, we present domain adaptation methods for machine translation that assume multiple domains. The proposed methods combine two types of models: a corpus-concatenated model covering multiple domains and single-domain models that are accurate but sparse in specific domains. We combine the advantages of both model types using feature augmentation for domain adaptation in machine learning, whereas the conventional feature-augmentation method for machine translation uses a single model. Our experimental results show that the translation quality of the proposed method improved on, or matched, that of the single-domain models. The proposed method is extremely effective in low-resource domains. Even in domains with a million bilingual sentences, the translation quality was at least preserved and even improved in some domains. These results demonstrate that state-of-the-art domain adaptation can be realized with appropriate model selection and appropriate settings, even when standard log-linear models are used.
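The sketch below illustrates the feature-augmentation idea in the log-linear setting: each candidate's features get a shared copy plus a copy that is active only in the candidate's own domain, so a single weight vector can learn both general and domain-specific preferences. The domain names and the example feature vector (a corpus-concatenated model score, a single-domain model score, and a length penalty) are hypothetical.

import numpy as np

domains = ["patent", "news", "chat"]

def augment(features, domain):
    # Duplicate the feature vector into a shared block plus one
    # block per domain; only the active domain's block is non-zero.
    d = len(features)
    out = np.zeros(d * (1 + len(domains)))
    out[:d] = features                          # shared (general) block
    k = domains.index(domain)
    out[d * (k + 1): d * (k + 2)] = features    # active domain block
    return out

# Hypothetical log-linear features of one translation candidate.
h = np.array([-4.2, -3.8, -0.5])
print(augment(h, "news"))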
In this paper, we describe a novel method for joint word alignment and symmetrization. Starting from initial parameters given by simple IBM models, we synchronously parse the parallel sentence pair under bracketing transduction grammar constraints. Our two-phase method achieves nearly the same run-time as fast_align while delivering better alignments for distantly related language pairs such as English–Japanese. We show how to integrate the method into a standard phrase-based SMT pipeline. Although the alignment quality results are mixed, by forcing all words to be aligned (1-to-many/many-to-1), our method significantly reduces the phrase table size with no difference in translation quality and even outperforms fast_align in some end-to-end translation experiments.
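The sketch below is not the BTG-based parser itself; it only illustrates the full-coverage symmetrization constraint the paper credits for its smaller phrase tables, using a simple heuristic (intersect the two directional alignments, grow with union links, then attach any still-unaligned word to its nearest link). The toy alignments are invented.

def symmetrize_full_coverage(f2e, e2f, src_len, tgt_len):
    # f2e, e2f: sets of (src, tgt) index pairs from the two aligner directions.
    links = set(f2e) & set(e2f)            # high-precision intersection
    union = set(f2e) | set(e2f)

    # Grow: add union links that cover still-unaligned words.
    for s, t in sorted(union):
        covered_s = {i for i, _ in links}
        covered_t = {j for _, j in links}
        if s not in covered_s or t not in covered_t:
            links.add((s, t))

    # Force every remaining word onto the nearest existing link
    # (1-to-many / many-to-1, no unaligned words).
    covered_s = {i for i, _ in links}
    covered_t = {j for _, j in links}
    for s in range(src_len):
        if s not in covered_s:
            _, j = min(links, key=lambda l: abs(l[0] - s))
            links.add((s, j))
    for t in range(tgt_len):
        if t not in covered_t:
            i, _ = min(links, key=lambda l: abs(l[1] - t))
            links.add((i, t))
    return sorted(links)

# Toy directional alignments for a 4-word source and 4-word target sentence.
f2e = {(0, 0), (1, 2), (2, 1)}
e2f = {(0, 0), (1, 2), (3, 3)}
print(symmetrize_full_coverage(f2e, e2f, 4, 4))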