Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
Volume 19, Issue 5
Preface
Paper
  • Seiji Tsuchiya, Motoyuki Suzuki, Fuji Ren, Hirokazu Watabe
    2012 Volume 19 Issue 5 Pages 367-379
    Published: December 14, 2012
    Released on J-STAGE: March 19, 2013
    JOURNAL FREE ACCESS
    Onomatopoeic words are frequently used to convey a vivid sense of presence. Native speakers understand these words easily, so most onomatopoeic words are either absent from national-language dictionaries or listed with only part of their meaning. Non-native speakers, on the other hand, find the meaning of onomatopoeic words hard to grasp: they can neither sense the meaning of such a word nor look it up in a dictionary. This paper proposes a method for estimating the feeling conveyed by an onomatopoeic word. The feeling is inferred from several features, such as the mora sequence pattern of the word and the feeling associated with each mora. In experiments, the proposed method achieved an F-value of 0.345, approximately 80% of the performance of human estimation (F-value 0.427). This suggests that the proposed method is useful for supporting learners of onomatopoeic words.
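The "approximately 80%" figure in the abstract above can be checked with a short calculation. This is a minimal sketch, assuming the reported F-value is the usual balanced F-measure (harmonic mean of precision and recall), which the abstract does not state explicitly. The function name `f_value` is illustrative, not taken from the paper.

```python
def f_value(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall (assumed definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# F-values reported in the abstract.
PROPOSED_F = 0.345
HUMAN_F = 0.427

# Ratio of the proposed method's F-value to the human F-value.
ratio = PROPOSED_F / HUMAN_F
print(f"{ratio:.3f}")  # prints 0.808
```

The ratio of about 0.808 is consistent with the abstract's statement that the method reaches roughly 80% of human performance.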
  • Kenji Imamura, Kuniko Saito, Kugatsu Sadamitsu, Hitoshi Nishikawa
    2012 Volume 19 Issue 5 Pages 381-400
    Published: December 14, 2012
    Released on J-STAGE: March 19, 2013
    JOURNAL FREE ACCESS
    This paper presents a method for correcting grammatical errors in Japanese particles written by non-native learners of Japanese. Our method is based on discriminative sequence conversion, which corrects particle errors by substitution, insertion, or deletion. For this kind of error correction task, it is difficult to collect large learner corpora. We address this problem within a discriminative learning framework that uses the following two methods. First, language model probabilities obtained from large Japanese corpora are combined with n-gram binary features obtained from the learner corpora in order to measure the correctness of Japanese sentences. Second, automatically generated pseudo-error sentences are added to the learner corpora to enrich them directly. Furthermore, we apply domain adaptation, in which the pseudo-error sentences (the source domain) are adapted to the real-error sentences (the target domain). Experimental results show that recall improves when both the language model probabilities and the n-gram binary features are used, and that using pseudo-error sentences with domain adaptation yields stable improvement.
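The idea of automatically generating pseudo-error sentences, mentioned in the abstract above, can be illustrated as follows. This is only a sketch under stated assumptions: the abstract does not describe the generation procedure, so the particle inventory `PARTICLES`, the function `pseudo_errors`, and the rule of enumerating every single-particle edit are all hypothetical. Each generated error is the inverse of one correction operation: a dropped particle needs an insertion, a wrong particle needs a substitution, and a spurious particle needs a deletion.

```python
from typing import List

# Hypothetical, very small particle inventory used only for illustration.
PARTICLES = ["が", "を", "に", "は"]


def pseudo_errors(tokens: List[str]) -> List[List[str]]:
    """Return every sentence that differs from `tokens` by one particle error."""
    variants = []
    for i, tok in enumerate(tokens):
        if tok in PARTICLES:
            # Dropped particle (would be corrected by insertion).
            variants.append(tokens[:i] + tokens[i + 1:])
            # Wrong particle (would be corrected by substitution).
            for p in PARTICLES:
                if p != tok:
                    variants.append(tokens[:i] + [p] + tokens[i + 1:])
        elif i + 1 == len(tokens) or tokens[i + 1] not in PARTICLES:
            # Spurious particle after a token that has none (corrected by deletion).
            for p in PARTICLES:
                variants.append(tokens[:i + 1] + [p] + tokens[i + 1:])
    return variants


# A pre-tokenized correct sentence ("I read a book").
correct = ["私", "は", "本", "を", "読む"]
for variant in pseudo_errors(correct):
    print("".join(variant))
```

A real system would presumably sample errors according to how often learners make them rather than enumerate all of them; the exhaustive enumeration here just keeps the example deterministic.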
  • Yuka Emura, Yohei Seki
    2012 Volume 19 Issue 5 Pages 401-418
    Published: December 14, 2012
    Released on J-STAGE: March 19, 2013
    JOURNAL FREE ACCESS
    Many users use facemarks every day in computer-mediated communication environments such as e-mail, chat, and microblogs. Although facemarks are useful for expressing emotions or communicative intentions beyond what natural language conveys, many users find it difficult to choose the right one for the situation from a large number of candidates. We propose a method that recommends facemarks based on estimating the emotion, communication, or motion types in texts written by users. The emotion, communication, and motion types are defined using a Twitter corpus, and the estimation system is implemented with k-NN. Five assessors evaluated the relevance of the recommended facemarks to their intentions and found that 66.6% of the facemarks for 91 tweets were recommended properly, a significant improvement over recommendation based on emotion categories alone.
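The abstract above states that type estimation is implemented with k-NN but gives no further detail. The following is a minimal sketch of k-NN text classification, assuming bag-of-words features and cosine similarity; the features, the toy English training data, the labels, and the names `cosine` and `knn_label` are all assumptions made for illustration, not the authors' setup.

```python
from collections import Counter
from math import sqrt
from typing import List, Tuple


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def knn_label(text: str, train: List[Tuple[str, str]], k: int = 3) -> str:
    """Label `text` by majority vote among its k most similar training examples."""
    query = Counter(text.split())
    ranked = sorted(
        train,
        key=lambda ex: cosine(query, Counter(ex[0].split())),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Made-up training examples: (text, type label).
TRAIN = [
    ("so happy today", "joy"),
    ("happy birthday to you", "joy"),
    ("i am so sad", "sadness"),
    ("sad and tired today", "sadness"),
    ("thank you so much", "thanks"),
]

print(knn_label("feeling happy today", TRAIN))  # prints joy
```

In a recommender of the kind the abstract describes, the predicted type would then be used to look up candidate facemarks; that mapping is not shown here because the abstract does not specify it.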
  • Eiji Aramaki, Sachiko Masukawa, Mizuki Morita
    2012 Volume 19 Issue 5 Pages 419-435
    Published: December 14, 2012
    Released on J-STAGE: March 19, 2013
    JOURNAL FREE ACCESS
    With the recent rise in the popularity and size of social media, there is a growing need for systems that can extract useful information from this large amount of data. We address the problem of detecting influenza epidemics. Previous methods rely mainly on the frequencies of influenza-related words and therefore suffer from noisy tweets that do not express influenza symptoms. To deal with this problem, this study proposes two methods. First, a sentence classifier judges whether a person has actually caught influenza. Second, an infection model closes the time gap between people's web activity and the illness period. In experiments, the combination of the two techniques achieved high performance (a correlation coefficient of 0.910 with the number of influenza patients). This result suggests that not only natural language processing but also the study of disease contributes to social-media-based surveillance.
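The evaluation measure in the abstract above is a correlation coefficient between the system's estimate and the number of influenza patients. The sketch below shows how such a coefficient can be computed, assuming it is the Pearson correlation (the abstract does not name the variant). The weekly counts are invented purely to make the example runnable and are not data from the paper.

```python
from math import sqrt
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Invented weekly counts: tweets classified as positive vs. reported patients.
positive_tweets = [12, 30, 55, 80, 60]
reported_patients = [100, 260, 500, 700, 580]

print(round(pearson(positive_tweets, reported_patients), 3))
```

A value near 1 means the tweet-based estimate rises and falls together with the patient counts, which is the sense in which the abstract reports 0.910 as high performance.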