Bulletin of the Computational Statistics of Japan
Online ISSN : 2189-9789
Print ISSN : 0914-8930
ISSN-L : 0914-8930
Reviews
STATISTICAL CLASSIFICATION AND VISUALIZATION BASED ON VARYING COEFFICIENTS MODEL FOR LONGITUDINAL TEXT DATA
Shizue Izumi, Kenichi Satoh, Noriyuki Kawano
JOURNAL FREE ACCESS

2015 Volume 28 Issue 1 Pages 81-92

Abstract
Texts posted to social networking services such as Twitter and Facebook have recently attracted attention as big data, and such texts can be treated as longitudinally observed text data. Extracting the longitudinal trends of keyword appearance and classifying them can summarize how the characteristics of longitudinal text data change over time. We propose an analytical method for longitudinally observed text data that applies the method of Satoh and Tonda (2013) for estimating semiparametric varying coefficients using a mixed effects model. Our method consists of a series of analytical steps: estimating the probability of keyword appearance in the longitudinally observed text data using logistic regression, and classifying and visualizing the longitudinal trends of keyword appearance using a summary of the predictors. Applying the method to the Hiroshima Peace Declaration enabled us to describe the longitudinal trends of keyword appearance in the text data. In addition, the time-dependent classification results and the keyword locations are visualized in a two-dimensional scatter plot, which provides additional information on the similarity between the two classifications and on their degree of closeness to the keywords. Furthermore, practical interpretations of the classification results that take the social background into account suggest that our proposal is appropriate.
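As a rough illustration of the steps named in the abstract, the following Python sketch fits a logistic regression of keyword appearance on time and then summarizes each keyword by two numbers that could serve as coordinates in a two-dimensional scatter plot. It is a deliberately simplified stand-in, not the authors' procedure: it uses a plain linear-in-time logit fitted by Newton's method instead of the semiparametric varying coefficients and mixed effects model of Satoh and Tonda (2013), and it classifies trends only by the sign of the fitted slope. The function names, the small ridge term, and the toy 0/1 indicators are all assumptions introduced for this example.

```python
import math


def fit_logistic_trend(times, present, n_iter=25, ridge=1e-3):
    """Fit logit P(keyword appears) = a + b * t by Newton's method.

    times   : centred time value of each document
    present : 0/1 indicator of keyword appearance in each document
    ridge   : small penalty (an assumption of this sketch) that keeps
              the 2x2 Newton system well conditioned on tiny samples
    """
    a, b = 0.0, 0.0
    for _ in range(n_iter):
        # Gradient of the penalized log-likelihood and its negative Hessian.
        g0, g1 = -ridge * a, -ridge * b
        h00, h01, h11 = ridge, 0.0, ridge
        for t, y in zip(times, present):
            p = 1.0 / (1.0 + math.exp(-(a + b * t)))
            w = p * (1.0 - p)
            g0 += y - p
            g1 += (y - p) * t
            h00 += w
            h01 += w * t
            h11 += w * t * t
        # Solve the 2x2 system H * step = g in closed form.
        det = h00 * h11 - h01 * h01
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a, b = a + da, b + db
        if max(abs(da), abs(db)) < 1e-10:
            break
    return a, b


def appearance_probability(a, b, t):
    """Estimated probability that the keyword appears at centred time t."""
    return 1.0 / (1.0 + math.exp(-(a + b * t)))


def summarize_keywords(times, appearance):
    """Summarize each keyword's fitted predictor by (level, slope).

    appearance maps keyword -> list of 0/1 indicators, one per document.
    The pair (level, slope) can be used as scatter-plot coordinates, and
    the sign of the slope gives a crude two-class trend classification.
    """
    summary = {}
    for keyword, present in appearance.items():
        a, b = fit_logistic_trend(times, present)
        summary[keyword] = {
            "level": a,  # logit at the middle of the observation period
            "slope": b,  # change in logit per unit time
            "label": "increasing" if b > 0 else "decreasing",
        }
    return summary


# Toy indicators invented for illustration; they are not taken from the
# Hiroshima Peace Declaration.
years = list(range(8))
mid = sum(years) / len(years)
times = [year - mid for year in years]
toy_appearance = {
    "keyword_A": [0, 0, 1, 0, 1, 1, 1, 1],
    "keyword_B": [1, 1, 1, 0, 1, 0, 0, 0],
}

for keyword, s in summarize_keywords(times, toy_appearance).items():
    print(keyword, round(s["level"], 2), round(s["slope"], 2), s["label"])
```

Centring the time variable makes the intercept interpretable as the logit at mid-period, so the two summary values describe the overall level and the direction of the trend separately. A richer time basis or a clustering of the full predictor curves would be closer in spirit to the approach the abstract describes.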
© 2015 Japanese Society of Computational Statistics