Bulletin of Data Analysis of Japanese Classification Society
Online ISSN : 2434-3382
Print ISSN : 2186-4195
Article
Numerical Method to Extract Differential Features of News Quotations in Social Media
Takaharu Tsuda
Author information
JOURNAL FREE ACCESS FULL-TEXT HTML

2016 Volume 5 Issue 1 Pages 49-66

Details
Abstract

There have been a number of researches to predict future events with information sourceson the internet, and they mostly utilize single or few information sources for prediction. Onereason of the small number of information sources is because it is not necessarily appropriateto assume that plural information sources have identical features in terms of contentsand that prohibits dealing all the information sources equally. In order to utilize variety ofinformation sources on the internet as possible, it would be essential to extract differentialfeatures of information sources, and evaluate how they relate one another. This paper assumesthat news from major Japanese newspaper publishers represent business facts, andblog and message board (hereafter ’social media’) contain evaluation of business facts. Itthen introduces analytical framework to extract differential features of each of the informationsources. Specifically, this paper hypothesizes that differential features are added in thesocial media’s news quotation onto original news, and discusses how to extract the featuresin a numerical manner. It also discusses features can be at the level of information sourcesper se as well as at more granular taxonomy level. Entropy term-weighting and Naive-Bayesclassifier are adopted to categorize contents and singular vector decomposition is adoptedto extract the differential features. These procedures are applied to the data relating toToyota motors, and result of the analysis not only supports the hypothesis but also exemplifieshow differential features at granular taxonomy level represent the company’s currentcharacteristics.

Content from these authors
© 2016 Japanese Classification Society
Previous article
feedback
Top