2013 Volume 2013 Issue FIN-010 Pages 03-
Two kinds of information are available for predicting stock prices. One is numerical information, including past stock prices, currency exchange rates, and interest rates. The other is textual information, mainly news stories covering statements of government dignitaries, consumer trends, and miscellaneous events. Although numerical information has proven useful for predicting stock prices, its predictive power is limited in the sense that much information, such as the statements of government dignitaries mentioned above, appears only in text. Given a stock whose future price is to be predicted, textual data offers different, and much wider, coverage than numerical information, which may benefit prediction. This study exploits public Web news articles and attempts to estimate the residual that numerical information alone cannot explain, using a simple additive regression model. In addition, to distinguish between different types of news articles, such as those specific to particular companies or industries and more general top news, the framework of multiple kernel learning is adopted. The validity and effectiveness of the proposed approach are evaluated on real-world data consisting of share prices of Nikkei 220 companies and 47 thousand Web news articles.
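The two ideas described above, an additive model whose text term fits the residual left by the numerical term, and a combination of kernels for different news types, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes ordinary least squares for the numerical part, kernel ridge regression for the text part, linear kernels over hypothetical per-type news feature matrices, and a coarse grid search over one mixing weight as a simple stand-in for a proper multiple kernel learning optimizer. All function and variable names are hypothetical.

```python
import numpy as np


def linear_kernel(A, B):
    """Gram matrix between rows of A and rows of B (assumed text features)."""
    return A @ B.T


def _with_intercept(X):
    # Append a constant column so the linear part has an intercept.
    return np.hstack([X, np.ones((len(X), 1))])


def fit_additive_model(X_num, K_list, weights, y, ridge=1.0):
    """Fit y ~ (numerical part) + (text part).

    Stage 1: least squares on numerical features.
    Stage 2: kernel ridge regression on the stage-1 residual, using a
             weighted sum of one kernel per news type.
    """
    Xb = _with_intercept(X_num)
    beta, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    residual = y - Xb @ beta

    K = sum(w * Km for w, Km in zip(weights, K_list))
    alpha = np.linalg.solve(K + ridge * np.eye(len(y)), residual)
    return beta, alpha


def predict(X_num, K_list, weights, beta, alpha):
    """K_list holds kernels between the query rows and the training rows."""
    K = sum(w * Km for w, Km in zip(weights, K_list))
    return _with_intercept(X_num) @ beta + K @ alpha


def select_kernel_weights(X_tr, K_tr, y_tr, X_va, K_va, y_va,
                          grid=(0.0, 0.25, 0.5, 0.75, 1.0)):
    """Pick the mix (w, 1 - w) of two kernels with the lowest validation MSE.

    Assumption: a grid search replaces a real MKL solver for illustration.
    """
    best = None
    for w in grid:
        weights = (w, 1.0 - w)
        beta, alpha = fit_additive_model(X_tr, K_tr, weights, y_tr)
        pred = predict(X_va, K_va, weights, beta, alpha)
        err = np.mean((pred - y_va) ** 2)
        if best is None or err < best[0]:
            best = (err, weights)
    return best[1]
```

Here `K_tr` would contain one training Gram matrix per news type (for example, company-specific news and general top news), and `K_va` the corresponding kernels between validation and training rows. The selected weights give a rough indication of how much each news type contributes to explaining the residual.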