2018 Volume 2018 Issue FIN-020 Pages 82-
This paper aims to predict a company's financial index by analyzing articles about the company. We propose MultiMedLDA, a supervised topic model. MultiMedLDA assumes that each document has two types of labels: a discrete-valued label and a continuous-valued label. It models the relation between each document and these labels, and predicts an unknown label from the known labels and the document. By using the known labels in addition to the documents, it improves prediction accuracy. We evaluated our model on data from the "Japan Company Handbook". In the evaluation, we treated the comments on each company as a document, the company's industry type as the discrete-valued label, and its ROE (return on equity) as the continuous-valued label, and predicted the ROE.