We analyzed the correlation between topics in newspaper articles and stock indices to see whether signs of economic change can be captured from the text of articles reporting on various social events. First, using Latent Dirichlet Allocation (LDA), we classified the words of 77,814 articles from a single Japanese newspaper over the 10-year period from 2003 to 2012 into 30 topics and calculated the distribution of each topic on each day. Next, we calculated the rank correlation between the topics and the profitability and volatility of the 34 Tokyo Stock Exchange indices. As a result, we found that although there was no correlation between the topics and the profitability of the stock indices, there was a significant time-dependent correlation between some topics and the volatility of the stock indices.
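The rank-correlation step in the abstract above can be illustrated with a minimal sketch. It assumes Spearman's coefficient (the abstract says only "rank correlation") and reads "time-dependent" as a correlation recomputed over a sliding window; both are assumptions, and the names `spearman` and `rolling_spearman` are hypothetical.

```python
def ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; NaN if either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def rolling_spearman(topic_weight, volatility, window):
    """Spearman correlation in each consecutive window of `window` days."""
    return [
        spearman(topic_weight[s:s + window], volatility[s:s + window])
        for s in range(len(topic_weight) - window + 1)
    ]
```

Here `topic_weight` would be the daily share of one LDA topic and `volatility` the daily volatility of one index. A window-by-window series of coefficients is one simple way to see whether the relationship changes over time.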
In this paper, we propose an LSTM model for predicting the Bitcoin price using Google Trends data and sentiment scores of news and tweets. As inputs, we used news sentiment scores weighted by the number of news articles. We also used tweet sentiment scores weighted by tweet information, such as the number of likes, the number of retweets, and the number of followers. The results show that the proposed method performs better than a model that uses non-weighted sentiment scores of news and tweets as inputs.
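The weighting idea in the abstract above can be sketched as follows. The abstract does not give the weighting formulas, so the weight `1 + likes + retweets + followers` and the "mean times article count" rule for news are purely illustrative assumptions, as are all names in the code.

```python
from dataclasses import dataclass


@dataclass
class Tweet:
    sentiment: float  # assumed to lie in [-1, 1]
    likes: int
    retweets: int
    followers: int


def tweet_weight(tweet):
    """Hypothetical engagement weight; the +1 keeps every tweet countable."""
    return 1 + tweet.likes + tweet.retweets + tweet.followers


def weighted_tweet_sentiment(tweets):
    """Engagement-weighted mean sentiment of one period's tweets."""
    total = sum(tweet_weight(t) for t in tweets)
    if total == 0:  # only possible for an empty list
        return 0.0
    return sum(t.sentiment * tweet_weight(t) for t in tweets) / total


def weighted_news_sentiment(scores):
    """Mean news sentiment scaled by the number of articles in the period."""
    if not scores:
        return 0.0
    mean = sum(scores) / len(scores)
    return mean * len(scores)
```

Under these assumptions, a period with many articles or highly engaged tweets contributes a stronger signal to the model inputs than a quiet period with the same average sentiment.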
General-domain pretrained large-scale language models, such as BERT and GPT3, have achieved state-of-the-art results in numerous NLP classification and generation applications. This pretraining technology is also expected to be useful in vertical domains, such as finance. The downstream applications include financial event extraction from news, summarization, and causal inference. In this paper, we propose large-scale pretrained BERT models for the financial domain in English and Japanese. The original datasets come from professional financial news. We empirically study the sub-word vocabulary set and the model size, and their impact on downstream financial NLP applications. The code and pretrained models are released at https://github.com/NVIDIA/Megatron-LM.
Recent developments in deep learning techniques have motivated intensive research in machine learning-aided stock trading strategies. However, since the financial market has a highly non-stationary nature hindering the application of typical data-hungry machine learning methods, leveraging financial inductive biases is important to ensure better sample efficiency and robustness. In this study, we propose a novel method of constructing a portfolio based on predicting the distribution of a financial quantity called residual factors, which is known to be generally useful for hedging the risk exposure to common market factors. The key technical ingredients are twofold. First, we introduce a computationally efficient extraction method for the residual information, which can be easily combined with various prediction algorithms. Second, we propose a novel neural network architecture that allows us to incorporate widely acknowledged financial inductive biases such as amplitude invariance and time-scale invariance. We demonstrate the efficacy of our method on U.S. and Japanese stock market data. Through ablation experiments, we also verify that each individual technique contributes to improving the performance of trading strategies. We anticipate our techniques may have wide applications in various financial problems.
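To make the notion of residual factors in the abstract above concrete, here is a minimal sketch. The abstract does not describe the extraction method, so this is not the authors' algorithm: it substitutes a simple single-factor OLS regression on an equal-weighted market return (the intercept is removed as well), and the name `market_residuals` is hypothetical.

```python
def market_residuals(returns):
    """Remove each asset's exposure to an equal-weighted market return.

    returns[t][i] is the return of asset i on day t. The result has the same
    shape and holds the OLS residuals of each asset regressed on the market.
    """
    n_days = len(returns)
    n_assets = len(returns[0])
    # Stand-in common factor: the cross-sectional mean return of each day.
    market = [sum(day) / n_assets for day in returns]
    m_mean = sum(market) / n_days
    m_var = sum((m - m_mean) ** 2 for m in market)

    residuals = [[0.0] * n_assets for _ in range(n_days)]
    for i in range(n_assets):
        series = [returns[t][i] for t in range(n_days)]
        s_mean = sum(series) / n_days
        cov = sum((m - m_mean) * (s - s_mean) for m, s in zip(market, series))
        beta = cov / m_var if m_var > 0 else 0.0
        alpha = s_mean - beta * m_mean
        for t in range(n_days):
            residuals[t][i] = series[t] - alpha - beta * market[t]
    return residuals
```

By construction, each residual series is uncorrelated in sample with the market series, which is the sense in which a portfolio built on residuals is hedged against the common factor in this simplified setting.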
The lifetime of a limit order is defined as the time elapsed from its appearance (submission) to its disappearance (cancellation or execution). Lifetime is key to traders' decision-making because traders submit orders based on the trade-off between execution cost (the price at which the order will be executed) and delay risk (how long it takes to be executed): the execution lifetime is the waiting time until execution, and the cancellation lifetime is the limit of the trader's patience. Therefore, reproducing the power-law distribution of order lifetimes with an agent-based model (ABM) serves as a benchmark for the time-related decision-making of agents and contributes to constructing more advanced models. In this study, we created an ABM that reproduces both the cancellation and execution lifetime distributions by extending our previous ABM, which reproduced only the lifetime distribution of cancelled orders.
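The lifetime definition in the abstract above can be sketched in a few lines. The record layout `(submit_time, end_time, outcome)` and the function names are assumptions made for illustration. The exponent estimate uses the standard continuous maximum-likelihood formula for a power-law tail, which the abstract itself does not mention.

```python
import math
from collections import defaultdict


def lifetimes_by_outcome(orders):
    """Group order lifetimes (end_time - submit_time) by how the order ended.

    orders: iterable of (submit_time, end_time, outcome), where outcome is a
    label such as "executed" or "cancelled".
    """
    grouped = defaultdict(list)
    for submit_time, end_time, outcome in orders:
        grouped[outcome].append(end_time - submit_time)
    return dict(grouped)


def power_law_exponent(samples, x_min):
    """Continuous MLE of the exponent: 1 + n / sum(ln(x / x_min)), x >= x_min."""
    tail = [x for x in samples if x >= x_min]
    log_sum = sum(math.log(x / x_min) for x in tail)
    if log_sum == 0:
        raise ValueError("need at least one sample strictly above x_min")
    return 1 + len(tail) / log_sum
```

Applying `power_law_exponent` separately to the executed and cancelled lifetimes, from both market data and a simulated order book, gives one simple way to compare the two tails.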
Predicting the movement of stock prices is an important issue for market participants. Recently, there have been many attempts to apply machine learning techniques to financial time series prediction. However, overfitting presents a major challenge when machine learning approaches are used for this task. In this paper, we propose a stock price prediction method that utilizes limit order book data from stocks other than the target stocks by stratifying the data and performing multi-phase pre-training that takes market liquidity into account. Experimental results show that the proposed approach enhances prediction performance.
Recently, many researchers have studied foreign exchange trading using technical analysis. However, it is difficult to achieve profitability using this technique. Therefore, using Genetic Network Programming, we construct a model that considers the signal strength of technical indices in order to devise a profitable trading strategy. Finally, we confirm the effectiveness of our model using historical data from the foreign exchange market.