Residual Reinforcement Learning (RRL) is a technique that combines a machine-learned policy with a human-designed policy. Because RRL generally assumes a continuous action space, it cannot be applied directly to discrete action spaces. In this study, we propose an RRL method for discrete action spaces based on the Deep Q-Network (DQN), and present the results of applying it to server load balancing.
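As a minimal sketch of the idea (not the authors' exact method), one way to combine a hand-designed policy with value-based learning in a discrete action space is to bias action selection by a prior score for each action while learning Q-values with standard off-policy Q-learning. The chain environment, the prior, and all hyperparameters below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 5, 2  # toy chain MDP; action 0 = left, 1 = right
q = np.zeros((n_states, n_actions))

# Hypothetical hand-designed prior: a mild preference for moving right.
prior = np.tile([0.0, 1.0], (n_states, 1))
beta, alpha, gamma, eps = 0.5, 0.1, 0.95, 0.1  # prior weight, lr, discount, exploration

def step(s, a):
    """Deterministic chain dynamics; reward 1 on reaching the right end."""
    s2 = max(0, s - 1) if a == 0 else min(n_states - 1, s + 1)
    r = 1.0 if s2 == n_states - 1 else 0.0
    return s2, r, s2 == n_states - 1

for _ in range(500):
    s, done = 0, False
    while not done:
        if rng.random() < eps:
            a = int(rng.integers(n_actions))
        else:
            # Discrete "residual" combination: learned Q plus weighted prior score.
            a = int(np.argmax(q[s] + beta * prior[s]))
        s2, r, done = step(s, a)
        # Standard Q-learning target; the prior only biases behavior, not values.
        target = r + (0.0 if done else gamma * np.max(q[s2]))
        q[s, a] += alpha * (target - q[s, a])
        s = s2

greedy = np.argmax(q + beta * prior, axis=1)
print(greedy)
```

Keeping the prior out of the bootstrap target is a deliberate choice here: mixing it into the TD target would let its bonus compound through the recursion and distort the learned values.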
In this study, we investigate whether analysts' sentiment toward individual stocks is useful for stock investment strategies. To this end, we use natural language processing to construct a polarity index from the textual information in analyst reports. We forecast the polarity index as a time series using deep learning, and cluster the forecasted values by volatility using a regime-switching model. We then construct a portfolio from stock data and rebalance it at each regime change point. The proposed investment strategy outperforms the benchmark portfolio in terms of return, suggesting that the polarity index is useful for constructing stock investment strategies.
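The rebalancing step can be sketched as follows. This toy version substitutes a rolling-volatility threshold for the full regime-switching model, and the synthetic series stands in for the deep-learning forecasts; both are illustrative assumptions, not the study's data or model:

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic forecasted polarity index: a low-volatility stretch followed by
# a high-volatility stretch (stand-in for the deep-learning forecasts).
polarity = np.concatenate([rng.normal(0, 0.2, 100), rng.normal(0, 1.0, 100)])

# Toy regime labeling: rolling volatility thresholded into low (0) / high (1)
# regimes -- a crude stand-in for a regime-switching model.
window = 20
vol = np.array([polarity[max(0, t - window):t + 1].std()
                for t in range(len(polarity))])
regime = (vol > np.median(vol)).astype(int)

# The portfolio is rebalanced at each index where the regime changes.
change_points = np.flatnonzero(np.diff(regime)) + 1
print(change_points)
```

In the study's setting, each element of `change_points` would trigger a portfolio rebalance; a real regime-switching model would replace the threshold rule with inferred regime probabilities.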
When performing action recognition in the real world, there are not only situations in which a single person performs an action, but also cases in which multiple people perform different actions, such as reading or operating a PC. An action recognition method for a single person is therefore insufficient, and a method for multiple persons is necessary. Moreover, to recognize an interaction between a person and an object, such as "reading a book," it is necessary to consider not only the person but also the situation around the person. In this study, we attempt to extend an action recognition method that takes the surrounding situation into account to multiple persons.
To continuously support improving the value of events, we conducted a demonstration experiment using "Web Tour," a system that provides the user experience of real events even at online events. Focusing on the interaction between visitors and exhibitors, we collected the items important to both parties through the Web and constructed a probabilistic model using PLSA and a Bayesian network. Based on this model, we recommended content to visitors and showed exhibitors the characteristics of the visitors. We then constructed a probabilistic model in which the degree to which both parties achieved their objectives serves as an evaluation index of the provided value, used it as the objective variable, and examined how to utilize the model.
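The PLSA component of such a model can be illustrated with a minimal EM loop over a toy visitor-item count matrix. The matrix, the number of latent classes, and the iteration count below are all illustrative assumptions; this is a generic PLSA sketch, not the study's actual model:

```python
import numpy as np

rng = np.random.default_rng(2)
# Toy count matrix N[d, w]: how often visitor d viewed exhibit item w.
# Two clear visitor groups are built in so the example is easy to check.
N = np.array([[5, 4, 0, 0],
              [4, 5, 1, 0],
              [0, 1, 5, 4],
              [0, 0, 4, 5]], dtype=float)
n_d, n_w, K = N.shape[0], N.shape[1], 2  # visitors, items, latent classes

# Random initialization of P(z), P(d|z), P(w|z).
p_z = np.full(K, 1.0 / K)
p_dz = rng.random((K, n_d)); p_dz /= p_dz.sum(axis=1, keepdims=True)
p_wz = rng.random((K, n_w)); p_wz /= p_wz.sum(axis=1, keepdims=True)

for _ in range(100):
    # E-step: posterior P(z|d,w) proportional to P(z) P(d|z) P(w|z).
    post = p_z[:, None, None] * p_dz[:, :, None] * p_wz[:, None, :]
    post /= post.sum(axis=0, keepdims=True) + 1e-12
    # M-step: re-estimate the parameters from expected counts.
    nz = N[None] * post                      # shape (K, n_d, n_w)
    p_dz = nz.sum(axis=2); p_dz /= p_dz.sum(axis=1, keepdims=True)
    p_wz = nz.sum(axis=1); p_wz /= p_wz.sum(axis=1, keepdims=True)
    p_z = nz.sum(axis=(1, 2)); p_z /= p_z.sum()

# Assign each visitor to its dominant latent class.
topic = np.argmax(p_dz, axis=0)
print(topic)
```

In a recommendation setting, `p_wz` for a visitor's dominant class would rank candidate content, and the latent classes could feed a downstream Bayesian network as in the study.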