Proceedings of the Annual Conference of the Institute of Systems, Control and Information Engineers
The 47th Annual Conference of the Institute of Systems, Control and Information Engineers
Predictive Coding for Return Sequence Using Temporal Difference Learning
Kazunori Iwata, Kazushi Ikeda, Hideaki Sakai

Pages 6013

Abstract
We regard the sequence of returns as outputs from a parametric compound source. Because the coding rate of the source measures the amount of information about the return, we describe l-learning algorithms, based on the idea of predictive coding, that estimate the expected information gain concerning future information. Using this information gain, we propose the ratio w of return loss to information gain as a new criterion for probabilistic action selection strategies. In experiments, we found that our w-based strategy performs well compared with the conventional Q-based strategy.
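To make the idea of a w-based probabilistic strategy concrete, the following is a minimal sketch and not the authors' algorithm. The abstract does not define return loss or information gain, so the sketch assumes that the return loss of an action is the gap between the best Q-value and that action's Q-value, that the information gain per action is supplied as a positive number, and that actions are drawn from a Boltzmann distribution that favours small w. The names `w_criterion`, `softmax_probs`, `select_action`, `temperature`, and `eps` are all hypothetical.

```python
import math
import random


def w_criterion(q_values, info_gain, eps=1e-8):
    """Assumed ratio of return loss to information gain, one value per action.

    Return loss is taken here to be (best Q) - (Q of this action); this
    definition is an assumption, not taken from the abstract.
    """
    best = max(q_values)
    return [(best - q) / (g + eps) for q, g in zip(q_values, info_gain)]


def softmax_probs(scores, temperature=1.0):
    """Boltzmann distribution that assigns higher probability to LOWER scores."""
    lowest = min(scores)  # shift for numerical stability
    weights = [math.exp(-(s - lowest) / temperature) for s in scores]
    total = sum(weights)
    return [x / total for x in weights]


def select_action(q_values, info_gain, temperature=1.0, rng=random):
    """Sample an action index with probability decreasing in its w value."""
    probs = softmax_probs(w_criterion(q_values, info_gain), temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Illustrative numbers only: actions 1 and 2 lose the same return,
# but action 2 is assumed to yield more information.
q = [1.0, 0.5, 0.5]
gain = [1.0, 1.0, 5.0]
probs = softmax_probs(w_criterion(q, gain))
action = select_action(q, gain)
```

Under these assumptions, two actions with equal return loss are ranked by their information gain, so the more informative one is explored more often; a conventional Q-based softmax would treat them identically.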
© 2003 The Institute of Systems, Control and Information Engineers