Abstract
This paper proposes a new Q-learning method that can learn mappings from continuous state spaces to continuous action spaces. The proposed method estimates the expected value of actions in a given state using artificial neural networks, and selects an action according to the distribution of the estimated expected values. In this paper, we investigate the performance of the proposed method through two types of simple experiments.
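As a rough illustration of the idea described above, the sketch below approximates the expected value (Q-value) of a continuous state-action pair with a small neural network and draws an action from a softmax distribution over the estimated values of sampled candidate actions. All names, dimensions, the candidate-sampling step, and the softmax selection rule are assumptions for illustration; the paper's actual network architecture and action-selection distribution are not specified in the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions; the abstract does not specify them.
STATE_DIM, ACTION_DIM, HIDDEN = 3, 1, 32

# Small randomly initialized MLP approximating Q(s, a).
W1 = rng.normal(0.0, 0.1, (STATE_DIM + ACTION_DIM, HIDDEN))
b1 = np.zeros(HIDDEN)
W2 = rng.normal(0.0, 0.1, (HIDDEN, 1))
b2 = np.zeros(1)

def q_value(state, action):
    """Estimated expected value of taking `action` in `state`."""
    x = np.concatenate([state, action])
    h = np.tanh(x @ W1 + b1)
    return (h @ W2 + b2)[0]

def select_action(state, n_candidates=64, temperature=1.0):
    """Sample candidate actions from the continuous action space, then
    draw one according to the softmax distribution over their estimated
    Q-values (one plausible reading of "decides an action according to
    the distribution of the estimated expected value")."""
    candidates = rng.uniform(-1.0, 1.0, (n_candidates, ACTION_DIM))
    q = np.array([q_value(state, a) for a in candidates])
    probs = np.exp((q - q.max()) / temperature)  # stabilized softmax
    probs /= probs.sum()
    idx = rng.choice(n_candidates, p=probs)
    return candidates[idx]

state = rng.uniform(-1.0, 1.0, STATE_DIM)
print("chosen action:", select_action(state))
```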