IEEJ Transactions on Electronics, Information and Systems (電気学会論文誌C)
Online ISSN : 1348-8155
Print ISSN : 0385-4221
ISSN-L : 0385-4221
<Intelligence, Robotics>
Estimation of a Reward Function That Maximizes Learning Efficiency in Inverse Reinforcement Learning
北里 勇樹, 荒井 幸代
Keywords: inverse reinforcement learning, learning efficiency
Journal, free access

2018, Vol. 138, No. 6, pp. 720-727

Abstract

Inverse reinforcement learning (IRL) is a promising framework for estimating a reward function from observed expert behavior. However, the IRL problem is ill-posed: many different reward functions can reproduce the same expert behavior. Previous IRL studies have selected among these candidates using only the rate at which the learned policy reproduces the expert's original behavior. This measure alone is not sufficient to narrow down the candidate reward functions. To select the most appropriate candidate, we introduce an additional objective function into the existing IRL algorithms of Ng et al. Specifically, we use learning efficiency, that is, how quickly reinforcement learning (RL) converges under a given reward function, as the additional objective and optimize it with a genetic algorithm. As a result, the proposed IRL algorithm is guaranteed to output a reward function with which an agent acquires an optimal policy efficiently. We show the effectiveness of our approach by comparing the performance of the proposed method with that of previous algorithms.
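
The idea in the abstract can be illustrated with a toy sketch. The abstract does not give the paper's environment, efficiency measure, or genetic operators, so everything below is an assumption made for illustration: a six-state chain world, an expert that always moves right, "learning efficiency" approximated by the number of value-iteration sweeps needed before the greedy policy matches the expert, and a simple elitist genetic algorithm. The names `sweeps_to_expert`, `fitness`, and `evolve` are hypothetical and are not taken from the paper.

```python
import numpy as np

# Toy setting (assumed, not from the paper): chain of 6 states, state 5 is terminal.
N_STATES = 6
GAMMA = 0.9
MAX_SWEEPS = 50
S = np.arange(N_STATES - 1)
# NEXT[s, a]: successor of non-terminal state s under action a (0 = left, 1 = right).
NEXT = np.stack([np.maximum(S - 1, 0), S + 1], axis=1)
EXPERT = np.ones(N_STATES - 1, dtype=int)  # the expert always moves right


def sweeps_to_expert(reward):
    """Proxy for learning efficiency (an assumption of this sketch).

    Runs MAX_SWEEPS sweeps of value iteration with reward[s'] paid on entering s'.
    Returns the number of sweeps after which the greedy policy equals the expert
    policy and stays equal, or None if it still differs at the last sweep.
    """
    v = np.zeros(N_STATES)  # the terminal state keeps value 0
    last_mismatch = -1
    for k in range(MAX_SWEEPS):
        q = reward[NEXT] + GAMMA * v[NEXT]  # Q(s, a) for non-terminal states
        if not np.array_equal(q.argmax(axis=1), EXPERT):
            last_mismatch = k
        v[:-1] = q.max(axis=1)
    if last_mismatch == MAX_SWEEPS - 1:
        return None  # this reward does not reproduce the expert
    return last_mismatch + 1


def fitness(reward):
    """Higher is better: fewer sweeps, with a fixed penalty for inconsistent rewards."""
    sweeps = sweeps_to_expert(reward)
    return -float(MAX_SWEEPS) if sweeps is None else -float(sweeps)


def evolve(pop_size=20, generations=20, seed=0):
    """Elitist GA over reward vectors in [-1, 1]^N_STATES."""
    rng = np.random.default_rng(seed)
    pop = rng.uniform(-1.0, 1.0, size=(pop_size, N_STATES))
    # Seed with a sparse goal reward that is known to reproduce the expert.
    pop[0] = np.array([0.0, 0.0, 0.0, 0.0, 0.0, 1.0])
    half = pop_size // 2
    for _ in range(generations):
        scores = np.array([fitness(r) for r in pop])
        elite = pop[np.argsort(scores)[::-1][:half]]  # keep the better half unchanged
        n_children = pop_size - half
        pa = elite[rng.integers(half, size=n_children)]
        pb = elite[rng.integers(half, size=n_children)]
        mask = rng.random(pa.shape) < 0.5  # uniform crossover
        children = np.where(mask, pa, pb) + rng.normal(0.0, 0.1, pa.shape)
        pop = np.vstack([elite, np.clip(children, -1.0, 1.0)])
    scores = np.array([fitness(r) for r in pop])
    best = pop[scores.argmax()]
    return best, sweeps_to_expert(best)


if __name__ == "__main__":
    best_reward, sweeps = evolve()
    print("best reward vector:", np.round(best_reward, 2))
    print("sweeps until the greedy policy matches the expert:", sweeps)
```

In this sketch the sparse goal reward and a reward with graded step costs both reproduce the expert, but the second lets the greedy policy match the expert in fewer sweeps. That is the kind of difference a second objective can use to break the tie between candidates that the reproduction rate alone cannot separate.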

© 2018 The Institute of Electrical Engineers of Japan (電気学会)