1996, Vol. 11, No. 3, pp. 411-419
The capability of learning is one of the salient features of realtime search algorithms such as LRTA*. These algorithms repeatedly perform problem solving trials so that the heuristic values eventually converge to exact values along every optimal path to the goal. The major impediment, however, is the instability of the solution quality (the length of the solution path) during convergence. This instability is due to two properties of the search algorithms: (1) they try to find all optimal solutions even after obtaining fairly good solutions, and (2) they tend to move towards unexplored areas, thus failing to balance exploration and exploitation. In this paper, we propose and analyze two new realtime search algorithms that stabilize the convergence process.

・ε-search (weighted realtime search) relaxes the condition of searching for optimal solutions, allowing suboptimal solutions with ε error. As a result, ε-search significantly reduces the total amount of learning performed.

・δ-search (realtime search with upper bounds) utilizes the upper bounds of estimated costs, which become available after the problem is solved once. Guided by the upper bounds, δ-search can better control the tradeoff between exploration and exploitation.

The ε- and δ-search algorithms can be combined easily. The effectiveness of these algorithms is demonstrated by solving randomly generated mazes.
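The convergence process described above can be illustrated with a small sketch. The code below runs repeated LRTA*-style trials on a tiny grid maze and stops once a trial's cost is within a factor (1 + ε) of the admissible lower bound h(start), mimicking the idea of accepting suboptimal solutions with ε error. This is an illustrative assumption, not the paper's exact formulation: the function names, the grid encoding, and the (1 + ε) stopping rule are all my own choices, and the δ-search upper-bound mechanism is not shown.

```python
def manhattan(a, b):
    # Admissible heuristic for a 4-connected grid with unit move costs
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def neighbors(cell, free):
    x, y = cell
    for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
        if nxt in free:
            yield nxt

def lrta_trial(free, start, goal, h):
    """One problem-solving trial: move greedily on f = cost + h,
    raising h(s) by one-step lookahead (the learning step).
    Mutates h in place and returns the trial's path cost."""
    s, cost = start, 0
    while s != goal:
        best_next = min(neighbors(s, free), key=lambda n: 1 + h[n])
        h[s] = max(h[s], 1 + h[best_next])  # learning update
        s, cost = best_next, cost + 1
    return cost

def epsilon_search(free, start, goal, epsilon=0.0, max_trials=1000):
    """Repeat trials until the solution cost is within (1 + epsilon)
    of the learned lower bound h(start). epsilon = 0 demands a
    provably optimal path; epsilon > 0 tolerates bounded suboptimality
    and typically stops (and hence learns) earlier."""
    h = {c: manhattan(c, goal) for c in free}
    cost = None
    for trial in range(1, max_trials + 1):
        cost = lrta_trial(free, start, goal, h)
        if cost <= (1 + epsilon) * h[start]:
            return cost, trial
    return cost, max_trials

# A 4x4 maze with a wall blocking column x=1 except at y=0,
# so the initial Manhattan estimate is badly misleading.
free = {(x, y) for x in range(4) for y in range(4)} - {(1, 1), (1, 2), (1, 3)}
optimal_cost, trials_opt = epsilon_search(free, (0, 3), (3, 3), epsilon=0.0)
relaxed_cost, trials_rel = epsilon_search(free, (0, 3), (3, 3), epsilon=0.5)
```

Because the stopping test with ε > 0 is strictly looser while the trial dynamics are identical, the relaxed run never needs more trials than the optimal one; this is the sense in which tolerating ε error stabilizes and shortens the convergence process.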