Consideration on the Learning Behaviours of Stochastic Automata

Norio BABA; Yoshikazu SAWARAGI

doi:10.9746/sicetr1965.10.78

Abstract

This paper discusses the learning behaviours of variable-structure stochastic automata in stationary random environments.
A new reinforcement scheme (TNP) of the reward-penalty type, which can ensure ε-optimality under all stationary random environments, is proposed. Using a semi-martingale inequality and complex manipulations, it is proved that the TNP scheme ensures ε-optimality. Moreover, two previously proposed reinforcement schemes (L_R-I and T₁) are discussed from the same point of view.
The TNP scheme is superior to the L_R-I and T₁ schemes in the following two respects:
(1) The L_R-I scheme is of the reward-inaction type, whereas the TNP scheme is of the reward-penalty type.
(2) The T₁ scheme can ensure ε-optimality only under certain conditions, whereas the TNP scheme ensures it under all stationary random environments.
Computer simulation results also indicate that, of the three schemes, the TNP scheme achieves the most effective learning behaviour.
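
To make the setting concrete, the following is a minimal sketch of a variable-structure stochastic automaton learning in a stationary random environment. It uses the standard textbook form of the linear reward-inaction update, not the TNP scheme, whose update rule is not given in this abstract. The function name `simulate_lri`, the step size `step`, and the reward probabilities in the usage note are illustrative assumptions, not values from the paper.

```python
import random


def simulate_lri(reward_probs, step=0.05, n_steps=5000, seed=0):
    """Simulate a linear reward-inaction automaton (textbook form; assumed here).

    reward_probs[i] is the fixed (stationary) probability that the
    environment rewards action i. Returns the final action probabilities.
    """
    rng = random.Random(seed)
    r = len(reward_probs)
    p = [1.0 / r] * r  # start from the uniform action distribution

    for _ in range(n_steps):
        # The automaton picks an action according to its current probabilities.
        i = rng.choices(range(r), weights=p)[0]
        # The stationary random environment responds with reward or penalty.
        rewarded = rng.random() < reward_probs[i]
        if rewarded:
            # Reward: shift probability mass toward the chosen action,
            # p_i <- p_i + step * (1 - p_i), p_j <- (1 - step) * p_j for j != i.
            p = [(1.0 - step) * pj for pj in p]
            p[i] += step
        # Penalty: inaction, the probabilities are left unchanged.
    return p
```

Calling `simulate_lri([0.8, 0.3])` returns the action probabilities after 5000 steps for a hypothetical two-action environment. A scheme is ε-optimal if, by taking the step size small enough, the expected penalty can be brought arbitrarily close to that of the best action; with a reward-inaction update the probabilities change only on rewarded steps, which is the property item (1) above contrasts with the reward-penalty TNP scheme.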
