Transactions of the Society of Instrument and Control Engineers
Online ISSN : 1883-8189
Print ISSN : 0453-4654
ISSN-L : 0453-4654
Optimal Control of Markov Processes
小野 謙二, 木村 正行

1969, Vol. 5, No. 3, pp. 273-278

Abstract
This paper considers a system that periodically observes a state taking one of R possible values. After each observation, one of A possible actions is chosen.
First, such Markov processes are discussed in Section 3.1, and the work of R. A. Howard is extended. Next, the system relations considered are assumed to depend on a random vector that forms an mth-order homogeneous Markov chain with unknown transition probabilities. Namely, let $\{V_t\}$, $t = 0, 1, 2, \ldots$, be a sequence of events, taken in this paper to be the random vector $V_t = (U_t, Y_t)$, and suppose that $\Pr(Y_{t+1} = y_j \mid V_0, V_1, \ldots, V_t, U_{t+1} = u_k) = \Pr(Y_{t+1} = y_j \mid V_{t-m+1}, \ldots, V_t, U_{t+1} = u_k)$, where $\{U_t\}$, $t = 0, 1, 2, \ldots$, is a sequence of control actions and $\{Y_t\}$, $t = 0, 1, 2, \ldots$, is a sequence of measurements. An algorithm is formulated and used to design a controller without identifying the transition probabilities. Finally, the case in which the order of the Markov chain is unknown is discussed in Section 4.
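For readers unfamiliar with the setting, the following is a minimal sketch of Howard-style policy iteration for a finite Markov decision process with known transition probabilities. It is not the algorithm of this paper, which extends Howard's work and handles unknown transition probabilities. The discount factor `gamma`, the array layout of `P` and `r`, the function name `policy_iteration`, and the two-state example are all assumptions made for illustration.

```python
def policy_iteration(P, r, gamma=0.9, tol=1e-10):
    """Howard-style policy iteration (illustrative sketch, discounted case).

    P[a][i][j]: probability of moving from state i to state j under action a.
    r[a][i]:    expected one-step reward in state i under action a.
    gamma:      discount factor (an assumption; not taken from the paper).
    """
    n_actions, n_states = len(P), len(P[0])
    policy = [0] * n_states      # start with action 0 in every state
    values = [0.0] * n_states

    def q(i, a):
        # One-step lookahead value of taking action a in state i.
        return r[a][i] + gamma * sum(P[a][i][j] * values[j]
                                     for j in range(n_states))

    while True:
        # Policy evaluation: iterate the fixed-policy equation to convergence.
        while True:
            delta = 0.0
            for i in range(n_states):
                new_value = q(i, policy[i])
                delta = max(delta, abs(new_value - values[i]))
                values[i] = new_value
            if delta < tol:
                break
        # Policy improvement: switch only when an action is strictly better.
        stable = True
        for i in range(n_states):
            best = max(range(n_actions), key=lambda a: q(i, a))
            if q(i, best) > q(i, policy[i]) + 1e-9:
                policy[i] = best
                stable = False
        if stable:
            return policy, values


# Hypothetical example with R = 2 states and A = 2 actions.
# Action 0 stays in the current state; action 1 moves to the other state.
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay
    [[0.0, 1.0], [1.0, 0.0]],  # action 1: switch
]
r = [
    [0.0, 1.0],  # staying pays a reward of 1 only in state 1
    [0.0, 0.0],  # switching pays nothing
]
policy, values = policy_iteration(P, r)
print(policy, values)
```

In this toy example the resulting policy moves from state 0 to state 1 and then stays there, since only staying in state 1 earns a reward.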
© The Society of Instrument and Control Engineers