JOURNAL OF THE JAPAN STATISTICAL SOCIETY
Online ISSN : 1348-6365
Print ISSN : 1882-2754
ISSN-L : 1348-6365
AN EXPONENTIAL TWO-ARMED BANDIT PROBLEM WITH ONE ARM KNOWN UNDER BATCH SAMPLING
Toshio Hamada
Journal, free access

1995 Volume 25 Issue 2 Pages 205-216

Abstract

There are two kinds of experiments, e0 and e1. Performing e0 yields an observation from the exponential distribution with parameter 1, and performing e1 yields an observation from the exponential distribution with parameter u. The true value of u is unknown, and a gamma distribution is taken as its prior distribution. The action ai (i = 0, 1) selects ei and performs it m times simultaneously. An n-stage sequential decision problem is constructed in which a0 or a1 is selected at each stage, using the information obtained up to that stage, so as to maximize the expected sum of the mn observations. The problem is formulated by dynamic programming, and the optimal strategy is obtained.
The results of this paper show how to compute numerically the critical value that plays an important role in the optimal strategy. They also give the optimal strategy for the integer-valued parameter case of the gamma two-armed bandit problem with one arm known.
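
The dynamic programming formulation described above can be illustrated with a small numerical sketch. It is not the paper's method for obtaining the critical value, only a brute-force backward recursion under stated assumptions: "parameter" is read as the rate of the exponential distribution (so e0 has mean 1 and e1 has mean 1/u), the prior on u is Gamma with shape alpha > 1 and rate beta, and the expectation over the batch sum is approximated by a midpoint rule. The names beta_pdf, stage_values and value are hypothetical.

```python
import math

def beta_pdf(t, p, q):
    """Density of the Beta(p, q) distribution at t in (0, 1)."""
    log_norm = math.lgamma(p + q) - math.lgamma(p) - math.lgamma(q)
    return math.exp(log_norm + (p - 1) * math.log(t) + (q - 1) * math.log(1 - t))

def stage_values(n, alpha, beta, m, nodes=60):
    """Expected total of the remaining n*m observations when a0 or a1 is
    chosen now and play is optimal afterwards. Returns (v0, v1).

    Assumption: u ~ Gamma(shape=alpha, rate=beta) with alpha > 1, and each
    e1 observation is exponential with rate u (mean 1/u).
    """
    if n == 0:
        return 0.0, 0.0
    # a0: m observations with known mean 1; the posterior on u is unchanged.
    v0 = m * 1.0 + value(n - 1, alpha, beta, m, nodes)
    # a1: each observation has prior-expected value E[1/u] = beta / (alpha - 1).
    # The batch sum S satisfies S / (beta + S) ~ Beta(m, alpha), and the
    # conjugate posterior after the batch is Gamma(alpha + m, beta + S).
    ts = [(k + 0.5) / nodes for k in range(nodes)]  # midpoint nodes on (0, 1)
    ws = [beta_pdf(t, m, alpha) for t in ts]
    total = sum(ws)                                 # normalise the weights
    cont = sum(
        w / total * value(n - 1, alpha + m, beta + beta * t / (1 - t), m, nodes)
        for t, w in zip(ts, ws)
    )
    v1 = m * beta / (alpha - 1) + cont
    return v0, v1

def value(n, alpha, beta, m, nodes=60):
    """Optimal expected total of the remaining n*m observations."""
    return max(stage_values(n, alpha, beta, m, nodes))
```

For a given state, a1 is preferred in this sketch when the second entry of stage_values(n, alpha, beta, m) exceeds the first. The cost grows as nodes to the power n, so this is only practical for a small number of stages.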

© Japan Statistical Society