Journal of the Japan Statistical Society, Japanese Issue
Online ISSN : 2189-1478
Print ISSN : 0389-5602
ISSN-L : 0389-5602
A UNIFORM TWO-ARMED BANDIT PROBLEM: THE PARAMETER OF ONE DISTRIBUTION IS KNOWN
Toshio Hamada

1978 Volume 8 Issue 1 Pages 29-35

Abstract

This paper considers a sequential design procedure that selects one of two experiments, e0 and e1, at each stage so as to maximize the expected sum of n observations, where the observations from e0 and e1 are uniformly distributed on the intervals (0, u) and (0, 1), respectively. The value of u is unknown and is assumed to have a Pareto prior distribution with parameters w and α. The problem is formulated as a dynamic program and solved recursively. The expected sum of n observations is given as a function of w, α and n, and the optimal strategy is to perform e0 if and only if w ≥ w_n(α). The threshold w_n(α) is strictly increasing in α and non-decreasing in n, and its value is calculated for several values of w and α and tabulated.
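
As an illustration only, the following Python sketch works through the single-stage case (n = 1) of the model described above. It is not taken from the paper and does not reproduce its tabulated values. It assumes that "Pareto with parameters w and α" means the density α w^α / u^(α+1) for u ≥ w, with w as the scale and α as the shape. Under that assumption, an observation x from e0 updates the prior to a Pareto distribution with parameters max(w, x) and α + 1, and the expected value of one e0 observation is αw / (2(α − 1)) when α > 1. Comparing this with the mean 1/2 of e1 gives a one-stage threshold of 1 − 1/α. All function names are hypothetical.

```python
import math


def posterior(w, alpha, x):
    """Update an assumed Pareto(scale=w, shape=alpha) prior on u
    after observing x drawn from Uniform(0, u).

    The likelihood is (1/u) for u >= x, so the posterior is again
    Pareto, with scale max(w, x) and shape alpha + 1.
    """
    return max(w, x), alpha + 1.0


def expected_e0(w, alpha):
    """Expected value of a single e0 observation under the prior.

    E[x] = E[u] / 2 = alpha * w / (2 * (alpha - 1)) for alpha > 1.
    For alpha <= 1 the prior mean of u is infinite.
    """
    if alpha <= 1.0:
        return math.inf
    return alpha * w / (2.0 * (alpha - 1.0))


def one_stage_threshold(alpha):
    """Value of w at which e0 and e1 have equal one-stage expectation.

    Solving alpha * w / (2 * (alpha - 1)) = 1/2 gives w = 1 - 1/alpha.
    This covers only n = 1; it is not the general threshold of the paper.
    """
    return (alpha - 1.0) / alpha


def choose_one_stage(w, alpha):
    """Pick the experiment with the larger one-stage expectation."""
    return "e0" if w >= one_stage_threshold(alpha) else "e1"


if __name__ == "__main__":
    w, alpha = 0.6, 2.0
    print(choose_one_stage(w, alpha))   # compares 0.6 with 1 - 1/2
    print(posterior(w, alpha, 0.8))     # scale moves up to the observation
```

In this single-stage case the threshold 1 − 1/α increases with α, which agrees with the monotonicity in α stated in the abstract. For n > 1 the choice also has to account for what an e0 observation reveals about u, which this sketch does not attempt.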

© Japan Statistical Society