Abstract
Several decision problems, such as bandit problems, can be regarded as special sequential two-action Markov decision models, as described in [2]. In this paper, a uniform two-armed bandit problem with one known arm is studied by embedding it in the general framework developed in [2]. Two cases of this problem are examined. In the first case, one end point of the uniformity interval of the unknown arm is assumed to be Pareto distributed. In the second case, the joint distribution of the two end points of the uniformity interval of the unknown arm is bilateral Pareto. The results extend and complete those obtained in [6, 7].