Abstract Details

Activity Number: 35 - Applications of Nonparametric Methods
Type: Contributed
Date/Time: Sunday, July 28, 2019 : 2:00 PM to 3:50 PM
Sponsor: Section on Nonparametric Statistics
Abstract #304318
Title: Randomized Allocation with Nonparametric Estimation for Contextual Multi-Armed Bandits with Delayed Rewards
Author(s): Sakshi Arya* and Yuhong Yang
Companies: University of Minnesota and University of Minnesota
Keywords: multi-armed bandit; rewards; histogram method; consistency; allocation; delay
Abstract:

Most multi-armed bandit settings assume that the reward for each arm is observed instantaneously. This is unrealistic, especially in a medical context, where a doctor usually has to treat several patients before observing the outcome for the first. In this work, we develop an allocation rule for the multi-armed bandit problem with covariates when rewards are observed with delay. We show that strong consistency can be established for the proposed allocation rule, using the histogram method for estimation, under reasonable restrictions on the probability distributions of the delays. These restrictions are mild: they allow some rewards never to be observed, as long as a certain proportion of rewards is obtained in finite time.
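
To make the setting concrete, the following is a minimal Python sketch of a randomized allocation rule with histogram estimates and delayed rewards. It is an illustration under stated assumptions, not the authors' rule: the class name HistogramBandit, the one-dimensional covariate in [0, 1), the equal-width bins, and the fixed exploration probability are all hypothetical choices made for this example. A reward whose delay is infinite is never observed, which mirrors the possibility of non-observance mentioned in the abstract.

```python
import random


class HistogramBandit:
    """Hypothetical sketch: covariate x in [0, 1), equal-width histogram bins."""

    def __init__(self, n_arms, n_bins, explore_prob, seed=0):
        self.n_arms = n_arms
        self.n_bins = n_bins
        self.explore_prob = explore_prob  # assumed fixed here for simplicity
        self.rng = random.Random(seed)
        # Per-arm, per-bin running sums and counts of OBSERVED rewards only.
        self.sums = [[0.0] * n_bins for _ in range(n_arms)]
        self.counts = [[0] * n_bins for _ in range(n_arms)]
        # Rewards generated but not yet observed: (arrival_time, arm, bin, reward).
        self.pending = []

    def bin_of(self, x):
        """Map a covariate in [0, 1) to its histogram bin index."""
        return min(int(x * self.n_bins), self.n_bins - 1)

    def estimate(self, arm, b):
        """Histogram estimate of the mean reward; 0.0 if the bin has no data."""
        n = self.counts[arm][b]
        return self.sums[arm][b] / n if n else 0.0

    def absorb(self, now):
        """Fold in every pending reward whose delay has elapsed by time `now`."""
        still_pending = []
        for arrival, arm, b, reward in self.pending:
            if arrival <= now:
                self.sums[arm][b] += reward
                self.counts[arm][b] += 1
            else:
                still_pending.append((arrival, arm, b, reward))
        self.pending = still_pending

    def choose(self, x):
        """With probability explore_prob pick a uniformly random arm;
        otherwise pick the arm with the largest estimate in x's bin."""
        if self.rng.random() < self.explore_prob:
            return self.rng.randrange(self.n_arms)
        b = self.bin_of(x)
        return max(range(self.n_arms), key=lambda a: self.estimate(a, b))

    def record(self, now, x, arm, reward, delay):
        """Queue a reward that becomes observable at time now + delay.
        Use delay=float('inf') for a reward that is never observed."""
        self.pending.append((now + delay, arm, self.bin_of(x), reward))


# Usage: a reward recorded at time 0 with delay 3 only affects choices from time 3 on.
bandit = HistogramBandit(n_arms=2, n_bins=4, explore_prob=0.0)
bandit.record(now=0, x=0.1, arm=1, reward=1.0, delay=3)
bandit.absorb(now=1)
early_choice = bandit.choose(0.1)  # estimates are still tied at 0.0
bandit.absorb(now=3)
late_choice = bandit.choose(0.1)   # arm 1 now has an observed reward
```

In this sketch, the estimates at any time depend only on rewards that have already arrived, so the allocation lags behind the data actually generated. A fuller treatment would presumably let the exploration probability and the bin width change with the sample size; those schedules are not given in the abstract and are left out here.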


Authors who are presenting talks have a * after their name.