Name: 2019 Joint Statistical Meetings
Start: 2019-07-27T07:00:00+00:00
End: 2019-08-01
Location: Colorado Convention Center

Activity Number:	213 - Sequential Decision Making and Causal Inference
Type:	Invited
Date/Time:	Monday, July 29, 2019 : 2:00 PM to 3:50 PM
Sponsor:	IMS
Abstract #300158
Title:	Mostly Exploration-Free Algorithms for Contextual Bandits
Author(s):	Mohsen Bayati*
Companies:	Stanford University
Keywords:	Contextual Bandits; Greedy algorithms
Abstract:	The contextual bandit literature has traditionally focused on algorithms that address the exploration-exploitation tradeoff. However, exploration-free greedy algorithms are desirable in practical settings where exploration may be costly or unethical (e.g., clinical trials). Surprisingly, we find that a simple greedy algorithm can be rate-optimal if there is sufficient randomness in the observed contexts. Furthermore, even absent this condition, we show that a greedy algorithm can be rate optimal with positive probability. Thus, standard bandit algorithms may unnecessarily explore. Motivated by these results, we introduce Greedy-First, a new algorithm that uses only observed contexts and rewards to determine whether to follow a greedy algorithm or to explore. We prove that this algorithm is rate-optimal without any additional assumptions on the context distribution or the number of arms. Extensive simulations demonstrate that Greedy-First successfully reduces exploration and outperforms existing (exploration-based) contextual bandit algorithms such as Thompson sampling or upper confidence bound (UCB)

Authors who are presenting talks have a * after their name.

JSM 2019 Online Program