
All Times EDT

Abstract Details

Activity Number: 455 - Learning Under Nonstationarity
Type: Invited
Date/Time: Wednesday, August 10, 2022, 2:00 PM to 3:50 PM
Sponsor: Section on Statistical Learning and Data Science
Abstract #319272
Title: Bandit Learning with Endogenous Drift
Author(s): Assaf Zeevi*
Companies: Columbia University
Keywords: bandits; non-stationary; regret; stochastic
Abstract:

The multi-armed bandit (MAB) problem is a widely studied machine learning paradigm encapsulating the tension between exploration and exploitation in online decision making. In the classical stochastic MAB problem, arm reward distributions are fixed throughout the horizon of play, hence the notion of the "optimal arm" is time-invariant. Needless to say, this stationarity assumption is violated in many real-world applications. In this talk we consider a specific instance of bandit learning in which the arm statistics change endogenously. Our model is motivated by online service and matching platforms where arms represent platform participants who are prone to abandonment when not "sampled." This primitive injects non-stationarity into the arm population, which impedes the learnability of the "best arm" and necessitates learning policies that adapt to said changes. We will illustrate some of the subtleties of this problem setting and discuss limits of achievable performance as well as classes of algorithms that are nearly regret-optimal.
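To make the abandonment dynamic concrete, the following is a minimal simulation sketch of the kind of environment the abstract describes, not the authors' model or algorithm: each arm pays a fixed Bernoulli reward when pulled, but any arm left idle in a round leaves the platform with some probability. All names and parameters (abandon_prob, eps, horizon) are illustrative assumptions, and the policy is a plain epsilon-greedy rule used only to show how the shrinking arm pool interacts with learning.

import random

def simulate(means, abandon_prob=0.01, horizon=5000, eps=0.1, seed=0):
    """Epsilon-greedy play over a pool of arms that may abandon when idle."""
    rng = random.Random(seed)
    active = list(range(len(means)))   # arms still on the platform
    counts = [0] * len(means)          # pulls per arm
    sums = [0.0] * len(means)          # cumulative reward per arm
    total = 0.0

    for _ in range(horizon):
        if not active:                 # every arm has abandoned
            break
        # Explore with probability eps, otherwise exploit the best empirical mean
        # (unsampled arms get priority via the infinite default).
        if rng.random() < eps:
            arm = rng.choice(active)
        else:
            arm = max(active, key=lambda a: sums[a] / counts[a] if counts[a] else float("inf"))
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        total += reward
        # Endogenous drift: each idle arm may leave, shrinking the learnable pool.
        active = [a for a in active if a == arm or rng.random() >= abandon_prob]

    return total

if __name__ == "__main__":
    print(simulate(means=[0.3, 0.5, 0.7]))

Under this dynamic, spending too long estimating a mediocre arm risks losing better arms to abandonment, which is the exploration-exploitation tension the talk examines.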


Authors who are presenting talks have a * after their name.
