All Times EDT

Abstract Details

Activity Number: 440 - SLDS CSpeed 8
Type: Contributed
Date/Time: Thursday, August 12, 2021 : 4:00 PM to 5:50 PM
Sponsor: Section on Statistical Learning and Data Science
Abstract #319080
Title: Posterior Sampling Algorithms for Sequential Decision-Making Based on Partially Observed Data
Author(s): Hongju Park* and Mohamad Kazem Shirani Faradonbeh
Companies: University of Georgia and University of Georgia
Keywords: Bayesian Decision-Making; Contextual Bandits; Exploration-Exploitation; Imperfect Observations; Posterior Sampling; Reinforcement Learning
Abstract:

Reinforcement learning algorithms for decision-making under uncertainty are an important line of research in data science. The broad objective is to design effective experiments and to learn from the resulting data which actions maximize the reward. A classical model is the contextual bandit, in which the reward depends on an unknown parameter as well as on the contexts corresponding to the available actions. A popular algorithm for this setting is posterior sampling; it is easy to implement and enjoys theoretical performance guarantees. At every time step, the algorithm updates a posterior belief about the unknown parameter, acts as if a sample drawn from the posterior were the true parameter, collects the resulting data, and repeats these steps sequentially. The problem is well studied under the assumption of perfect observations, but implementable algorithms with performance guarantees are not currently available for partially observed data. We introduce a novel Bayesian decision-making algorithm consisting of multi-layered posterior sampling steps. Theoretical analyses, together with numerical illustrations of the performance of the proposed algorithm, will be presented.
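
For readers unfamiliar with the posterior sampling loop described in the abstract, the following is a minimal sketch of the standard, fully observed case only. It is not the multi-layered algorithm proposed by the authors, whose details are not given here. The sketch assumes a linear reward model with Gaussian noise of known standard deviation and a Gaussian prior on the unknown parameter; the function name `run_posterior_sampling`, its parameters, and the simulated contexts are all illustrative choices rather than anything taken from the talk.

```python
import numpy as np

def run_posterior_sampling(theta_true, n_steps=500, n_arms=5,
                           noise_sd=0.5, prior_var=1.0, seed=0):
    """Posterior (Thompson) sampling for a linear contextual bandit.

    Assumptions (illustrative, not from the abstract): contexts are fully
    observed, reward = context @ theta + Gaussian noise with known noise_sd,
    and the prior on theta is N(0, prior_var * I).
    """
    rng = np.random.default_rng(seed)
    d = theta_true.shape[0]
    precision = np.eye(d) / prior_var   # posterior precision matrix
    b = np.zeros(d)                     # precision-weighted mean accumulator
    regret = np.zeros(n_steps)

    for t in range(n_steps):
        # One context vector per available action (simulated here).
        contexts = rng.normal(size=(n_arms, d))

        # Step 1: current Gaussian posterior belief about theta.
        cov = np.linalg.inv(precision)
        cov = (cov + cov.T) / 2.0       # guard against round-off asymmetry
        mean = cov @ b

        # Step 2: draw a sample and act as if it were the true parameter.
        theta_sample = rng.multivariate_normal(mean, cov)
        arm = int(np.argmax(contexts @ theta_sample))

        # Step 3: collect the resulting reward.
        x = contexts[arm]
        reward = x @ theta_true + noise_sd * rng.normal()

        # Step 4: conjugate Bayesian update of the posterior.
        precision += np.outer(x, x) / noise_sd**2
        b += x * reward / noise_sd**2

        # Track regret against the best arm under the true parameter.
        expected = contexts @ theta_true
        regret[t] = expected.max() - expected[arm]

    posterior_mean = np.linalg.solve(precision, b)
    return posterior_mean, regret
```

Calling `run_posterior_sampling(np.array([1.0, -0.5, 0.3]))` returns the final posterior mean and the per-step regret. In the partially observed setting the abstract targets, the contexts themselves would not be directly available to the learner, which is the difficulty this simple sketch does not address.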


Authors who are presenting talks have a * after their name.

Back to the full JSM 2021 program