Abstract:
|
Reinforcement learning (RL) has shown great success in estimating sequential treatment regimes (STRs) that account for patient heterogeneity. However, the health-outcome information used as the reward in RL methods is often not well coded and is instead embedded in clinical notes. Extracting outcome information is a resource-intensive task, so most available well-annotated cohorts are small. We propose a semi-supervised learning (SSL) approach that efficiently leverages a small labeled dataset with observed true outcomes and a large unlabeled dataset with outcome surrogates. In particular, we propose a semi-supervised, efficient approach to Q-learning and doubly robust off-policy value estimation. Generalizing SSL to STRs brings interesting challenges: 1) the feature distribution for Q-learning is unknown, as it includes previous outcomes; 2) the surrogate variables we leverage are predictive of the outcome but not informative about the optimal policy or value function. We provide theoretical results for our Q-function and value function estimators to quantify the degree of efficiency gained from SSL. In addition, our method is robust to misspecification of the imputation models.
|