Name: 2022 Joint Statistical Meetings
Start: 2022-08-06T07:00:00+00:00
End: 2022-08-11
Location: Walter E. Washington Convention Center

All Times EDT

Abstract Details

Activity Number:	244 - Advances in Statistical Machine Learning
Type:	Contributed
Date/Time:	Tuesday, August 9, 2022 : 8:30 AM to 10:20 AM
Sponsor:	IMS
Abstract #322723
Title:	On Well-Posedness and Minimax Optimal Rates of Nonparametric Q-Function Estimation in Off-Policy Evaluation
Author(s):	Zhengling Qi*
Companies:	George Washington University
Keywords:	reinforcement learning; off-policy evaluation; minimax-optimal; sieve estimation
Abstract:	We study the off-policy evaluation (OPE) problem in an infinite-horizon Markov decision process with continuous states and actions. We recast Q-function estimation as a special form of the nonparametric instrumental variables (NPIV) estimation problem. We first show that, under one mild condition, the NPIV formulation of Q-function estimation is well-posed in the sense of the L2-measure of ill-posedness with respect to the data-generating distribution, bypassing a strong assumption on the discount factor γ imposed in the recent literature to obtain L2 convergence rates of various Q-function estimators. Thanks to this well-posedness property, we derive the first minimax lower bounds for the convergence rates of nonparametric estimation of the Q-function and its derivatives in both sup-norm and L2-norm, which are shown to be the same as those for classical nonparametric regression (Stone, 1982). We then propose a sieve two-stage least squares estimator and establish its rate-optimality in both norms under some mild conditions.
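
The abstract does not spell out the estimator, so the following is only an illustrative sketch of how a sieve two-stage least squares fit of a Q-function could look, not the author's implementation. It assumes the standard Bellman moment condition E[R + γ Q(S', π(S')) − Q(S, A) | S, A] = 0 for a deterministic target policy π, and approximates Q(s, a) by a linear sieve psi(s, a) @ c. The function name `sieve_2sls_q`, the bases `psi` and `b`, and the toy data-generating process are all hypothetical choices made for this example.

```python
import numpy as np

def sieve_2sls_q(S, A, R, S_next, policy, gamma, psi, b):
    """Illustrative sieve 2SLS fit of Q(s, a) ~ psi(s, a) @ c (assumed form)."""
    Psi = psi(S, A)                         # n x J sieve basis at (S, A)
    Psi_next = psi(S_next, policy(S_next))  # basis at (S', pi(S'))
    Gamma = Psi - gamma * Psi_next          # plays the role of the regressors
    B = b(S, A)                             # n x K instrument basis, K >= J
    # Stage 1: project the regressors onto the instrument space.
    Gamma_hat = B @ np.linalg.lstsq(B, Gamma, rcond=None)[0]
    # Stage 2: regress rewards on the projected regressors.
    return np.linalg.lstsq(Gamma_hat, R, rcond=None)[0]

# Toy check (assumed setup): S' = rho*S + noise, R = S + A, target policy pi(s) = 0.
# Here Q(s, a) = s / (1 - gamma*rho) + a exactly, so a linear sieve suffices.
rng = np.random.default_rng(0)
n, gamma, rho = 5000, 0.9, 0.5
S = rng.normal(size=n)
A = rng.normal(size=n)
R = S + A
S_next = rho * S + 0.1 * rng.normal(size=n)

policy = lambda s: np.zeros_like(s)
psi = lambda s, a: np.column_stack([np.ones_like(s), s, a])
b = lambda s, a: np.column_stack([np.ones_like(s), s, a, s**2, a**2, s * a])

c_hat = sieve_2sls_q(S, A, R, S_next, policy, gamma, psi, b)
c_true = np.array([0.0, 1.0 / (1.0 - gamma * rho), 1.0])
print("estimated coefficients:", c_hat)
print("true coefficients:     ", c_true)
```

In this toy setting the true Q-function lies in the span of the sieve, so the fitted coefficients should land close to the true ones. In a genuinely nonparametric problem the sieve dimension would have to grow with the sample size, which is where the convergence rates discussed in the abstract come in.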

Authors who are presenting talks have a * after their name.

American Statistical Association