Targeted Learning of the Optimal Dynamic Treatment and Statistical Inference for the Mean Outcome Under the Optimal Dynamic Treatment
*Mark van der Laan, UC Berkeley
Keywords: cross-validation, dynamic treatment, efficient, super-learning, targeted minimum loss-based estimation
Suppose we observe n independent and identically distributed observations of a time-dependent random variable consisting of baseline covariates, an initial treatment and censoring indicator, intermediate covariates, a subsequent treatment and censoring indicator, and a final outcome. For example, such data could be generated by a sequentially randomized controlled trial in which subjects are randomized first to a first-line treatment and then to a second-line treatment, the latter possibly assigned in response to an intermediate biomarker, and in which subjects are subject to right-censoring. We define a super-learner (an ensemble learner based on cross-validation) of the optimal dynamic multiple time-point treatment rule, defined as the rule that maximizes the mean outcome under the dynamic treatment, where the candidate rules are restricted to respond only to a user-supplied subset of the baseline and intermediate covariates. In addition, we provide a targeted minimum loss-based estimator of the mean outcome under the optimal rule, with corresponding statistical inference. Our work is carried out under a nonparametric model that makes assumptions, at most, on the censoring/treatment mechanism.
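
To make the target quantity concrete, the following is a minimal sketch of what "the mean outcome under a dynamic treatment rule" means in a simulated two time-point sequentially randomized trial. It is not the super-learner or the targeted minimum loss-based estimator described above. It uses a plain inverse probability weighted (IPW) average and picks the best rule from a small hand-written candidate set. Everything in it is an assumption made for illustration: the data-generating process, the known randomization probability of 0.5 at both time points, the absence of censoring, and the names simulate_trial, ipw_value, CANDIDATE_RULES, and P_TREAT.

```python
import random

# Assumed known randomization probability at both time points (illustrative).
P_TREAT = 0.5


def simulate_trial(n, seed=0):
    """Simulate n subjects from a hypothetical two-stage randomized trial.

    Each record is (l0, a0, l1, a1, y): baseline covariate, first treatment,
    intermediate biomarker, second treatment, binary final outcome.
    Censoring is omitted to keep the sketch short.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        l0 = rng.random()                    # baseline covariate
        a0 = int(rng.random() < P_TREAT)     # first-line treatment
        l1 = 0.5 * l0 + 0.5 * rng.random()   # intermediate biomarker
        a1 = int(rng.random() < P_TREAT)     # second-line treatment
        # The outcome is more likely when each treatment matches the
        # covariate-dependent "ideal" choice (an assumption of this toy model).
        p_success = (0.2
                     + 0.3 * (a0 == int(l0 > 0.5))
                     + 0.3 * (a1 == int(l1 > 0.5)))
        y = int(rng.random() < p_success)
        data.append((l0, a0, l1, a1, y))
    return data


# A rule is a pair (d0, d1): d0 maps the baseline covariate to a first
# treatment, d1 maps the intermediate biomarker to a second treatment.
CANDIDATE_RULES = {
    "never_treat": (lambda l0: 0, lambda l1: 0),
    "always_treat": (lambda l0: 1, lambda l1: 1),
    "threshold": (lambda l0: int(l0 > 0.5), lambda l1: int(l1 > 0.5)),
}


def ipw_value(data, rule):
    """IPW estimate of the mean outcome had everyone followed the rule."""
    d0, d1 = rule
    # Probability of following any deterministic rule at both time points
    # is P_TREAT * P_TREAT here, because P_TREAT is 0.5 for either arm.
    weight = 1.0 / (P_TREAT * P_TREAT)
    total = 0.0
    for l0, a0, l1, a1, y in data:
        if a0 == d0(l0) and a1 == d1(l1):
            total += weight * y
    return total / len(data)


data = simulate_trial(20000, seed=0)
values = {name: ipw_value(data, rule) for name, rule in CANDIDATE_RULES.items()}
best_rule = max(values, key=values.get)
print(best_rule, values)
```

By construction of this toy model, the true mean outcome is 0.8 under the threshold rule and 0.5 under either static rule, so the IPW estimates should land near those values. The estimators in the abstract go further in two ways that this sketch does not attempt: they learn the rule from data by cross-validated ensembling instead of choosing from a fixed list, and they provide statistical inference for the mean outcome under the learned rule.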