Abstract Details

Activity Number: 606
Type: Contributed
Date/Time: Wednesday, August 3, 2016 : 2:00 PM to 3:50 PM
Sponsor: Section on Statistics in Epidemiology
Abstract #320330
Title: Collaborative Targeted Learning for Large-Scale and High-Dimensional Data
Author(s): Cheng Ju* and Mark van der Laan and Susan Gruber and Jessica Franklin and Richard Wyss and Wesley Eddings and Sebastian Schneeweiss
Companies: University of California at Berkeley and University of California at Berkeley and Harvard T.H. Chan School of Public Health and Brigham and Women's Hospital and Brigham and Women's Hospital and Brigham and Women's Hospital and Brigham and Women's Hospital
Keywords: causal inference ; cross-validation ; model selection ; variable importance ; targeted maximum likelihood estimation

The collaborative double robust targeted maximum likelihood estimator (C-TMLE) is an extension of the targeted maximum likelihood estimator (TMLE) that pursues an optimal strategy for estimating the nuisance parameter required in the targeting step. We consider estimation of the average causal effect from an observational study in which, for each unit, we observe baseline covariates, a binary treatment, and an outcome of interest. A C-TMLE that selects variables for the nuisance parameter estimator by a greedy forward stepwise procedure was proposed by van der Laan and Gruber (2010). This C-TMLE was shown to outperform a standard TMLE when there are variables that are highly correlated with treatment but non-predictive of the outcome. However, its computational complexity is quadratic in the number of variables, which makes this particular C-TMLE not scalable to large-scale and high-dimensional data. In this article, we propose several scalable versions of C-TMLE: instead of running a greedy search at each step, they follow an easy-to-compute, data-adaptive ordering of the variables. Simulations are provided to illustrate the performance of these scalable C-TMLEs relative to current competitors.
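
The difference in cost between a greedy search and a pre-computed ordering can be illustrated with a minimal Python sketch. It is not the authors' C-TMLE: the function names, the `score` callback, and the toy weights are all hypothetical, and the callback merely stands in for whatever loss a real procedure would evaluate after fitting a candidate model. The sketch only counts how many candidate models each strategy has to evaluate for p variables.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical stand-in for "fit a candidate model and return its loss".
Score = Callable[[Sequence[int]], float]


def greedy_forward(p: int, score: Score) -> Tuple[List[int], int]:
    """Greedy forward selection: each step tries every remaining variable."""
    selected: List[int] = []
    remaining = list(range(p))
    evaluations = 0
    while remaining:
        best_var, best_score = remaining[0], float("inf")
        for v in remaining:
            s = score(selected + [v])  # one candidate fit
            evaluations += 1
            if s < best_score:
                best_var, best_score = v, s
        selected.append(best_var)
        remaining.remove(best_var)
    return selected, evaluations


def preordered_forward(order: Sequence[int], score: Score) -> Tuple[List[int], int]:
    """Add variables in a fixed, pre-computed order: one fit per step."""
    selected: List[int] = []
    evaluations = 0
    for v in order:
        selected.append(v)
        score(selected)  # one candidate fit
        evaluations += 1
    return selected, evaluations


# Toy loss (an assumption for illustration): lower is better, and each
# variable contributes a fixed amount regardless of the others.
TOY_WEIGHTS = [0.1, 0.5, 0.3, 0.9, 0.2]


def toy_score(subset: Sequence[int]) -> float:
    return -sum(TOY_WEIGHTS[v] for v in subset)
```

With p variables, the greedy version evaluates p + (p - 1) + ... + 1 = p(p + 1)/2 candidates, while the pre-ordered version evaluates p. This is the quadratic versus linear gap the abstract refers to. In this toy setting, the ordering obtained by sorting variables by weight matches the greedy path exactly. For a real loss, the two paths need not agree.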

Authors who are presenting talks have a * after their name.

Back to the full JSM 2016 program
