Abstract Details

Activity Number: 608
Type: Contributed
Date/Time: Wednesday, August 3, 2016 : 2:00 PM to 3:50 PM
Sponsor: Section on Statistical Learning and Data Science
Abstract #319302
Title: ThrEEBoost: Thresholded Boosting for Variable Selection and Prediction via Estimating Equations
Author(s): Benjamin Brown*
Keywords: correlation ; GEE ; thresholding ; variable selection ; boosting

Most variable selection techniques for high-dimensional models are designed for settings in which observations are independent and completely observed. In this paper, we present ThrEEBoost (Thresholded EEBoost), a general-purpose variable selection technique that accommodates "messy data" requiring an estimating equation by replacing the gradient of the loss with an estimating function. Thresholding affects the number of regression coefficients updated at each step, yielding new variable selection paths. We evaluated ThrEEBoost in simulation studies that assessed the effects of different thresholds on prediction error, sensitivity, and specificity under sparse and non-sparse true models with correlated continuous outcomes. We show that ThrEEBoost achieves prediction error similar to that of EEBoost when the true model is sparse, and lower than that of EEBoost when the true model is complex. The technique is illustrated on the problem of identifying predictors of weight change in a longitudinal nutrition study.
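
The abstract does not spell out the update rule, so the following is only a minimal sketch of how a thresholded estimating-equation boosting step might look. It rests on several assumptions that are not taken from the abstract: a linear mean model, a working-independence estimating function g(beta) = sum_i x_i (y_i - x_i . beta), a fixed step size, and a thresholding rule that updates every coefficient j with |g_j| >= threshold * max_k |g_k|. The names `estimating_function` and `thresholded_boost` are hypothetical.

```python
def estimating_function(X, y, beta):
    """Estimating function for a linear mean model under a working-independence
    correlation (an assumption of this sketch):
    g(beta) = sum_i x_i * (y_i - x_i . beta)."""
    p = len(beta)
    g = [0.0] * p
    for row, target in zip(X, y):
        resid = target - sum(b * x for b, x in zip(beta, row))
        for j in range(p):
            g[j] += row[j] * resid
    return g


def thresholded_boost(X, y, threshold=1.0, step=0.01, n_iter=500):
    """Boosting sketch with an assumed thresholding rule.

    threshold=1.0 moves only the coordinate(s) with the largest |g_j|;
    smaller thresholds move more coordinates per iteration.
    """
    p = len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        g = estimating_function(X, y, beta)
        g_max = max(abs(v) for v in g)
        if g_max == 0.0:
            break  # estimating equation is solved exactly
        for j in range(p):
            # Assumed rule: update coordinates whose estimating-function
            # magnitude is within the threshold fraction of the maximum.
            if g[j] != 0.0 and abs(g[j]) >= threshold * g_max:
                beta[j] += step if g[j] > 0 else -step
    return beta


# Toy data (made up for illustration): two orthogonal predictors.
X = [[1, 0], [-1, 0], [0, 1], [0, -1]]
y = [2, -2, 1, -1]

beta_strict = thresholded_boost(X, y, threshold=1.0, n_iter=50)
beta_relaxed = thresholded_boost(X, y, threshold=0.4, n_iter=50)
```

On this toy data, 50 iterations at threshold 1.0 move only the first coefficient, whereas threshold 0.4 also brings the second coefficient into the model from the first iteration. This is the sense in which, under the assumed rule, different thresholds trace different variable selection paths.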

Authors who are presenting talks have a * after their name.

Back to the full JSM 2016 program

Copyright © American Statistical Association