Abstract Details

Activity Number: 601 - Recent Advances in Variable Selection for Linear and Nonlinear Models
Type: Topic Contributed
Date/Time: Thursday, August 1, 2019 : 8:30 AM to 10:20 AM
Sponsor: Biometrics Section
Abstract #304920 Presentation
Title: Optimized Variable Selection via Repeated Data Splitting
Author(s): Marinela Capanu* and Colin Begg and Mithat Gonen
Companies: Memorial Sloan Kettering Cancer Center and Memorial Sloan Kettering Cancer Center and Memorial Sloan Kettering Cancer Center
Keywords: variable selection; linear regression; data splitting; empirical threshold; variable screening

We introduce a new variable selection procedure that repeatedly splits the data into two sets, one for estimation and one for validation, to obtain an empirically optimized threshold, which is then used to screen variables for inclusion in the final regression model. In an extensive simulation study, we show that the proposed technique performs better than competing methods: it is among those with the lowest inclusion of noisy predictors, has the highest power to detect the correct model, and is unaffected by correlations among the predictors. We illustrate the methods by applying them to a cohort of patients undergoing hepatectomy at our institution.
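
The abstract does not give the algorithmic details, so the following is only a minimal sketch of one way a threshold could be tuned by repeated data splitting in linear regression. Everything specific here is an assumption made for illustration and not the authors' stated method: the use of absolute OLS t-statistics as the screening statistic, the 50/50 split, validation mean squared error as the criterion, the grid of candidate thresholds, and the function names (t_statistics, validation_mse, select_by_repeated_splitting).

```python
import numpy as np


def t_statistics(X, y):
    """Absolute OLS t-statistics for each column of X (an intercept is added)."""
    n, p = X.shape
    A = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    sigma2 = resid @ resid / (n - p - 1)
    se = np.sqrt(np.diag(sigma2 * np.linalg.pinv(A.T @ A)))
    return np.abs(beta[1:] / se[1:])


def validation_mse(X_est, y_est, X_val, y_val, keep):
    """Fit OLS on the estimation set using the kept columns; score on the validation set."""
    A_est = np.column_stack([np.ones(len(y_est)), X_est[:, keep]])
    A_val = np.column_stack([np.ones(len(y_val)), X_val[:, keep]])
    beta, *_ = np.linalg.lstsq(A_est, y_est, rcond=None)
    return float(np.mean((y_val - A_val @ beta) ** 2))


def select_by_repeated_splitting(X, y, thresholds, n_splits=50, seed=0):
    """Pick the threshold with the lowest average validation error, then screen on all data."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    errors = np.zeros((n_splits, len(thresholds)))
    for s in range(n_splits):
        perm = rng.permutation(n)
        est, val = perm[: n // 2], perm[n // 2:]   # assumed 50/50 split
        t_abs = t_statistics(X[est], y[est])
        for j, c in enumerate(thresholds):
            keep = t_abs > c                        # screen on the estimation half
            errors[s, j] = validation_mse(X[est], y[est], X[val], y[val], keep)
    best = thresholds[int(np.argmin(errors.mean(axis=0)))]
    selected = np.flatnonzero(t_statistics(X, y) > best)
    return best, selected


# Illustrative use on simulated data (hypothetical settings, not from the abstract).
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(200, 10))
y_demo = 2.0 * X_demo[:, :3].sum(axis=1) + rng.normal(size=200)
best_c, chosen = select_by_repeated_splitting(X_demo, y_demo, np.linspace(0.5, 4.0, 8))
print("threshold:", best_c, "selected columns:", chosen)
```

In this sketch the threshold is tuned only through out-of-sample prediction error, so the final screening step on the full data uses a cutoff that was never chosen by looking at the validation responses of the same split it was estimated on.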

Authors who are presenting talks have a * after their name.

Back to the full JSM 2019 program