Abstract Details

Activity Number: 7 - New Developments in Predictive Modeling of High-Dimensional Data
Type: Invited
Date/Time: Sunday, July 30, 2017 : 2:00 PM to 3:50 PM
Sponsor: Council of Chapters
Abstract #322293
Title: Comparing Correlated Component Regression with Lasso for Variable Selection in Logistic Regression with High-Dimensional Data
Author(s): Jay Magidson*
Companies: Statistical Innovations
Keywords: Regularization ; Variable Selection ; Correlated Component Regression ; Lasso ; Suppressor Variable ; Scale Invariance
Abstract:

Standard stepwise regression techniques that rely on p-values to select predictors often perform poorly with high-dimensional data, especially when the number of candidate predictors (P) is large and exceeds the number of cases (P > N). Alternatively, sparse regression techniques that employ regularization have been shown to yield reliable predictions with high-dimensional data. In particular, Correlated Component Regression (CCR), which is scale invariant, and the lasso are two such regression methods, both of which use cross-validation in place of p-values.
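
To make the lasso side of this comparison concrete, the following is a minimal sketch of an L1-penalized (lasso) logistic regression whose penalty strength is chosen by cross-validation rather than p-values, fit to simulated data with more candidate predictors than cases (P > N). It uses scikit-learn purely for illustration; the library, dimensions, seed, and sparsity pattern are assumptions, not the software or settings used in the study.

# Minimal sketch: cross-validated lasso (L1) logistic regression with P > N.
# Library (scikit-learn), dimensions, and seed are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegressionCV

rng = np.random.default_rng(0)
n, p = 100, 500                      # more candidate predictors than cases (P > N)
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:5] = 1.0                       # only a few predictors carry signal
y = rng.binomial(1, 1 / (1 + np.exp(-(X @ beta))))

# Cross-validation, rather than p-values, chooses the amount of regularization.
fit = LogisticRegressionCV(
    Cs=20, cv=5, penalty="l1", solver="liblinear", scoring="neg_log_loss"
).fit(X, y)

selected = np.flatnonzero(fit.coef_.ravel())
print("number of predictors retained by lasso:", selected.size)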

As pointed out by Magidson (2013), when present, suppressor variables are often among the most important predictors in a regression. In this presentation, we simulate high-dimensional data under the assumptions of 2-group linear discriminant analysis (LDA), where one of the important predictors is a suppressor variable. In evaluating the results, we find that CCR outperforms lasso in part because CCR is much more likely than lasso to include the suppressor variable among the final model predictors. We discuss unique features built into the CCR approach that might explain why this difference occurs.
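
As an illustration of the kind of data-generating setup described above, the sketch below simulates two-group LDA data in which one predictor has identical group means (no marginal signal) but is correlated with a signal predictor, so it receives a nonzero weight only in the multivariate discriminant function; this is a generic construction of a suppressor variable. All dimensions, correlations, and the seed are illustrative assumptions rather than the simulation design actually used in the presentation.

# Minimal sketch of a 2-group LDA simulation with one suppressor variable.
# All parameter values are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(1)
n_per_group, p = 50, 100

# Group means: predictor 0 separates the groups; predictor 1 has no mean
# difference (zero marginal signal) but will act as a suppressor.
mu0 = np.zeros(p)
mu1 = np.zeros(p)
mu1[0] = 1.0

# Common covariance: the suppressor (column 1) is correlated with the signal
# predictor (column 0), which gives it a nonzero coefficient in the
# population discriminant function Sigma^{-1} (mu1 - mu0).
Sigma = np.eye(p)
Sigma[0, 1] = Sigma[1, 0] = 0.7

X = np.vstack([
    rng.multivariate_normal(mu0, Sigma, n_per_group),
    rng.multivariate_normal(mu1, Sigma, n_per_group),
])
y = np.repeat([0, 1], n_per_group)

# Population discriminant coefficients: the suppressor gets a nonzero weight
# even though its group means are identical.
coef = np.linalg.solve(Sigma, mu1 - mu0)
print("discriminant coefficients for predictors 0 and 1:", coef[:2])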


Authors who are presenting talks have a * after their name.

