Abstract Details

Activity Number: 251
Type: Contributed
Date/Time: Monday, August 4, 2014 : 2:00 PM to 3:50 PM
Sponsor: Section on Statistical Learning and Data Mining
Abstract #313528
Title: An Investigation into the Effect of Selection Bias on Multiple Biomarker Models: A Simulation Study
Author(s): Tristan Grogan*+ and David Elashoff
Companies: University of California, Los Angeles and University of California, Los Angeles
Keywords: stepwise ; logistic regression ; variable selection ; AIC ; AUC ; selection bias
Abstract:

Background: Biomarker studies often use a panel of predictor variables to discriminate between cases and controls (e.g., cancer versus benign samples). The goal is generally to identify a subset of markers that distinguishes the groups by maximizing an accuracy measure such as the area under the ROC curve (AUC). Automated variable selection techniques are attractive because they are easy to implement and the pool of candidate predictors is typically large. However, these techniques can produce overoptimistic models that do not generalize well to external samples.

Methods: Logistic regression models were fit to simulated data while varying the sample size, the number of markers, the degree of correlation between markers, and the variable selection technique. The AUC was used to assess the classification ability of the models and to quantify the effect of selection bias.
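The kind of simulation described above can be illustrated with a minimal sketch. It is not the authors' design. To stay dependency-free it swaps stepwise logistic regression for a much simpler selection rule (keep the single marker with the best apparent AUC), and it uses uncorrelated pure-noise markers, so the true AUC of any selected marker is 0.5. All function names, sample sizes, and the seed are hypothetical choices made for illustration.

```python
import random
from statistics import mean


def auc(scores, labels):
    """AUC as the fraction of case/control pairs ranked correctly (ties count 1/2)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


def simulate_null(n, n_markers, rng):
    """Balanced labels plus pure-noise markers: no marker truly discriminates."""
    labels = [i % 2 for i in range(n)]
    markers = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(n_markers)]
    return markers, labels


def one_replicate(n, n_markers, rng):
    """Select the best marker on a training sample, then score it externally."""
    train_x, train_y = simulate_null(n, n_markers, rng)
    test_x, test_y = simulate_null(n, n_markers, rng)

    best = None
    for j, column in enumerate(train_x):
        a = auc(column, train_y)
        # Allow either direction of association, as a fitted model would.
        sign = 1 if a >= 0.5 else -1
        a = max(a, 1.0 - a)
        if best is None or a > best[0]:
            best = (a, j, sign)

    apparent, j, sign = best
    # Apply the selected marker and direction, unchanged, to the external sample.
    external = auc([sign * s for s in test_x[j]], test_y)
    return apparent, external


def selection_bias(n=40, n_markers=20, n_reps=200, seed=1):
    """Average apparent and external AUC over repeated simulated studies."""
    rng = random.Random(seed)
    results = [one_replicate(n, n_markers, rng) for _ in range(n_reps)]
    return mean(a for a, _ in results), mean(e for _, e in results)


if __name__ == "__main__":
    apparent, external = selection_bias()
    print(f"mean apparent AUC: {apparent:.3f}")
    print(f"mean external AUC: {external:.3f}")
```

The gap between the two averages is one way to quantify selection bias in this toy setting. A fuller version would add correlated markers, multi-marker models, and the selection techniques compared in the study.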

Conclusions: Selection bias can be a substantial problem when using automatic variable selection techniques, even with a relatively modest number of predictors. We emphasize the importance of external validation and more careful model building to help combat the overoptimism created by automatic variable selection.


Authors who are presenting talks have a * after their name.
