Abstract Details

Activity Number: 165
Type: Topic Contributed
Date/Time: Monday, August 1, 2016 : 10:30 AM to 12:20 PM
Sponsor: Section on Statistics in Epidemiology
Abstract #319590
Title: Why Significant Variables Aren't Automatically Predictive
Author(s): Adeline Lo* and Herman Chernoff and Tian Zheng and Shaw-Hwa Lo
Companies: and Harvard University and Columbia University and Columbia University
Keywords: prediction ; GWAS ; significance ; interactions

A recent puzzle in the big data scientific literature is that an increase in the number of explanatory variables found to be significantly correlated with an outcome variable doesn't necessarily lead to improvements in prediction. This problem occurs in both simple and complex data. We offer explanations and statistical insights into why higher significance doesn't automatically imply stronger predictivity and why variables with strong predictivity sometimes fail to be significant. We suggest shifting the research agenda toward searching for a criterion that locates highly predictive variables rather than highly significant ones. We offer an alternative approach, the Partition Retention method, which was effective in reducing prediction error from 30% to 8% on a long-studied breast cancer data set.
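The gap between significance and predictivity described above can be illustrated with a small toy calculation. The sketch below is not the Partition Retention method and uses no data from the talk; the counts, the helper names two_proportion_z and best_accuracy, and both scenarios are assumptions made up for illustration. It contrasts a variable that is highly significant yet barely predictive (a tiny effect in a very large sample) with a variable that shows no marginal association yet predicts perfectly once combined with a second variable (an XOR-style interaction).

```python
import math

def two_proportion_z(successes_a, n_a, successes_b, n_b):
    """z statistic for H0: groups a and b have the same rate of y = 1."""
    p_a, p_b = successes_a / n_a, successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / se

def best_accuracy(cells):
    """In-sample accuracy of the rule that predicts the majority outcome
    within each cell. `cells` maps a predictor value to
    (count with y = 0, count with y = 1)."""
    total = sum(n0 + n1 for n0, n1 in cells.values())
    return sum(max(n0, n1) for n0, n1 in cells.values()) / total

# Scenario 1 (hypothetical counts): a weak effect in a huge sample.
# P(y = 1 | x = 1) = 0.52 versus P(y = 1 | x = 0) = 0.48.
weak = {0: (52_000, 48_000), 1: (48_000, 52_000)}
z_weak = two_proportion_z(52_000, 100_000, 48_000, 100_000)
acc_weak = best_accuracy(weak)

# Scenario 2 (hypothetical counts): y = x1 XOR x2, 250 cases per combination.
# Marginally, x1 carries no signal at all.
x1_only = {0: (250, 250), 1: (250, 250)}
z_x1 = two_proportion_z(250, 500, 250, 500)
acc_x1 = best_accuracy(x1_only)

# Jointly, (x1, x2) determines y exactly.
joint = {(0, 0): (250, 0), (0, 1): (0, 250),
         (1, 0): (0, 250), (1, 1): (250, 0)}
acc_joint = best_accuracy(joint)

print(f"weak effect: z = {z_weak:.1f}, accuracy = {acc_weak:.2f}")
print(f"x1 alone:    z = {z_x1:.1f}, accuracy = {acc_x1:.2f}")
print(f"x1 and x2:   accuracy = {acc_joint:.2f}")
```

In the first scenario the test statistic is far beyond any conventional threshold, yet the best possible rule is correct only 52% of the time. In the second, x1 would never pass a marginal significance screen, yet it is indispensable for prediction once x2 is considered alongside it. The accuracies here are computed on the same table that defines the rule, so they are best-case figures rather than out-of-sample estimates.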

Authors who are presenting talks have a * after their name.

Back to the full JSM 2016 program

Copyright © American Statistical Association