All Times EDT

Abstract Details

Activity Number: 59 - Invited E-Poster Session I
Type: Invited
Date/Time: Sunday, August 8, 2021 : 5:45 PM to 6:30 PM
Sponsor: ENAR
Abstract #317250
Title: Post-Prediction Inference: A Wide Open Opportunity for Statisticians
Author(s): Siruo Wang and Tyler McCormick and Jeffrey Leek*
Companies: Johns Hopkins University and University of Washington and Johns Hopkins Bloomberg School of Public Health
Keywords: post-prediction inference; post-selection inference; machine learning; bootstrap
Abstract:

Machine learning is now being used across the entire scientific enterprise. Researchers commonly use the predictions from random forests or deep neural networks in downstream statistical analysis as if they were observed data. We show that this approach can lead to extreme bias and uncontrolled variance in downstream statistical models. We propose a statistical adjustment to correct biased inference in regression models using predicted outcomes—regardless of the machine-learning model used to make those predictions.

This is also the first crack at a big open problem in statistics: what do we do with machine-learned outcomes, covariates, or both? I think there is a ton for (bio)statistics students to sink their teeth into as well!
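
As a rough illustration of the bias the abstract describes, here is a minimal simulation sketch. It is not the authors' adjustment, and everything in it is an assumption made for illustration: the data-generating model (y = x + 2z + noise, with corr(x, z) = 0.5), the stand-in "machine-learning model" (a least-squares fit that only sees z), and the helper names `ols_slope` and `simulate`. It compares the downstream regression slope on x when using observed outcomes versus predicted outcomes.

```python
import random


def ols_slope(x, y):
    """Slope b of the least-squares line y ~ a + b*x."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


def simulate(n, rng):
    """Hypothetical data: corr(x, z) = 0.5 and y = x + 2*z + noise."""
    xs, zs, ys = [], [], []
    for _ in range(n):
        x = rng.gauss(0, 1)
        z = 0.5 * x + (0.75 ** 0.5) * rng.gauss(0, 1)
        y = x + 2 * z + rng.gauss(0, 1)
        xs.append(x)
        zs.append(z)
        ys.append(y)
    return xs, zs, ys


rng = random.Random(0)

# Training set: fit the stand-in predictor, which only has access to z.
x_tr, z_tr, y_tr = simulate(20000, rng)
b = ols_slope(z_tr, y_tr)
a = sum(y_tr) / len(y_tr) - b * sum(z_tr) / len(z_tr)

# Analysis set: outcomes are predicted rather than observed.
x_te, z_te, y_te = simulate(20000, rng)
y_hat = [a + b * z for z in z_te]

# Downstream inference: regress the outcome on x.
slope_observed = ols_slope(x_te, y_te)    # uses the true outcomes
slope_predicted = ols_slope(x_te, y_hat)  # treats predictions as data

print(f"slope using observed outcomes:  {slope_observed:.3f}")
print(f"slope using predicted outcomes: {slope_predicted:.3f}")
```

Under these assumed settings the population slope of y on x is 2.0, while the slope of the predicted outcome on x is 1.25, so the naive analysis is biased toward zero no matter how large the sample is. A different predictor or data-generating model would give a different size and direction of bias.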


Authors who are presenting talks have a * after their name.

Back to the full JSM 2021 program