
All Times EDT

Abstract Details

Activity Number: 442 - Contributed Poster Presentations: Section on Statistics and Data Science Education
Type: Contributed
Date/Time: Wednesday, August 10, 2022 : 10:30 AM to 12:20 PM
Sponsor: Section on Statistics and Data Science Education
Abstract #322982
Title: Selective Inference in Practice
Author(s): Anni Hong* and Arun K Kuchibhotla
Companies: Carnegie Mellon University - Statistics dept. and Carnegie Mellon University
Keywords: Selective Inference; Post-Selection Inference; Variable Selection; Variable Transformation; Multiple Testing; Reproducibility

The lack of replicability and reproducibility in research studies threatens the scientific community. We focus on selective inference as it is carried out in practice and on how it can contribute to unreliable research results. Researchers commonly select, or "cherry-pick," models based on the observed data and then construct confidence intervals or tests for the chosen model. Without a correction for the selection procedure, this practice invalidates asymptotic statistical guarantees and leads to higher false discovery rates than the researchers had assumed. Through simulation studies on a wide variety of publications, we demonstrate how standard selection procedures, namely variable selection (stepwise regression, LASSO, PCA) and variable transformation (Box-Cox, dichotomization of continuous variables), can invalidate statistical inference and result in higher false discovery rates. Acknowledging the necessity of model selection in practice, we outline practical remedies, such as sample splitting and other corrections for multiple testing, that guarantee valid post-selection inference.
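The effect described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' simulation design; it is a minimal toy example under stated assumptions: every predictor is pure noise, the noise variance is known to be 1 (so the z-test is exact), and "selection" means picking the predictor with the largest absolute z-statistic. The function names `z_stat` and `simulate` and all parameter values are hypothetical choices made for illustration.

```python
import random
from statistics import NormalDist


def z_stat(x, y):
    """z-statistic for the no-intercept slope of y on x.

    Assumes the noise in y is N(0, 1) with known variance, so under the
    null (y independent of x) this statistic is exactly standard normal.
    """
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    return sxy / sxx ** 0.5


def simulate(n=100, p=10, reps=1000, alpha=0.05, seed=0):
    """Return (naive, split) false rejection rates when all nulls are true."""
    rng = random.Random(seed)
    crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    half = n // 2
    naive_rej = split_rej = 0
    for _ in range(reps):
        # p noise predictors (stored column-wise) and a pure-noise response.
        X = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(p)]
        y = [rng.gauss(0, 1) for _ in range(n)]

        # Naive: pick the strongest predictor and test it on the same data.
        # The selected |z| is the maximum, so it rejects iff max |z| > crit.
        if max(abs(z_stat(col, y)) for col in X) > crit:
            naive_rej += 1

        # Sample splitting: select on the first half, test on the second.
        sel = [abs(z_stat(col[:half], y[:half])) for col in X]
        j = sel.index(max(sel))
        if abs(z_stat(X[j][half:], y[half:])) > crit:
            split_rej += 1
    return naive_rej / reps, split_rej / reps


if __name__ == "__main__":
    naive, split = simulate()
    print(f"naive false rejection rate: {naive:.3f}")
    print(f"split false rejection rate: {split:.3f}")
```

In this toy setting the naive rate should land far above the nominal 5% (about 40% would be expected if the ten statistics were independent), while the sample-splitting rate should stay near 5%, because the test half is independent of the data used for selection. The price of splitting is that only half the sample is used for the test, which reduces power when a real effect exists.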

Authors who are presenting talks have a * after their name.

Back to the full JSM 2022 program