All Times EDT

Abstract Details

Activity Number: 125 - Practical Recommendations for Prediction Modeling That Advance Innovation
Type: Invited
Date/Time: Monday, August 8, 2022 : 10:30 AM to 12:20 PM
Sponsor: Section on Statistical Learning and Data Science
Abstract #319184
Title: Cross Validation for Small and Imbalanced Data
Author(s): Nathaniel O'Connell*
Companies: Wake Forest School of Medicine
Keywords: Machine Learning; Cross Validation; Random Forests; AUC; Statistical Learning; Prediction
Abstract:

The way data are split into training and validation sets during prediction model development is subject to variation, particularly when the data are small and imbalanced. Failing to account for this variability can reduce the validity and replicability of a prediction model and lead to erroneous conclusions when statistically comparing the performance metrics of competing prediction models. This presentation will compare existing cross-validation methods in the context of small and imbalanced datasets. Building on that comparison, we will discuss innovative strategies and statistical methods for comparing several prediction models in terms of performance metrics (e.g., AUC).
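The split-to-split variability described in the abstract can be illustrated with a minimal sketch. It is not the presenter's method: it assumes repeated stratified k-fold cross validation as one common way to expose that variability, uses a toy one-feature nearest-centroid scorer as a stand-in model, and runs on synthetic data. All function names (`auc`, `stratified_folds`, `repeated_cv_auc`) and parameter values are hypothetical and chosen only for illustration.

```python
import random
from statistics import mean, stdev


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 1/2."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(labels, k, rng):
    """Shuffle indices within each class, then deal them round-robin into k folds."""
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds


def repeated_cv_auc(x, y, k=4, repeats=20, seed=0):
    """Return one mean cross-validated AUC per repeat, each from a fresh stratified split."""
    rng = random.Random(seed)
    per_repeat = []
    for _ in range(repeats):
        fold_aucs = []
        for test_idx in stratified_folds(y, k, rng):
            held_out = set(test_idx)
            train_idx = [i for i in range(len(y)) if i not in held_out]
            # Toy model (assumption): class centroids estimated on the training folds only.
            mu_pos = mean(x[i] for i in train_idx if y[i] == 1)
            mu_neg = mean(x[i] for i in train_idx if y[i] == 0)
            # Higher score means closer to the positive centroid than to the negative one.
            scores = [abs(x[i] - mu_neg) - abs(x[i] - mu_pos) for i in test_idx]
            fold_aucs.append(auc([y[i] for i in test_idx], scores))
        per_repeat.append(mean(fold_aucs))
    return per_repeat


# Synthetic small, imbalanced data (assumption): 60 cases, 20% positive.
data_rng = random.Random(42)
y = [1] * 12 + [0] * 48
x = [data_rng.gauss(1.0 if label == 1 else 0.0, 1.0) for label in y]

results = repeated_cv_auc(x, y)
print(f"mean AUC over repeats: {mean(results):.3f}")
print(f"SD across repeats:     {stdev(results):.3f}")
```

The standard deviation across repeats reflects variation attributable to the split alone, since the data and the model are held fixed. Stratifying keeps at least one case of each class in every fold, which AUC requires; with a very rare class, the number of folds would need to be no larger than the minority class count.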


Authors who are presenting talks have a * after their name.

Back to the full JSM 2022 program