Abstract:
|
Accurate estimates of internal validity are important to guide both model selection and decisions about whether and how a prediction model should be used in clinical practice. Split-sample validation, in which the entire available sample is randomly divided into subsets used exclusively for model estimation (“training”) or validation (“testing”), is commonly used for internal validation. However, reserving only a fraction of the available observations for training and the remainder for validation reduces the statistical power of both tasks. Internal validation methods that use the entire available dataset for both model estimation and validation have been proposed, including cross-validation and bootstrap optimism correction. Demonstrations of bootstrap optimism correction for continuous risk prediction have so far been limited to logistic regression models predicting relatively common events with a small number of predictors. In this presentation, we compare internal validation methods for a random forest model estimating the risk of a very rare event: suicide following an outpatient mental health visit.
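For context, the following is a minimal sketch of how bootstrap optimism correction might be applied to a random forest classifier, assuming scikit-learn and numpy; the function name, the use of AUC as the performance measure, and the arguments X, y, and n_boot are illustrative choices and are not taken from the presentation itself.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
def optimism_corrected_auc(X, y, n_boot=200, random_state=0):
    # X: numpy array of predictors, y: binary outcome array (assumed shapes).
    rng = np.random.default_rng(random_state)
    n = len(y)
    # Apparent performance: fit and evaluate on the full sample.
    model = RandomForestClassifier(n_estimators=500, random_state=random_state)
    model.fit(X, y)
    apparent_auc = roc_auc_score(y, model.predict_proba(X)[:, 1])
    optimism = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # bootstrap resample with replacement
        Xb, yb = X[idx], y[idx]
        if len(np.unique(yb)) < 2:
            continue  # with a very rare outcome, a resample may lack events
        boot_model = RandomForestClassifier(n_estimators=500, random_state=random_state)
        boot_model.fit(Xb, yb)
        auc_boot = roc_auc_score(yb, boot_model.predict_proba(Xb)[:, 1])
        auc_orig = roc_auc_score(y, boot_model.predict_proba(X)[:, 1])
        optimism.append(auc_boot - auc_orig)  # per-replicate optimism
    # Optimism-corrected estimate: apparent performance minus average optimism.
    return apparent_auc - np.mean(optimism)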
|