Cross-validation (CV) is ubiquitous in data science and is used for both model selection and model assessment. Yet in some respects it is poorly understood. In this talk we discuss three aspects of CV:
1) What CV estimates. 2) Confidence intervals for prediction error using nested CV. 3) Out-of-bag (OOB) error for random forests and standard error estimates.
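For readers who want a concrete reference point, here is a minimal sketch of ordinary K-fold CV for a least-squares fit, together with the naive standard error of the CV estimate. It is an illustration only, not the method presented in the talk: the function name `kfold_cv_error`, the simulated data, and the choice of squared-error loss are all assumptions made for this example. The naive standard error treats the per-observation losses as independent, which they are not, since the folds share training data.

```python
import numpy as np

def kfold_cv_error(X, y, n_folds=5, seed=0):
    """K-fold CV estimate of squared prediction error for least squares,
    plus the naive standard error of the per-observation losses.
    (Hypothetical helper written for this illustration.)"""
    rng = np.random.default_rng(seed)
    n = len(y)
    # Randomly partition the observation indices into n_folds groups.
    folds = np.array_split(rng.permutation(n), n_folds)
    losses = np.empty(n)
    for test_idx in folds:
        train_mask = np.ones(n, dtype=bool)
        train_mask[test_idx] = False
        # Fit on the training folds only.
        beta, *_ = np.linalg.lstsq(X[train_mask], y[train_mask], rcond=None)
        # Record the squared error on the held-out fold.
        losses[test_idx] = (y[test_idx] - X[test_idx] @ beta) ** 2
    cv_error = losses.mean()
    # Naive SE: treats the n losses as if they were independent.
    naive_se = losses.std(ddof=1) / np.sqrt(n)
    return cv_error, naive_se

# Simulated data (assumed for illustration): linear model with unit noise variance.
rng = np.random.default_rng(1)
X = np.column_stack([np.ones(200), rng.normal(size=(200, 3))])
y = X @ np.array([1.0, 2.0, -1.0, 0.5]) + rng.normal(size=200)

cv_error, naive_se = kfold_cv_error(X, y)
print(f"CV error: {cv_error:.3f} (naive SE: {naive_se:.3f})")
```

An interval of the form CV error plus or minus a multiple of this naive standard error is the usual starting point; the dependence between folds is one reason such an interval can be too narrow.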
This talk is dedicated to Leo Breiman and Colin Mallows.