Abstract:
|
Modern learning algorithms are often seen as prediction-only tools, meaning that the interpretability provided by traditional models is sacrificed for predictive accuracy. We argue that this black-box perspective need not hold, and develop formal statistical inference procedures for predictions generated by supervised learning ensembles. Ensemble methods based on bootstrapping, such as random forests, usually improve the accuracy and stability of individual trees, but fail to provide a framework in which distributional results can be easily determined. Instead of aggregating full bootstrap samples, we consider a general resampling framework in which predictions are averaged over trees built on subsamples, and demonstrate that the resulting estimator belongs to an extended class of U-statistics. We develop a corresponding central limit theorem allowing for confidence intervals to accompany predictions, as well as formal hypothesis tests for feature significance and additivity. Moreover, the internal estimation method we suggest allows these inference procedures to be carried out at no additional computational cost. Demonstrations are provided on the eBird citizen science data.
|
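
As a rough illustration of the procedure the abstract describes, the sketch below builds a subsampled tree ensemble and attaches a CLT-based confidence interval to a single prediction. This is a minimal sketch, not the authors' implementation: the function name `subsampled_ensemble_ci`, the parameters `k`, `n_z`, and `n_mc`, and the synthetic data are all illustrative assumptions. It follows the internal-estimation idea of grouping trees by a shared "anchor" observation and plugs the resulting components into the standard two-term U-statistic variance approximation.

```python
import numpy as np
from scipy.stats import norm
from sklearn.tree import DecisionTreeRegressor

def subsampled_ensemble_ci(X, y, x0, k=50, n_z=25, n_mc=100, alpha=0.05, seed=0):
    """Predict at x0 with a subsampled tree ensemble and attach a CI (sketch).

    For each of n_z 'anchor' observations, grow n_mc trees on subsamples of
    size k that all contain that anchor. The variance of the anchor-wise mean
    predictions estimates zeta_1, and the overall variance of single-tree
    predictions estimates zeta_k; both plug into the two-term U-statistic
    variance approximation (k^2 / n) * zeta_1 + zeta_k / m.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    preds = np.empty((n_z, n_mc))
    for i in range(n_z):
        anchor = rng.integers(n)
        for j in range(n_mc):
            # Subsample of size k drawn without replacement, forced to
            # include the anchor observation.
            rest = rng.choice(np.delete(np.arange(n), anchor),
                              size=k - 1, replace=False)
            idx = np.append(rest, anchor)
            tree = DecisionTreeRegressor(random_state=0).fit(X[idx], y[idx])
            preds[i, j] = tree.predict(x0.reshape(1, -1))[0]
    theta_hat = preds.mean()
    zeta_1 = preds.mean(axis=1).var(ddof=1)  # between-anchor variance of mean predictions
    zeta_k = preds.var(ddof=1)               # variance of individual tree predictions
    m = n_z * n_mc                           # total number of trees in the ensemble
    var_hat = (k**2 / n) * zeta_1 + zeta_k / m
    half = norm.ppf(1 - alpha / 2) * np.sqrt(var_hat)
    return theta_hat, (theta_hat - half, theta_hat + half)

# Usage on synthetic data (illustrative only).
rng = np.random.default_rng(1)
X = rng.uniform(size=(500, 4))
y = np.sin(2 * np.pi * X[:, 0]) + 0.1 * rng.standard_normal(500)
pred, (lo, hi) = subsampled_ensemble_ci(X, y, x0=X[0], k=50)
print(f"prediction: {pred:.3f}, 95% CI: ({lo:.3f}, {hi:.3f})")
```

Note that the trees used for prediction and those used for variance estimation are the same, which is what makes the inference come at no additional computational cost beyond building the ensemble itself.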