Despite the popularity of tree-based ensembles (bagging, boosting, random forests), these methods are often seen as prediction-only tools in which the interpretability of traditional statistical models is sacrificed for predictive accuracy. We present recent work suggesting that this black-box perspective need not always hold. We consider a general resampling scheme based on subsampling and demonstrate that the resulting predictions can be viewed as U-statistics. As such, central limit theorems can be established, which in turn allow confidence intervals and hypothesis tests to be constructed. Furthermore, the proposed test statistics can be seen as a natural and consistent measure of variable importance that, unlike the popular out-of-bag (OOB) measures, is robust to correlation among covariates. We demonstrate results on eBird citizen science data and numerous other publicly available datasets. These results suggest that the alternative importance measures behave in a familiar fashion and can provide appreciable insights that are typically hidden by classic measures based on OOB error.
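To make the subsampling idea concrete, the following is a minimal sketch rather than the implementation used in this work. It averages a base learner over subsamples of size k and forms a normal-approximation interval for the prediction at a single point. Everything named here is an illustrative assumption: the functions `base_predict` and `subsample_ensemble_ci`, the parameters `n_z` and `n_mc`, the simulated data, and the use of a nearest-neighbor average as a stand-in for a tree. The variance expression is the form commonly used for incomplete U-statistics, (k^2/n) * zeta_1 + zeta_k / m, with both components estimated by simple Monte Carlo.

```python
import math
import random
import statistics


def base_predict(sample, x0, n_neighbors=3):
    """Stand-in for a tree (assumption): mean response of the points nearest x0.

    The result does not depend on the order of `sample`, so it can play the
    role of a symmetric U-statistic kernel.
    """
    nearest = sorted(sample, key=lambda p: abs(p[0] - x0))[:n_neighbors]
    return statistics.fmean([y for _, y in nearest])


def subsample_ensemble_ci(data, x0, k, n_z=50, n_mc=100, z=1.96, seed=0):
    """Subsampled-ensemble prediction at x0 with an approximate 95% interval.

    n_z  : number of randomly chosen "anchor" observations (hypothetical name)
    n_mc : number of subsamples drawn per anchor (hypothetical name)
    """
    rng = random.Random(seed)
    n = len(data)
    m = n_z * n_mc                      # total number of base learners
    all_preds, cond_means = [], []
    for _ in range(n_z):
        anchor = rng.randrange(n)
        others = [i for i in range(n) if i != anchor]
        preds = []
        for _ in range(n_mc):
            # Each subsample of size k contains the anchor observation.
            idx = [anchor] + rng.sample(others, k - 1)
            preds.append(base_predict([data[i] for i in idx], x0))
        cond_means.append(statistics.fmean(preds))
        all_preds.extend(preds)
    estimate = statistics.fmean(all_preds)
    # zeta_1: variance of the prediction averaged over subsamples sharing a point.
    zeta_1 = statistics.variance(cond_means)
    # zeta_k: variance of individual base-learner predictions.
    zeta_k = statistics.variance(all_preds)
    var = (k ** 2 / n) * zeta_1 + zeta_k / m
    half_width = z * math.sqrt(var)
    return estimate, (estimate - half_width, estimate + half_width)


# Simulated data (assumption): y = 2x + Gaussian noise, x uniform on [0, 1].
data_rng = random.Random(1)
data = []
for _ in range(500):
    x = data_rng.random()
    data.append((x, 2 * x + data_rng.gauss(0, 0.1)))

estimate, (lower, upper) = subsample_ensemble_ci(data, x0=0.5, k=30)
print(f"prediction at 0.5: {estimate:.3f}, interval: ({lower:.3f}, {upper:.3f})")
```

In this sketch the Monte Carlo estimate of zeta_1 also absorbs some noise from the finite number of subsamples per anchor, so the interval is likely somewhat conservative. A hypothesis test for a covariate could be built in the same spirit by comparing predictions from ensembles fit with and without that covariate, although that step is not shown here.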