Keywords: Variable Importance, Regularization, Random Forests
Here we propose an augmented bagging (AugBagg) procedure, which performs bagging on an augmented feature space containing additional randomly generated noise features. Counterintuitively, this simple inclusion of noise features has an implicit regularization effect, leading to improved model performance in low signal-to-noise ratio (SNR) settings. As a result, common notions of variable importance based on improvements in model accuracy can be fatally flawed, since features that carry no signal can still improve accuracy.
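The procedure described above can be illustrated with a minimal sketch, which should not be read as the authors' implementation. Several choices here are assumptions made only for illustration: the noise features are drawn as independent standard normals, one-split regression stumps stand in for whatever base learner is actually bagged, fresh noise is drawn for new points at prediction time, and all function names (`augment`, `fit_augbagg`, `predict_augbagg`) are hypothetical.

```python
import random


def augment(X, n_noise, rng):
    """Append n_noise independent N(0, 1) noise columns to every row of X."""
    return [list(row) + [rng.gauss(0.0, 1.0) for _ in range(n_noise)] for row in X]


def _mean(values):
    return sum(values) / len(values)


def fit_stump(X, y):
    """Fit a one-split regression stump by exhaustive search (assumed base learner)."""
    n, best = len(y), None
    for j in range(len(X[0])):
        # Every observed value except the largest is a candidate threshold,
        # so both sides of the split are guaranteed to be non-empty.
        for t in sorted(set(row[j] for row in X))[:-1]:
            left = [y[i] for i in range(n) if X[i][j] <= t]
            right = [y[i] for i in range(n) if X[i][j] > t]
            ml, mr = _mean(left), _mean(right)
            sse = sum((v - ml) ** 2 for v in left) + sum((v - mr) ** 2 for v in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, ml, mr)
    if best is None:  # every feature is constant in this bootstrap sample
        m = _mean(y)
        return (0, float("inf"), m, m)
    return best[1:]


def predict_stump(stump, row):
    j, t, ml, mr = stump
    return ml if row[j] <= t else mr


def fit_augbagg(X, y, n_noise=5, n_estimators=50, seed=0):
    """Bagging on the augmented feature space [X | noise]."""
    rng = random.Random(seed)
    X_aug = augment(X, n_noise, rng)
    n = len(y)
    stumps = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap resample
        stumps.append(fit_stump([X_aug[i] for i in idx], [y[i] for i in idx]))
    return stumps


def predict_augbagg(stumps, X_new, n_noise=5, seed=1):
    """Average the base learners; new points get fresh noise (an assumption)."""
    rng = random.Random(seed)
    X_aug = augment(X_new, n_noise, rng)
    return [_mean([predict_stump(s, row) for s in X_aug and stumps]) for row in X_aug]
```

In this sketch `n_noise` must match between fitting and prediction, and setting `n_noise=0` recovers ordinary bagging, which gives a natural baseline when comparing performance at different signal-to-noise ratios.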