Keywords: Regularization, Random Forests, Bagging, Model Selection, Degrees of Freedom
Random forests were proposed nearly two decades ago and remain among the most popular and successful supervised machine learning methods. Despite this well-established record of success, the mechanisms by which random forests work remain largely a mystery. Here we take a step in this direction and demonstrate that the unique randomness in random forests serves as an implicit form of regularization, analogous to the penalty term in regularized linear models such as the lasso and ridge regression, and that this regularization drives the success of random forests in low signal-to-noise ratio settings. Furthermore, we build analogues of random forests in the context of linear model forward selection procedures, which exhibit surprisingly strong performance.
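The regularization effect described above can be sketched in a small simulation (this is an illustrative assumption, not code from the abstract): reducing the number of candidate features per split (`max_features`, i.e. mtry) injects extra randomness into each tree, and at low signal-to-noise ratios a forest with small mtry often generalizes better than plain bagging (mtry equal to the full feature count).

```python
# Hypothetical sketch: mtry (max_features) as implicit regularization
# in a low signal-to-noise regression problem. Setup and parameter
# choices here are illustrative assumptions, not from the abstract.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n, p = 300, 20
X = rng.normal(size=(n, p))
# Weak linear signal buried in heavy noise (low SNR)
y = X[:, 0] + rng.normal(scale=3.0, size=n)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

scores = {}
# mtry = p reduces to bagging (no extra split randomness);
# mtry = p // 3 is a typical random-forest default for regression.
for mtry in (p, max(1, p // 3)):
    rf = RandomForestRegressor(n_estimators=200, max_features=mtry,
                               random_state=0)
    rf.fit(X_tr, y_tr)
    scores[mtry] = rf.score(X_te, y_te)

print(scores)
```

In runs like this, the smaller-mtry forest tends to score higher on held-out data when the noise level is high, consistent with the regularization interpretation; at high SNR the ordering can reverse.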