Abstract:
|
Tuning tree size in a random forest is not generally recommended, but we have found cases where the default node sizes are inadequate. Tuning node size, or pruning the trees within the forest by cross-validation (CV), is currently computationally expensive. We develop a fast pruning method based on local likelihoods and a custom information criterion (IC). The amount of pruning is controlled by adaptively adjusting the IC penalty. The method automatically selects the pruning level in roughly the time it takes to build the forest. On 13 example data sets, pruning never significantly increases RMSE relative to the default and sometimes significantly lowers it.
|