Abstract:
|
Random forest, a variant of bagging, is a decision tree-based ensemble method that uses a random subset of features, rather than all features, when each tree is constructed. In this article, we propose a new classification ensemble method named Double Random Forest (DRF), a decision tree-based modification of random forest. DRF does not bootstrap at the data level, so that each tree can use as many instances as possible. Instead, it draws a bootstrap sample at each node of the tree, so that both the best split feature and the split value are determined from a random subset of instances as well as a random subset of features. This modified bootstrapping approach is intended to encourage individual accuracy and diversity within the ensemble simultaneously, compared with random forest and other ensemble methods. To compare the performance of our method with that of other widely used ensemble methods, we tested them on 29 real data sets. DRF was significantly more accurate than the other ensemble methods on most data sets.
|
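The node-level bootstrapping described in the abstract can be illustrated with a minimal sketch. It is not the reference implementation: the function names (`choose_split`, `build_tree`, `fit_forest`, `predict_forest`), the Gini criterion, the midpoint thresholds, the default feature-subset size, and the choice to pass the original (non-bootstrapped) node instances down to the children are all assumptions made for illustration.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def choose_split(X, y, mtry, rng):
    """Select (feature, threshold) from a bootstrap sample of this node's instances."""
    n, d = len(X), len(X[0])
    boot = [rng.randrange(n) for _ in range(n)]   # node-level bootstrap, with replacement
    feats = rng.sample(range(d), min(mtry, d))    # random feature subset
    best_j, best_t, best_score = None, None, float("inf")
    for j in feats:
        values = sorted({X[i][j] for i in boot})
        for lo, hi in zip(values, values[1:]):    # midpoints between distinct values
            t = (lo + hi) / 2.0
            left = [y[i] for i in boot if X[i][j] <= t]
            right = [y[i] for i in boot if X[i][j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best_j, best_t, best_score = j, t, score
    return best_j, best_t


def build_tree(X, y, mtry, rng):
    """Grow one tree on all instances given (no data-level bootstrap)."""
    leaf = {"leaf": Counter(y).most_common(1)[0][0]}
    if len(set(y)) == 1:
        return leaf
    j, t = choose_split(X, y, mtry, rng)
    if j is None:                                 # bootstrap sample offered no split
        return leaf
    # Assumption: the split is applied to the original node instances.
    left = [i for i in range(len(X)) if X[i][j] <= t]
    right = [i for i in range(len(X)) if X[i][j] > t]
    if not left or not right:
        return leaf
    return {
        "feature": j,
        "threshold": t,
        "left": build_tree([X[i] for i in left], [y[i] for i in left], mtry, rng),
        "right": build_tree([X[i] for i in right], [y[i] for i in right], mtry, rng),
    }


def predict_tree(tree, x):
    """Follow the splits from the root down to a leaf label."""
    while "leaf" not in tree:
        tree = tree["left"] if x[tree["feature"]] <= tree["threshold"] else tree["right"]
    return tree["leaf"]


def fit_forest(X, y, n_trees=25, mtry=None, seed=0):
    """Every tree sees the full training set; randomness enters only at the nodes."""
    rng = random.Random(seed)
    mtry = mtry or max(1, int(len(X[0]) ** 0.5))
    return [build_tree(X, y, mtry, rng) for _ in range(n_trees)]


def predict_forest(forest, x):
    """Majority vote over the trees for a single instance x."""
    return Counter(predict_tree(t, x) for t in forest).most_common(1)[0][0]
```

In this sketch, calling `fit_forest(X, y)` on a list of feature rows `X` and labels `y` returns a list of trees, and `predict_forest(forest, x)` classifies one row. The only difference from a plain random forest sketch is where the bootstrap is drawn: inside `choose_split` at every node, rather than once per tree.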