Abstract:
|
We introduce a new random forest framework that can exploit an indexed low-dimensional structure to improve statistical efficiency. The innovations are three-fold. First, we introduce a new forest-based method, dimension reducing random forests, which adaptively determines the optimal linear combination split at each internal node using sufficient dimension reduction techniques. Unlike existing approaches, our method maintains computational efficiency without making restrictive assumptions about the global or local structure of the regression function. Second, by viewing random forests as adaptive kernel generators, we are able to learn a dimension-reduced kernel that has the potential to improve the rate of convergence. Third, we introduce a new forest-based localized sliced inverse regression method that captures the dimension reduction subspace as effectively as other sufficient dimension reduction approaches. To illustrate the advantages of our method, we conduct extensive experiments on both synthetic and real datasets.
|