Activity Number: 529 - SPEED: Machine Learning
Type: Contributed
Date/Time: Wednesday, August 2, 2017 : 10:30 AM to 11:15 AM
Sponsor: Section on Statistical Learning and Data Science
Abstract #325205
Title: A Robust Residual-Based Approach for Random Forest Regression
Author(s): Andrew Sage* and Ulrike Genschel and Dan Nettleton
Companies: Iowa State University and Iowa State University and Iowa State University
Keywords: random forest; robustness; contamination
Abstract:
We propose a method for improving the robustness of random forest predictions when contamination of the training data is suspected. While the localized nature of random forest methodology provides a level of robustness, previous work has shown that modifications to splitting criteria and aggregation schemes can enhance robustness. Our residual-based method, motivated by Cleveland's (1979) approach for locally weighted polynomial regression, iteratively recalculates random forest prediction weights for training cases in proportion to the residual resulting from each case's out-of-bag predicted value. Simulations show that this approach outperforms other robustness techniques in situations where the signal-to-noise ratio is low and contamination is prevalent.
Authors who are presenting talks have a * after their name.
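
The iterative reweighting described in the abstract can be illustrated with a minimal sketch. The abstract does not give the weight function or the update rule, so everything specific here is an assumption: the sketch borrows the bisquare robustness weights from Cleveland's (1979) locally weighted regression (residuals scaled by six times the median absolute residual), and it replaces a fitted forest with a toy matrix of uniform out-of-bag weights. The names `bisquare`, `oob_predict`, and `robust_reweight` are hypothetical and are not taken from the authors' work.

```python
from statistics import median


def bisquare(u):
    """Bisquare weight: (1 - u^2)^2 for |u| < 1, otherwise 0."""
    return (1.0 - u * u) ** 2 if abs(u) < 1.0 else 0.0


def oob_predict(weights, y):
    """Predict each case as a weighted average of the training responses."""
    return [sum(w * yj for w, yj in zip(row, y)) for row in weights]


def robust_reweight(base_weights, y, n_iter=3, c=6.0):
    """Iteratively downweight training cases with large out-of-bag residuals.

    base_weights[i][j] is an assumed out-of-bag forest weight of training
    case j when predicting case i (zero diagonal, each row sums to 1).
    """
    weights = [row[:] for row in base_weights]
    for _ in range(n_iter):
        preds = oob_predict(weights, y)
        resid = [yi - pi for yi, pi in zip(y, preds)]
        scale = median(abs(r) for r in resid)
        if scale == 0:
            break  # all residuals negligible; nothing left to downweight
        # Robustness weight per training case (assumed Cleveland-style rule).
        delta = [bisquare(r / (c * scale)) for r in resid]
        new_weights = []
        for row in base_weights:
            adjusted = [w * d for w, d in zip(row, delta)]
            total = sum(adjusted)
            # Renormalize so each row is again a weighted average.
            new_weights.append(
                [a / total for a in adjusted] if total > 0 else row[:]
            )
        weights = new_weights
    return weights, oob_predict(weights, y)


# Toy data: four clean responses near 1 and one contaminated response.
y = [1.0, 1.1, 0.9, 1.0, 10.0]
n = len(y)
# Stand-in for forest weights: every other case weighted equally.
base = [[0.0 if i == j else 1.0 / (n - 1) for j in range(n)] for i in range(n)]
base_preds = oob_predict(base, y)
weights, preds = robust_reweight(base, y)
```

In this toy example the contaminated fifth case pulls the unadjusted prediction for the first case well above the clean responses, and the reweighting shrinks that case's influence so the adjusted prediction falls back toward them. With a real forest, the base weights would instead come from how often each out-of-bag training case shares a terminal node with the case being predicted.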