Online Program

Thursday, May 17
Machine Learning Applications
Thu, May 17, 6:15 PM - 7:15 PM
Regency Ballroom B

Random Forest Prediction Intervals (304706)

Dan Nettleton, Iowa State University 
Dan Nordman, Iowa State University 
*Haozhe Zhang, Iowa State University 

Keywords: random forest, prediction interval, out-of-bag, conformal prediction

Random forest methodology (Breiman, 2001) is one of the most popular machine learning techniques for prediction problems. An important but often overlooked challenge is the determination of prediction intervals that contain unobserved response values with specified probabilities. In this article, we propose a method for constructing prediction intervals from a single random forest and its byproducts. Under certain regularity conditions, we prove that the proposed intervals have asymptotically correct coverage rates. The finite-sample properties of the proposed intervals are compared via simulation with two existing approaches: quantile regression forests (Meinshausen, 2006) and split conformal intervals (Lei et al., 2017). The effects of tuning parameters on prediction interval performance are also explored in the simulation study. In addition, we analyze 67 datasets from the UCI machine learning repository and Chipman et al. (2010). The numerical results demonstrate that intervals constructed with our proposed method are narrower than split conformal intervals, while both our intervals and split conformal intervals have more accurate and more robust marginal coverage rates than quantile regression forest intervals.
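
The abstract does not spell out the exact construction, so the following is only a minimal sketch of the general idea of building intervals from a single forest's out-of-bag (OOB) byproducts: empirical quantiles of the OOB prediction errors are attached to the forest's point predictions. It uses scikit-learn's `RandomForestRegressor` and a synthetic dataset for illustration; the paper's actual weighting and theoretical guarantees are not reproduced here.

```python
# Illustrative sketch (not the authors' exact method): prediction intervals
# from a single random forest via quantiles of out-of-bag errors.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=0)
X_train, y_train, X_test = X[:400], y[:400], X[400:]

# oob_score=True makes the fitted forest expose oob_prediction_, the
# prediction for each training case using only trees that did not see it.
rf = RandomForestRegressor(n_estimators=500, oob_score=True, random_state=0)
rf.fit(X_train, y_train)

# OOB errors serve as a proxy for out-of-sample prediction errors.
oob_errors = y_train - rf.oob_prediction_

alpha = 0.10  # target 90% marginal coverage
lo, hi = np.quantile(oob_errors, [alpha / 2, 1 - alpha / 2])

# Shift each new point prediction by the OOB error quantiles.
pred = rf.predict(X_test)
lower, upper = pred + lo, pred + hi
```

With enough trees, nearly every training case is out-of-bag for some trees, so the OOB errors come essentially for free from a single fitted forest, in contrast to split conformal methods, which sacrifice part of the data to a calibration set.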