
Abstract Details

Activity Number: 590
Type: Invited
Date/Time: Wednesday, August 3, 2016 : 2:00 PM to 3:50 PM
Sponsor: Government Statistics Section
Abstract #318316
Title: Supervised Neighborhoods for Distributed Nonparametric Regression
Author(s): Ameet Talwalkar*
Companies: University of California at Los Angeles
Keywords: Locally Linear Models ; Adaptive Nearest Neighbors ; Distributed Nonparametric Regression ; Supervised Neighborhoods ; Distributed Random Forests ; Apache Spark/MLlib
Abstract:

Techniques for nonparametric regression based on fitting small-scale local models at prediction time have long been studied in statistics and pattern recognition, but have received less attention in modern large-scale machine learning applications. In practice, such methods are generally applied to low-dimensional problems, and may falter with high-dimensional predictors if they use a Euclidean distance-based kernel. We propose a new method, Silo-RF, that fits local models at prediction time using supervised neighborhoods that adapt to the local shape of the regression surface. To learn such neighborhoods, we use a weight function between points derived from random forests. We prove the consistency of Silo-RF, and demonstrate through simulations and real data that our method works well in both the serial and distributed settings. In the distributed setting, Silo-RF learns the weighting function in a divide-and-conquer manner, entirely avoiding communication at training time.
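
The idea of random-forest-derived weights feeding a prediction-time local model can be illustrated with a small pure-Python sketch. This is not the authors' implementation, and nothing here comes from Spark/MLlib. All function names (`build_tree`, `fit_forest`, `forest_weights`, `local_linear_predict`), the randomized split rule, the leaf co-occurrence weighting, the tiny ridge term, and the shard simulation are assumptions made for illustration. The weight of a training point is taken to be the average, over trees, of its share of the leaf that the query point falls into. Training on disjoint shards is mimicked by letting each tree see one shard only.

```python
import math
import random

def sse(y, idx):
    """Sum of squared errors of y[idx] around its mean."""
    m = sum(y[i] for i in idx) / len(idx)
    return sum((y[i] - m) ** 2 for i in idx)

def build_tree(X, y, idx, min_leaf, rng):
    """Grow a randomized regression tree; leaves store training indices."""
    if len(idx) < 2 * min_leaf:
        return {"leaf": idx}
    best = None
    for _ in range(10):  # try a few random (feature, threshold) candidates
        j = rng.randrange(len(X[0]))
        t = X[rng.choice(idx)][j]
        left = [i for i in idx if X[i][j] <= t]
        right = [i for i in idx if X[i][j] > t]
        if len(left) < min_leaf or len(right) < min_leaf:
            continue
        score = sse(y, left) + sse(y, right)  # supervised: uses the response
        if best is None or score < best[0]:
            best = (score, j, t, left, right)
    if best is None:
        return {"leaf": idx}
    _, j, t, left, right = best
    return {"feat": j, "thr": t,
            "lo": build_tree(X, y, left, min_leaf, rng),
            "hi": build_tree(X, y, right, min_leaf, rng)}

def fit_forest(X, y, n_trees=20, min_leaf=5, n_shards=1, seed=0):
    """Assumed divide-and-conquer scheme: each tree is grown on one shard."""
    rng = random.Random(seed)
    idx = list(range(len(X)))
    rng.shuffle(idx)
    shards = [idx[s::n_shards] for s in range(n_shards)]
    return [build_tree(X, y, shards[t % n_shards], min_leaf, rng)
            for t in range(n_trees)]

def forest_weights(forest, x, n):
    """Weight of point i: average over trees of 1/|leaf| if i shares x's leaf."""
    w = [0.0] * n
    for tree in forest:
        while "leaf" not in tree:
            tree = tree["lo"] if x[tree["feat"]] <= tree["thr"] else tree["hi"]
        for i in tree["leaf"]:
            w[i] += 1.0 / (len(tree["leaf"]) * len(forest))
    return w

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[k]] for k, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    z = [0.0] * n
    for r in reversed(range(n)):
        z[r] = (M[r][n] - sum(M[r][k] * z[k] for k in range(r + 1, n))) / M[r][r]
    return z

def local_linear_predict(X, y, w, x, ridge=1e-10):
    """Weighted least squares on features [1, xi - x]; the intercept is the fit at x."""
    d = len(x) + 1
    G = [[ridge if a == b else 0.0 for b in range(d)] for a in range(d)]
    g = [0.0] * d
    for xi, yi, wi in zip(X, y, w):
        if wi == 0.0:
            continue
        f = [1.0] + [a - b for a, b in zip(xi, x)]
        for a in range(d):
            g[a] += wi * f[a] * yi
            for b in range(d):
                G[a][b] += wi * f[a] * f[b]
    return solve(G, g)[0]

# Toy data: the response depends on the first predictor only.
rng = random.Random(1)
X = [[rng.random(), rng.random()] for _ in range(300)]
y = [math.sin(4 * a) + 0.05 * rng.gauss(0, 1) for a, _ in X]
forest = fit_forest(X, y, n_trees=20, min_leaf=5, n_shards=4, seed=0)
query = [0.5, 0.5]
w = forest_weights(forest, query, len(X))
print(sum(w), local_linear_predict(X, y, w, query))
```

Because the splits are chosen using the response, the resulting neighborhoods tend to be narrow along predictors that matter and wide along those that do not, which is the intuition behind a "supervised" neighborhood as opposed to a Euclidean one. In this sketch the shards exchange nothing while trees are grown; the per-point weights are only combined when a query arrives.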


Authors who are presenting talks have a * after their name.

Back to the full JSM 2016 program

 
 
Copyright © American Statistical Association