Abstract Details

Activity Number: 250 - Topics in Statistical Learning
Type: Contributed
Date/Time: Monday, July 30, 2018 : 2:00 PM to 3:50 PM
Sponsor: Section on Statistical Learning and Data Science
Abstract #328738 Presentation
Title: Iterative Quantile Nearest-Neighbors
Author(s): Karsten Maurer*
Companies: Miami University
Keywords: Statistical Learning; Binning; Scalable; KNN

Nearest-neighbor search is a classic computational challenge with applications in many fields, including predictive modeling. It is important to consider how the computational requirements of machine learning methods scale with increasing data set size, and k-nearest-neighbor (KNN) models struggle to scale with the number of training observations. Approximate nearest-neighbor algorithms improve query speed by pre-processing the training data into tree-based structures that support efficient searches. We propose iterative quantile nearest-neighbor (IQNN) models for classification and regression, which approximate k-nearest-neighborhoods by iteratively partitioning the feature space using empirical quantiles of the feature variables, with the resulting partition stored as an R-tree of intervals. IQNN models demonstrate predictive accuracy comparable to KNN methods in empirical trials on many familiar machine learning example data sets, and are shown through repeated simulation to be considerably more computationally efficient in cases with very large sample sizes.
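
The iterative quantile partitioning described above might be sketched roughly as follows. This is only an illustration under stated assumptions, not the author's implementation: the function names fit_iq_bins and predict_iq_bins are hypothetical, nested dictionaries stand in for the R-tree of intervals, features are binned in column order, and the regression prediction is taken to be the mean response of the training points that share the query's bin.

```python
import numpy as np

def fit_iq_bins(X, y, n_bins):
    """Partition rows by empirical quantiles, one feature per level.

    n_bins[d] is the number of quantile bins used for feature d.
    Hypothetical sketch: nested dicts stand in for an R-tree.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)

    def build(rows, depth, fallback):
        # Leaf: all features binned, or no training rows reached this bin.
        if rows.size == 0 or depth == len(n_bins):
            value = float(np.mean(y[rows])) if rows.size else fallback
            return {"leaf": True, "value": value}
        col = X[rows, depth]
        # Interior cut points at equally spaced empirical quantiles,
        # computed only from the rows inside the current bin.
        probs = np.linspace(0, 1, n_bins[depth] + 1)[1:-1]
        cuts = np.quantile(col, probs)
        bin_ids = np.searchsorted(cuts, col, side="right")
        here = float(np.mean(y[rows]))  # used if a child bin is empty
        children = [build(rows[bin_ids == b], depth + 1, here)
                    for b in range(n_bins[depth])]
        return {"leaf": False, "cuts": cuts, "children": children}

    return build(np.arange(len(y)), 0, float(np.mean(y)))

def predict_iq_bins(tree, x):
    """Descend to the bin containing x and return its stored mean."""
    node, depth = tree, 0
    while not node["leaf"]:
        b = int(np.searchsorted(node["cuts"], x[depth], side="right"))
        node = node["children"][b]
        depth += 1
    return node["value"]
```

In this sketch a query costs one sorted-array lookup per feature rather than a distance computation against every training point, which is the kind of saving the abstract attributes to pre-processing. The neighborhood is fixed by the bin boundaries, so it only approximates a true k-nearest-neighborhood.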

Authors who are presenting talks have a * after their name.

Back to the full JSM 2018 program