All Times EDT

Abstract Details

Activity Number: 168 - SLDS Student Paper Awards
Type: Topic Contributed
Date/Time: Tuesday, August 4, 2020 : 10:00 AM to 11:50 AM
Sponsor: Section on Statistical Learning and Data Science
Abstract #309844
Title: High-Dimensional Nonparametric Density Estimation via Max-Random Forest
Author(s): Yiliang Zhang* and Qi Long and Weijie Su
Companies: University of Pennsylvania and University of Pennsylvania and University of Pennsylvania
Keywords: nonparametric density estimation; random forest; high-dimensional statistics; conditional density estimation; missing data imputation

In this paper, we propose Max-Random Forest, a novel nonparametric density estimator inspired by random forests. It is computationally friendly and empirically performs better than other existing methods. In particular, it can estimate a density in over 100 dimensions from one hundred thousand data points with reasonable error. Under Lipschitz assumptions, we provide a non-asymptotic bound on the estimation error in squared Hellinger distance for a simplified version of the algorithm. We further modify Max-Random Forest into a conditional density estimator that can estimate and sample from a high-dimensional conditional density with decent accuracy. Finally, we apply the conditional density estimator to missing data imputation. On real data sets where the total number of dimensions is over 10 and the number of missing dimensions is 5, Max-Random Forest performs competitively with state-of-the-art methods.
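
For readers unfamiliar with the error metric named above, the following is a minimal sketch of the squared Hellinger distance between two densities. It is not the Max-Random Forest algorithm. It assumes the convention H^2(p, q) = (1/2) * integral of (sqrt(p(x)) - sqrt(q(x)))^2 dx, which takes values in [0, 1]; the abstract does not state which normalization the paper uses, and some authors omit the factor 1/2. The helper names (gaussian_pdf, squared_hellinger) and the one-dimensional Gaussian example are illustrative assumptions, not taken from the paper.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def squared_hellinger(p, q, lo, hi, n=20000):
    """Midpoint-rule approximation of 0.5 * integral over [lo, hi]
    of (sqrt(p(x)) - sqrt(q(x)))^2 dx.

    Assumed convention: the factor 0.5 keeps the value in [0, 1].
    """
    dx = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * dx  # midpoint of the i-th subinterval
        total += (math.sqrt(p(x)) - math.sqrt(q(x))) ** 2
    return 0.5 * total * dx


# Illustrative example: N(0, 1) versus N(1, 1).
p = lambda x: gaussian_pdf(x, 0.0, 1.0)
q = lambda x: gaussian_pdf(x, 1.0, 1.0)

h2 = squared_hellinger(p, q, -12.0, 13.0)

# For two normals with equal standard deviation s, the closed form is
# 1 - exp(-(difference of means)^2 / (8 * s^2)).
closed_form = 1.0 - math.exp(-1.0 / 8.0)

print(h2, closed_form)
```

The numerical value should agree closely with the closed form for equal-variance Gaussians, which gives a quick sanity check on the convention. In high dimensions a grid-based integral like this is infeasible, so this sketch only illustrates the definition of the metric.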

Authors who are presenting talks have a * after their name.

Back to the full JSM 2020 program