All Times EDT

Abstract Details

Activity Number: 245 - SLDS CSpeed 4
Type: Contributed
Date/Time: Wednesday, August 11, 2021 : 10:00 AM to 11:50 AM
Sponsor: Section on Statistical Learning and Data Science
Abstract #319114
Title: Tree Boosting for Learning Probability Measures
Author(s): Naoki Awaya* and Li Ma
Companies: Duke University and Duke University
Keywords: Tree ensemble; Unsupervised learning; Non-parametric inference; Additive models

We propose a tree boosting method for learning high-dimensional probability distributions, inspired by the success of tree boosting in high-dimensional classification and regression. The concepts of "addition" and "residuals" on probability distributions are formulated in terms of compositions of a new notion of multivariate cumulative distribution functions (CDFs) that generalizes classical CDFs. This then gives rise to a simple forward-stagewise (FS) algorithm for fitting an additive ensemble of measures. The output of the FS algorithm allows analytic computation of the fitted probability density function and also provides an exact simulator for drawing from the fitted measure. Numerical experiments confirm that boosting can substantially improve the fit to multivariate distributions compared to the state-of-the-art single-tree learner and is computationally efficient. We illustrate, through an application to a data set from mass cytometry, how the simulator can be used to investigate various aspects of the underlying distribution.
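To make the forward-stagewise idea concrete, here is a minimal one-dimensional sketch (not the authors' algorithm, which uses tree-based multivariate CDF generalizations): each stage fits a coarse histogram-based CDF to the data after pushing it through the previously fitted CDFs, so the composition gradually transports the sample toward Uniform(0, 1); the "residual" at each stage is the transformed sample, and inverting the composition yields an exact simulator. All function names below are illustrative.

```python
import numpy as np

def fit_stage_cdf(u, n_bins=8):
    """Fit a smoothed piecewise-linear CDF on [0, 1] from a histogram of u."""
    counts, edges = np.histogram(u, bins=n_bins, range=(0.0, 1.0))
    probs = (counts + 1.0) / (counts.sum() + n_bins)  # Laplace-smoothed bin masses
    cum = np.concatenate([[0.0], np.cumsum(probs)])   # CDF values at bin edges
    cdf = lambda x: np.interp(x, edges, cum)          # forward transform
    inv = lambda q: np.interp(q, cum, edges)          # inverse transform
    return cdf, inv

def boost_cdf(x, n_stages=5, n_bins=8):
    """Forward-stagewise fit: compose per-stage CDFs, returning all stages."""
    u = x.copy()
    stages = []
    for _ in range(n_stages):
        cdf, inv = fit_stage_cdf(u, n_bins)
        u = cdf(u)  # "residualize": push the sample through the new stage
        stages.append((cdf, inv))
    return stages

def simulate(stages, n, rng):
    """Exact simulator: invert the composed transform on uniform draws."""
    u = rng.uniform(size=n)
    for _, inv in reversed(stages):
        u = inv(u)
    return u

rng = np.random.default_rng(0)
x = rng.beta(2.0, 5.0, size=2000)  # toy data supported on [0, 1]
stages = boost_cdf(x)
draws = simulate(stages, 2000, rng)
```

Because each stage's CDF is piecewise linear and strictly increasing, the fitted density is available analytically by the chain rule (a product of the stages' derivatives), mirroring the abstract's claim that the FS output admits both an analytic density and an exact simulator.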

Authors who are presenting talks have a * after their name.