Activity Number: 245 - SLDS CSpeed 4
Type: Contributed
Date/Time: Wednesday, August 11, 2021, 10:00 AM to 11:50 AM
Sponsor: Section on Statistical Learning and Data Science
Abstract #319114

Title: Tree Boosting for Learning Probability Measures
Author(s): Naoki Awaya* and Li Ma
Companies: Duke University and Duke University
Keywords: Tree ensemble; Unsupervised learning; Nonparametric inference; Additive models

Abstract:
We propose a tree boosting method for learning high-dimensional probability distributions, inspired by the success of tree boosting in high-dimensional classification and regression. The concepts of "addition" and "residuals" on probability distributions are formulated in terms of compositions under a new notion of multivariate cumulative distribution functions (CDFs) that generalizes the classical CDF. This gives rise to a simple forward-stagewise (FS) algorithm for fitting an additive ensemble of measures. The output of the FS algorithm allows analytic computation of the fitted probability density function and also provides an exact simulator for drawing from the fitted measure. Numerical experiments confirm that boosting can substantially improve the fit to multivariate distributions compared to a state-of-the-art single-tree learner, while remaining computationally efficient. We illustrate, through an application to a mass cytometry data set, how the simulator can be used to investigate various aspects of the underlying distribution.
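The abstract does not spell out the measure-level operations, but the generic forward-stagewise recipe it refers to can be sketched as below. This is a hypothetical illustration only: fit_tree_measure, compute_residual, and compose are placeholder callables standing in for the paper's tree learner and its notions of "addition" and "residuals" on probability measures, not the authors' actual implementation.

```python
def forward_stagewise_fit(X, n_stages, fit_tree_measure, compute_residual, compose):
    """Generic forward-stagewise (FS) sketch for an additive ensemble of measures.

    X                : (n_samples, n_features) array of observations.
    fit_tree_measure : callable mapping data -> a single tree-based measure (weak learner).
    compute_residual : callable mapping (data, current_fit) -> residualized data.
    compose          : callable combining the current fit with a new tree measure
                       (the measure-level analogue of "addition").
    All three callables are hypothetical placeholders, not the authors' API.
    """
    current_fit = None
    ensemble = []
    residual = X
    for _ in range(n_stages):
        # Fit a weak tree-based learner to the current residuals.
        tree_measure = fit_tree_measure(residual)
        ensemble.append(tree_measure)
        # "Add" the new component to the ensemble fit.
        current_fit = tree_measure if current_fit is None else compose(current_fit, tree_measure)
        # Recompute residuals of the data with respect to the updated fit.
        residual = compute_residual(X, current_fit)
    return current_fit, ensemble
```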
Authors who are presenting talks have a * after their name.