Abstract:
|
We show how Bayesian additive regression trees (BART) can be used for multi-class classification under a multinomial logit response model by combining multiple ordinary BART fits with a data-augmentation scheme borrowed from logistic linear modeling. We illustrate how the implementation can exploit symmetric multiprocessor parallelization both in the BART subroutines and in the sampling of latent variables via data augmentation. The result is a thrifty non-linear, non-parametric classification model in which inference proceeds entirely by Gibbs sampling from standard distributions. We show that the method compares favorably with typical benchmark competitors, such as linear models with expanded bases (also fit by Gibbs sampling) and Gaussian processes (fit via rejection/Metropolis schemes), yet at lower computational cost. The new techniques are illustrated on several synthetic problems and on a real-data analysis arising from the estimation of player ability in ice hockey.
|