Abstract:
|
As an ensemble of decision trees method, Bayesian additive regression trees (BART) has the ability to detect and model non-linear relationships and complex interactions automatically. Its excellent predictive performance has been demonstrated in a wide range of data structures. However, BART predictions often have an inflated variation compared to other models, especially in noisy settings such as binary outcomes in classification problems. In this paper, we focus on binary outcomes and introduce two approaches to integrate BART with Bayesian probit linear regression, to stabilize the predictions and shrink the variance of the mean function. The first approach uses the linear model as a centering function, and applies an enhanced version of BART to the residuals which adapts to the degree of non-linearity in the function. The second approach uses Bayesian model averaging, with pseudo-marginal likelihood to determine the weights. The performance of both methods are evaluated and compared on simulated data. A real data study of safety outcomes for voluntary unrelated donors providing hematopoietic stem cells for a transplant demonstrates the proposed methods.
|