Abstract:
|
In a wide variety of contexts, both clinical and gene expression measurements are important components in constructing effective prognostic models of survival. To better utilize such high-dimensional patient covariates in the context of early-stage lung cancer, we propose a flexible ensemble survival prediction method that relies on Bayesian averaging of many tree-based accelerated failure time (AFT) models. The predictor space of each AFT model in our averaging scheme is derived from a specific gene clustering or grouping, and each such AFT model consists of Bayesian additive trees for the regression function and a mean-constrained nonparametric prior for the error distribution. Additive trees allow for substantial flexibility in modeling the relationship between patient covariates and survival times within each AFT model, and Bayesian model averaging allows for more efficient computation and exploration of the covariate space, improved survival prediction, and model assessment. We illustrate our method using a multi-site study of lung cancer survival and discuss its use in determining the relative importance of key clinical and genetic features.
|