Abstract:
|
With the recent surge in microbiome studies, there is increasing evidence that the human microbiota plays a crucial role in health and disease. However, the applicability of standard variable selection methods is limited by the challenging structure of microbiome data. Microbiome features are typically quantified as operational taxonomic units (OTUs), and phylogenetically similar OTUs may have a similar impact on the outcome. In addition, OTU abundances are compositional, since the counts within each sample sum to a constant. To address these challenges in a linear regression setting with a continuous outcome, we proposed a Bayesian variable selection model with integrated structured priors that both encourage the joint selection of similar microbiome features and handle the compositional constraint. Because binary outcomes and survival data are very common in medical research, we also extend the Bayesian linear regression framework to Bayesian logistic regression and survival models. We demonstrate that our method outperforms existing variable selection methods in both simulation studies and real data analysis.
|
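As a minimal illustration of the compositional constraint mentioned in the abstract, the sketch below shows why relative abundances are awkward for standard regression: once each sample is scaled to sum to a constant, the OTU columns add up to the intercept column, so the design matrix loses a rank. The toy counts, the array names, and the use of NumPy are assumptions made for illustration only; this is not the model proposed in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 6 samples, 4 OTUs (counts are made up for illustration).
counts = rng.integers(1, 100, size=(6, 4)).astype(float)

# Convert counts to relative abundances; each row now sums to 1.
rel_abund = counts / counts.sum(axis=1, keepdims=True)
row_sums = rel_abund.sum(axis=1)

# Design matrix with an intercept column. Because the OTU columns sum to the
# intercept column in every row, the columns are linearly dependent.
design = np.column_stack([np.ones(len(rel_abund)), rel_abund])
rank = np.linalg.matrix_rank(design)
n_columns = design.shape[1]

print("row sums:", row_sums)
print("rank:", rank, "columns:", n_columns)
```

The rank falls short of the number of columns, so ordinary least squares coefficients are not uniquely identified without an additional constraint or transformation, which is the kind of difficulty a compositional-aware prior is meant to handle.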