Abstract:
|
The Microbiome is increasingly recognized as an important aspect of the health of host species, involved in many biological pathways and processes. Taking advantage of high?throughput sequencing technologies, modern bacterial microbiome studies are metagenomic, interrogating thousands of taxa simultaneously. Several data analysis frameworks have been proposed for microbiome sequence read count data. However, there is still room for improvement. We introduce a zero?inflated beta?binomial to model the distribution of microbiome count data and to determine association with a continuous or categorical phenotype of interest. The approach can exploit the mean?variance relationship to improve power and adjust for covariates. The proposed method is a mixture model with two components: (i) a zero model accounting for excess zeros and (ii) a count model to capture the remaining component by beta?binomial regression, allowing for overdispersion effects. Simulation studies show that our proposed method effectively controls type I error and has higher power than competing methods to detect taxa associated with phenotype. An R package ZIBBSeqDiscovery is available on R CRAN.
|