Abstract:
|
Identification of taxon-taxon associations from microbial sequencing data has long been hampered by the data’s high dimensionality and by an excess of zeros. Recent methods require log-ratio transformations and impose Gaussian assumptions on the transformed data; the zeros violate these assumptions, and the methods rely on estimates of taxa covariance matrices that are often unstable and biased because of the high dimension and sparsity. We propose treating the dependence parameter of bivariate copula models with mixed zero-beta margins as a random variable drawn from a Gaussian distribution whose mean is the conserved co-variation structure. Parameter estimation and hypothesis testing are carried out via a Monte Carlo EM algorithm and a Monte Carlo likelihood ratio test, respectively. We illustrate our method using simulations and analyses of real microbiome data sets.
|