Abstract:
|
We introduce a probabilistic model, called the ``logistic-tree normal'' (LTN), for microbiome compositional data. The LTN marries two popular classes of models, the logistic-normal (LN) and the Dirichlet-tree (DT), and inherits the key benefits of both. LN models are desirable for their ability to characterize rich covariance structure among taxa, but can be computationally prohibitive when the number of taxa is large because they lack conjugacy to the multinomial sampling model. The DT addresses the computational difficulty of the LN by maintaining conjugacy, but imposes a very restrictive covariance structure among the taxa. Our new LTN model decomposes the multinomial model into binomials along the phylogenetic tree as the DT does, but in contrast jointly models the corresponding binomial probabilities using a (multivariate) LN distribution to allow flexible covariance. The likelihood decomposition makes it possible to restore conjugacy using Pólya-Gamma data augmentation, thereby tackling the computational issues of LN models without sacrificing their flexibility as the DT model does. We demonstrate the broad applicability of the LTN in covariance estimation and mixed-effects modeling.
|