Abstract:
|
Compositional data consist of vectors of proportions derived by normalizing an unconstrained basis. Correlations between basis features are of long-standing interest in fields including ecology, but the sum constraint imposed by normalization discards information and makes inference on such correlations challenging. We propose a novel Bayesian framework (BAnOCC: Bayesian Analysis of Compositional Covariance) that imposes shrinkage on the precision matrix through a LASSO prior. The resulting posterior distributions yield coherent inference on any function of the precision matrix, including the correlation matrix. We also use a first-order Taylor expansion to approximate the transformation from the basis to the composition and investigate which characteristics of the basis make the correlations more or less difficult to infer. On simulated datasets, BAnOCC infers the true network as accurately as previous methods while offering the advantage of posterior inference and maintaining good type I and type II error rates. Finally, BAnOCC reproduces established ecological results and reveals competition-based roles for Proteobacteria in a dataset from the Human Microbiome Project.
|