Abstract:
|
The emergence of microbiome multi-omics studies calls for effective statistical methods to infer the effects of microbiome measures on another omics data type (e.g., metabolomics). Reduced rank regression (RRR), which performs regression when both the response and the predictors are high dimensional, has been extensively studied and has proven useful for integrating other types of omics data. However, existing RRR methods fail to account for the compositionality of microbiome data. Specifically, microbiome sequencing cannot observe the absolute abundances (AA) of microbes; it characterizes microbiome composition only through the relative abundance (RA) of each microbe with respect to the others. To resolve this challenge, we propose a new multivariate regression method that estimates the effects of microbiome absolute abundances while requiring only RA data. Our model explicitly incorporates the unknown total microbial abundance into a reduced rank regression model with a nuclear norm penalty. An algorithm based on the alternating direction method of multipliers (ADMM) is developed to estimate the microbiome-response association matrix. Simulation studies demonstrate the advantages of our method in both estimation and prediction.
|