Abstract:
|
Advances in technology have made the processing and analysis of high-throughput data inexpensive. Non-Euclidean distance measures, such as the Bray-Curtis dissimilarity, can be used to quantify and visualize the relationships between samples in high-dimensional data, such as microbiome data. Confounders, such as batch effects, can obscure the true signal of an experiment, e.g., overall compositional differences between control and treatment groups. Here, we propose a two-step process that adjusts the principal components of microbiome data for the mean and variance effects of both continuous and categorical confounders. Using multiple real datasets, we demonstrate the method with visualization techniques such as principal coordinates analysis (PCoA) and hierarchical clustering. Furthermore, we extend our two-step process to existing methods for evaluating multi-omics data (e.g., metabolomics and metagenomics).
|