Abstract:
|
The human microbiome plays a critical role in human health and disease. Microbiome data are usually represented as compositions, which reside in a simplex that does not admit the standard Euclidean geometry. Furthermore, the microbes are related through a tree-structured taxonomy, which can be examined through a phylogenetic tree. Existing regression methods are inadequate for modeling compositional data or for accounting for the taxonomic structure. To address these issues, we develop a novel taxonomy-regularized relative-shift regression paradigm for compositional data with tree structure. We use compositions directly as predictors, without any transformation, and exploit a tree-guided regularization method to encourage feature aggregation. The new paradigm adaptively determines which organisms and which taxonomic ranks contribute to the outcome. Extensive numerical studies demonstrate the efficacy of the proposed method.
|
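The idea of feature aggregation along a taxonomic tree can be illustrated with a small sketch. It is not the proposed estimator, only the algebraic fact that makes aggregation meaningful: when all leaves under a tree node share one coefficient, the linear predictor is the same as using a single coefficient on that node's summed proportion. The taxonomy, the taxon names, the abundances, and the coefficient values below are all hypothetical.

```python
# Hypothetical toy taxonomy: two genera, each containing two species (leaves).
tree = {"genus_A": ["sp1", "sp2"], "genus_B": ["sp3", "sp4"]}

# One sample's relative abundances (made-up values).
# A composition is nonnegative and sums to 1, so it lies in the simplex.
x = {"sp1": 0.10, "sp2": 0.30, "sp3": 0.40, "sp4": 0.20}


def aggregate(composition, taxonomy):
    """Sum the leaf proportions under each parent node of the taxonomy."""
    return {
        parent: sum(composition[leaf] for leaf in leaves)
        for parent, leaves in taxonomy.items()
    }


def linear_predictor(coef, composition):
    """Inner product of coefficients and proportions (no intercept)."""
    return sum(coef[name] * composition[name] for name in composition)


# Species-level coefficients that are equal within each genus ...
beta_species = {"sp1": 2.0, "sp2": 2.0, "sp3": -1.0, "sp4": -1.0}
# ... correspond to one coefficient per genus on the aggregated composition.
beta_genus = {"genus_A": 2.0, "genus_B": -1.0}

x_genus = aggregate(x, tree)
eta_species = linear_predictor(beta_species, x)
eta_genus = linear_predictor(beta_genus, x_genus)

# The two predictors agree up to floating-point rounding.
print(abs(eta_species - eta_genus) < 1e-12)
```

A regularizer that encourages coefficients under the same node to be equal therefore selects, in effect, the taxonomic rank at which each group of organisms enters the model. The aggregated vector is itself a composition, since the genus-level proportions still sum to 1.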