Abstract:
|
A critical task in microbiome analysis is to identify microbial taxa that are associated with a response of interest. Most existing statistical methods test the association between the response and one taxon at a time, and then apply a multiple testing adjustment such as false discovery rate (FDR) control. Although straightforward to apply, these methods are often underpowered because of the unique characteristics of microbiome data, such as high dimensionality, the compositional constraint, and complex correlation structure. In this paper, we adopt the Knockoff Filter to provide finite-sample FDR control in the context of linear log-contrast models for regression analysis of compositional data. Instead of applying multiple testing corrections to many individual p-values, our framework achieves FDR control in a regression model that jointly analyzes the whole microbiome community. By imposing L1 regularization on the regression model, a subset of taxa is selected as associated with the response at a preset FDR threshold. The method is demonstrated via simulation studies and illustrated by an application to a recent study relating microbiome composition to host gene expression.
|