Abstract:
|
Statistical methods for discovering health-related microbial markers have largely focused on differential abundance testing. This approach tests for microbe-health associations in marginal models and cannot account for spurious findings that arise from ecological dependency structures. Conditional methods, which test each microbe while accounting for the others, instead lack interpretability: because the data are compositional, a microbe's relative abundance is fixed once the relative abundances of all other microbes are conditioned on. We propose a novel null hypothesis that characterizes microbe-health associations conditional on renormalized communities. This defines meaningful conditional associations with the microbiome that correspond to interventions, and eliminates spurious discoveries due to ecological interactions. We additionally design a conditional randomization procedure to test such nulls that (a) controls the false discovery rate and (b) can flexibly incorporate state-of-the-art machine learning methods to achieve good power.
|