Abstract:
|
Public health research has increasingly focused on microbiome data as a tool to understand health outcomes such as childhood allergies and digestive health. One goal of this work is to establish target areas for intervention by developing methods to identify factors that influence microbiome composition. Because microbiomes are made up of many species, a multivariate approach is essential. However, such an analysis is challenging because the data are high-dimensional and correlated, and multiple testing must be accounted for. We propose a Bayesian test to identify these factors; it leverages both the dependence between species and the spatial structure commonly found in community data. Our method combines a nonparametric model for spatial dependence with a spike-and-slab model for feature selection. We present an application to the Wild Life of Our Homes citizen science project. The data contain presence-absence indicators for over 55,000 distinct fungal taxa and 170 household covariates from 1,300 sampling locations across the United States. Using the proposed method, we consider how housing design, geography, and human behavior affect the environment's microbial balance.
|