Abstract:
|
Shotgun metagenomic analysis of the human-associated microbiome provides a rich set of microbial features for prediction and biomarker discovery in the context of human disease and health. We present MetAML, a recently developed machine learning framework for metagenomics, and its use in metagenomics-based prediction tasks and in quantitatively assessing the strength of potential microbiome-phenotype associations. We uniformly processed 2571 metagenomes from 12 large-scale studies to independently evaluate the prediction capabilities of metagenomic models and to compare strategies for the practical use of the microbiome as a prediction tool. We also present curatedMetagenomicData, a Bioconductor resource providing thousands of processed metagenomic profiles from publicly available datasets, and ExperimentHub, which distributes these data conveniently from the cloud to the R desktop. The package provides standardized metadata linked to taxonomic and metabolic functional profiles. The resulting datasets can be analyzed immediately with a wide range of statistical methods, requiring minimal bioinformatics expertise and no data preprocessing.
|