Abstract:
|
Next-generation sequencing technology has opened a new era for microbiome research by enabling direct sequencing of microbial DNA. One approach sequences the bacterial 16S rRNA gene to profile the bacterial content of the human microbiome, producing a sparse abundance table of the detected bacterial species. An important characteristic of 16S data is that the bacterial species are related by a phylogenetic tree. Using the phylogenetic tree information is critical for meaningful analysis of 16S data, since closely related species tend to exhibit similar biological characteristics. I will present a unified framework for using the phylogenetic tree in the statistical analysis of 16S data in different contexts. To incorporate the phylogeny into predictive models, I will introduce a phylogeny-constrained sparse regression model as well as a phylogeny kernel-based regression method. To improve the power of detecting differentially abundant species, I will propose a phylogeny-based false discovery rate control procedure, which adjusts the signal based on the states of neighboring species. Simulated as well as real 16S data will be used to illustrate these methods.
|