Abstract:
|
Many real applications in modern data science involve heterogeneous, mixed data (e.g., count, binary, continuous, and skewed continuous data). In this talk I will consider hierarchical Bayesian models for high-dimensional count data that incorporate variable selection. Zero-inflation, skewness, and overdispersion all cause difficulties when modeling count data. I will focus on Bayesian Dirichlet-Multinomial regression models, which use spike-and-slab priors to select significant associations between microbiome abundances and a set of covariates. If time allows, I will also describe negative binomial mixture regression models for the analysis of sequence counts and methylation data. In addition to feature selection, the models include priors that capture structural dependencies among the variables.
|