Online Program

All Times ET

Thursday, June 3
Practice and Applications
Classification and Simulation: Methods, Analyses, and Applications
Thu, Jun 3, 10:00 AM - 11:35 AM
TBD
 

An empirical Bayes approach to estimating dynamic models of co-regulated gene expression (309810)

Sumanta Basu, Cornell University 
Myung-Hee Lee, Weill Cornell Medical College 
*Sara Venkatraman, Cornell University 
Martin Wells, Cornell University 

Keywords: Bayesian regression, ordinary differential equations, dimensionality reduction, omics/genetics studies

Time-course gene expression datasets provide insight into the dynamics of complex biological processes, such as immune response and disease progression. It is of interest to identify genes with similar expression patterns over time because such genes often share similar biological characteristics. However, this task is challenging due to the high dimensionality of gene expression datasets and the nonlinearity of gene expression time dynamics. We propose a Bayesian approach to estimating ordinary differential equation (ODE) models of gene expression, from which we derive metrics that capture the similarity in the time dynamics of two genes. These metrics can be used to generate clusters and networks of closely related genes. The key feature of our method is that it leverages biological databases that document known interactions between genes; this information is automatically used to define informative prior probability distributions on the parameters of the ODE model, thus allowing the resulting similarity metrics between genes to be biologically informed. We balance the ODE model’s fit to both the data and external biological information using optimal, data-driven shrinkage parameters that we derive from Stein’s unbiased risk estimate. Using real gene expression datasets collected from fruit flies, we demonstrate that our approach encourages gene pairs with known biological associations to receive high similarity scores, and identifies under-studied genes whose time dynamics are strikingly similar to those of better-studied ones. From these similarity scores, we recover sparse gene networks with clear biological interpretations. Our method is thus able to reduce the dimensionality of gene expression datasets and reveal new insights about the dynamics of biological systems.
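
The idea of choosing a shrinkage parameter from Stein's unbiased risk estimate (SURE) can be illustrated on a much simpler problem than the ODE model described above. The sketch below is not the authors' method: it assumes a toy normal-means setting, y ~ N(theta, sigma2 * I) with known noise variance, and an estimator that pulls y toward a fixed prior mean by a factor lam. All function names (`sure_risk`, `sure_optimal_shrinkage`, `shrink`) and the example numbers are hypothetical and chosen only for illustration.

```python
def squared_distance(y, prior_mean):
    """Sum of squared differences between the data and the prior mean."""
    return sum((yi - mi) ** 2 for yi, mi in zip(y, prior_mean))


def sure_risk(y, prior_mean, sigma2, lam):
    """SURE for the estimator theta_hat = y - lam * (y - prior_mean).

    Assumes y ~ N(theta, sigma2 * I) with known sigma2 (toy assumption).
    For this linear estimator, SURE = n*sigma2 + lam^2 * S - 2*lam*n*sigma2,
    where S = ||y - prior_mean||^2.
    """
    n = len(y)
    s = squared_distance(y, prior_mean)
    return n * sigma2 + lam ** 2 * s - 2.0 * lam * n * sigma2


def sure_optimal_shrinkage(y, prior_mean, sigma2):
    """Shrinkage factor in [0, 1] that minimizes SURE.

    Setting the derivative of SURE with respect to lam to zero gives
    lam = n * sigma2 / S; the result is capped at 1 (full shrinkage).
    """
    n = len(y)
    s = squared_distance(y, prior_mean)
    if s == 0.0:
        # Data already equal the prior mean, so every lam gives the same estimate.
        return 1.0
    return min(1.0, n * sigma2 / s)


def shrink(y, prior_mean, lam):
    """Move each observation a fraction lam of the way toward the prior mean."""
    return [yi - lam * (yi - mi) for yi, mi in zip(y, prior_mean)]


# Hypothetical example: four noisy estimates and a prior mean of zero.
y = [1.0, 2.0, 3.0, 4.0]
prior_mean = [0.0, 0.0, 0.0, 0.0]
lam = sure_optimal_shrinkage(y, prior_mean, sigma2=1.0)
print(lam, shrink(y, prior_mean, lam))
```

When the data sit far from the prior mean relative to the noise level, the selected factor is small and the estimate stays close to the data; when they sit close, the estimate is pulled strongly toward the prior. This mirrors, in a simplified form, the balance between fit to data and fit to external information that the abstract describes.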