Abstract:
|
Next-generation sequencing technologies are providing whole-genome sequences and millions of measured variables that capture an individual's "genomic profile", including information on gene expression, proteins, and the interactions among these molecules. Probabilistic graphical models are often used to infer the structure of biological networks from such genome-wide data. In this talk, I will describe a combination of theoretical, algorithmic, and empirical issues that arise in such applications, focusing on directed probabilistic graphical models. First, I will present an important set of sufficient conditions for identifiability of cyclic directed models, along with their heuristic implications for constructing network inference algorithms. Second, I will present an asymptotically correct algorithm, designed for the structure of genomics data, for recovering a Bayesian network. Finally, I will discuss the deeper empirical issues that usually defy useful application of these types of graphical models to biological network inference, with a few examples of specific cases where these challenges can be overcome.
|