Keywords: graphical models, gene networks, Bayesian methods, experimental design, adaptive learning
Graphical models have previously been used to reconstruct gene networks from RNA sequencing data. However, because several graphs can often explain the data equally well, not all causal relationships can be inferred from observational data alone; perturbation experiments, such as gene knock-outs, are needed. Different perturbation experiments yield different amounts of information about the causal structure of a network, so it is advantageous to select the perturbations that most efficiently narrow down the set of possible causal graphs. In particular, we seek the sequence of experiments that yields the greatest gain of information about a gene network. To this end, we take a Bayesian approach and compute the reduction in posterior entropy that would result from a particular perturbation. We select perturbations using a novel criterion that quantifies the uncertainty in each gene’s set of downstream (descendant) genes. This ability to select experiments adaptively, combined with recent advances in the precision of gene knock-out experiments, provides a promising avenue for reconstructing gene networks by iterating between experimentation and analysis. We compare our learning algorithm to alternative perturbation selection schemes in a simulation study.
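As a rough illustration of the idea of scoring perturbations by the uncertainty in a gene's descendant set, the following minimal sketch may help. It is not the criterion developed in this paper: it assumes a small, explicitly enumerated posterior over candidate graphs, and it assumes that a knock-out reveals the perturbed gene's descendant set without noise. Under those assumptions the expected reduction in posterior entropy equals the entropy of the descendant-set distribution. All function names and the toy posterior are hypothetical.

```python
import math
from collections import defaultdict

def descendants(graph, node):
    """Nodes reachable from `node` in a DAG given as {parent: [children]}."""
    seen, stack = set(), list(graph.get(node, []))
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(graph.get(v, []))
    return frozenset(seen)

def descendant_set_entropy(posterior, node):
    """Entropy (bits) of the posterior distribution over `node`'s descendant set."""
    mass = defaultdict(float)
    for graph, prob in posterior:
        # Graphs that agree on the descendant set are indistinguishable
        # by a noise-free knock-out of `node`, so their mass is pooled.
        mass[descendants(graph, node)] += prob
    return -sum(p * math.log2(p) for p in mass.values() if p > 0)

def select_perturbation(posterior, genes):
    """Pick the gene whose descendant set is most uncertain (toy criterion)."""
    scores = {g: descendant_set_entropy(posterior, g) for g in genes}
    return max(scores, key=scores.get), scores

# Hypothetical posterior: three Markov-equivalent graphs on the chain A - B - C,
# which observational data alone cannot tell apart.
posterior = [
    ({"A": ["B"], "B": ["C"]}, 1 / 3),  # A -> B -> C
    ({"C": ["B"], "B": ["A"]}, 1 / 3),  # C -> B -> A
    ({"B": ["A", "C"]}, 1 / 3),         # A <- B -> C
]

best, scores = select_perturbation(posterior, ["A", "B", "C"])
print(best, scores)
```

In this toy example, knocking out the middle gene B separates all three candidate graphs, whereas knocking out A or C leaves two of them indistinguishable, so B receives the highest score.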