Abstract:
|
Observational zero-inflated count data arise in a wide range of areas such as economics and biology. A common research goal in these areas is to identify causal relationships by learning the structure of a sparse directed acyclic graph (DAG). While structure learning of DAGs has been an active research area, existing methods do not adequately account for excess zeros and hence are not suitable for modeling zero-inflated count data. Moreover, in many scientific settings, it is of interest to study differences between the causal networks underlying data collected from two experimental groups (control vs. treatment). To explicitly account for zero-inflation and identify differential causal networks, we propose a novel Bayesian differential zero-inflated negative binomial DAG (DAG0) model. Our main theorem proves that the causal structure of DAG0 is fully identifiable from purely observational data, using a general proof technique applicable beyond our model. We develop Bayesian inference based on parallel-tempered Markov chain Monte Carlo to efficiently explore the multi-modal posterior landscape. We demonstrate the utility of DAG0 through extensive simulations and real data analysis.
|