Abstract:

We investigate the problem of conditional dependence graph estimation when several pairs of variables are never observed jointly, so that even the simplest measure of covariability, the sample covariance, is unavailable for those pairs. This problem arises, for instance, in calcium imaging, where the activity of a large population of neurons is typically observed by recording from smaller subsets of cells at a time. In the Gaussian graphical model setting, the unavailability of parts of the covariance matrix translates into the unidentifiability of the precision matrix, which specifies the graph, unless additional assumptions are made. We call this the "graph quilting" problem. We demonstrate that, under mild conditions, it is possible to correctly identify not only the edges connecting the observed pairs of nodes, but also a superset of the edges connecting the variables that are never observed jointly. We propose an L1-regularized graph estimator based on a partially observed sample covariance matrix and derive its rates of convergence in high dimensions. Finally, we present a simulation study and an analysis of calcium imaging data from mouse visual cortex.
