Abstract:
|
Increasingly, in application areas ranging from neuroscience to the tech industry, data are collected on multiple dependent networks, each with an enormous number of nodes. Most approaches to the statistical analysis of huge network data focus on a single network and on a relatively simple statistical model and/or analysis task; for example, the entire focus may be on community detection. In this talk, we describe general divide-and-conquer MCMC algorithms for fitting broad classes of Bayesian hierarchical models to network data. The proposed algorithms are based on carefully designed subsample-based likelihood approximations, which enable MCMC to be run in an embarrassingly parallel manner on different chunks of the network, with the results then combined in a simple, communication-efficient manner. The proposed likelihood approximations are related to stratified sampling approaches in the survey sampling and epidemiology literature. The methods are applied to data on brain connectomes.
|