Abstract:
|
We introduce the PAPER (Preferential Attachment Plus Erd?s--Rényi) model for random networks, where we let a random network G be the union of a preferential attachment (PA) tree and additional Erd?s--Rényi (ER) random edges. The PA tree component captures the fact that real world networks often have an underlying growth/recruitment process where vertices and edges are added sequentially, while the ER component can be viewed as random noise. Given only a snapshot of the final network G, we study the problem of building confidence sets for the early history, in particular the root node, of the unobserved growth process; the root node can be patient zero in a disease infection network or the source of fake news in a social network. We propose inference algorithms based on Gibbs sampling that scale to networks with millions of nodes and give theoretical analysis showing that the size of the confidence set is small so long as the noise level of the ER edges is not too large. We also propose variations of the model in which multiple growth processes occur simultaneously, reflecting the growth of multiple communities; we use these models to provide a new approach to community detection.
|