Abstract:
|
Consider a very large graph---say, the link graph of a large social network. Now invent a randomized algorithm that extracts a smaller subgraph. If we use the subgraph as sample data and perform statistical analysis on this sample, what can we learn about the underlying network? Clearly, that should depend on the algorithm. I will describe how sampling algorithms can be related to distributional symmetries. That does not seem to be possible for every algorithm, but whenever it is, we can obtain surprisingly strong results on the sampler output---laws of large numbers, central limit theorems, concentration inequalities---directly from these symmetry properties.
|