Abstract:
|
Over 500K scientific articles with the word “network” in the title have been published since 1999, and the vast majority report network summary statistics of one type or another. However, these numbers are rarely accompanied by any quantification of uncertainty. Yet any error inherent in the measurements underlying the construction of the network, or in the network construction procedure itself, must propagate to the summary statistics reported. Perhaps surprisingly, there is little formal statistical methodology for this problem. I will summarize some of our recent results in this area, drawn from a handful of related projects, all of which assume a type of 'signal plus noise' model. Focusing on the particular case of estimating the density of low-order subgraphs, we show that while estimation (without stronger model assumptions) is impossible with just one network observation, it becomes possible with as few as two or three replicates. We then develop a general theory for method-of-moments estimation of subgraph densities and functions thereof, accompanied by a novel bootstrap algorithm for uncertainty quantification.
|