Online Program Home
  My Program

Abstract Details

Activity Number: 442 - Model-Based Statistical Analysis of Network Data
Type: Invited
Date/Time: Wednesday, August 2, 2017 : 8:30 AM to 10:20 AM
Sponsor: IMS
Abstract #325511
Title: Estimating the number of connected components of large graphs based on subgraph sampling
Author(s): Yihong Wu*
Companies: Yale University
Keywords:
Abstract:

Learning properties of large graphs from samples is an important problem in statistical network analysis, dating back to the early work of Goodman and Frank. We revisit the problem formulated by Frank (1978) of estimating the numbers of connected components in a graph of $N$ vertices based on the subgraph sampling model, where we observe the subgraph induced by $n$ vertices drawn uniformly at random. The key question is whether it is possible to achieve accurate estimation, i.e., vanishing normalized mean-square error, by sampling a vanishing fraction of the vertices. We show that this is impossible if the graph contains high-degree vertices or long induced cycles; otherwise, it is possible by accessing only sublinear number of samples. Optimal sample complexity bounds are obtained for several classes of graphs including forests, cliques and chordal graphs.

The methodology relies on topological identities of graph homomorphism numbers, which, in turn, also play a key role proving minimax lower bounds based on constructing random instances of graphs with matching structures of small subgraphs. If time permits, we will also discuss results for the neighborhood sampling model.


Authors who are presenting talks have a * after their name.

Back to the full JSM 2017 program

 
 
Copyright © American Statistical Association