All Times EDT

Abstract Details

Activity Number: 245 - SLDS CSpeed 4
Type: Contributed
Date/Time: Wednesday, August 11, 2021 : 10:00 AM to 11:50 AM
Sponsor: Section on Statistical Learning and Data Science
Abstract #318284
Title: A model-agnostic hypothesis test for community structure and homophily in networks
Author(s): Eric Yanchenko* and Srijan Sengupta
Companies: North Carolina State University and North Carolina State University
Keywords: bootstrap; community detection; hypothesis testing; networks; random graphs; homophily
Abstract:

Networks continue to be of great interest to statisticians, with an emphasis on community detection. Less work, however, has addressed this question: given some network, does it exhibit meaningful community structure? We propose to answer this question in a principled manner by framing it as a statistical hypothesis in terms of a formal and model-agnostic homophily metric. Homophily is a well-studied network property where intra-community edges are more likely than inter-community edges. We use the homophily metric to identify and distinguish between three concepts: nominal, collateral, and intrinsic homophily. We propose a simple and interpretable test statistic leveraging this homophily parameter and formulate both asymptotic and bootstrap-based rejection thresholds. We prove its asymptotic properties and demonstrate that it outperforms benchmark methods on both simulated and real-world data. Furthermore, the proposed method yields rich, provocative insights on classic data sets; namely, that many well-studied networks do not actually have intrinsic homophily.
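
The abstract does not define the homophily metric or the bootstrap scheme, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a stand-in metric (within-community edge density minus between-community edge density), community labels that are fixed in advance, and an Erdős–Rényi null model with the same overall edge density. The function names `edge_density_gap`, `sample_erdos_renyi`, and `bootstrap_p_value` are hypothetical.

```python
import random


def edge_density_gap(adj, labels):
    """Stand-in homophily metric (an assumption, not the paper's definition):
    within-community edge density minus between-community edge density."""
    n = len(adj)
    within_edges = within_pairs = between_edges = between_pairs = 0
    for i in range(n):
        for j in range(i + 1, n):
            if labels[i] == labels[j]:
                within_pairs += 1
                within_edges += adj[i][j]
            else:
                between_pairs += 1
                between_edges += adj[i][j]
    if within_pairs == 0 or between_pairs == 0:
        return 0.0  # the gap is undefined with a single community
    return within_edges / within_pairs - between_edges / between_pairs


def sample_erdos_renyi(n, p, rng):
    """Draw an undirected graph in which each edge appears independently
    with probability p (assumed null model: no community structure)."""
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i][j] = adj[j][i] = 1
    return adj


def bootstrap_p_value(adj, labels, n_boot=200, seed=0):
    """Parametric-bootstrap p-value: how often a null graph with the same
    estimated edge density shows a gap at least as large as the observed one."""
    rng = random.Random(seed)
    n = len(adj)
    n_edges = sum(adj[i][j] for i in range(n) for j in range(i + 1, n))
    p_hat = n_edges / (n * (n - 1) / 2)
    observed = edge_density_gap(adj, labels)
    exceed = 0
    for _ in range(n_boot):
        null_adj = sample_erdos_renyi(n, p_hat, rng)
        if edge_density_gap(null_adj, labels) >= observed:
            exceed += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (exceed + 1) / (n_boot + 1)
```

A small p-value from `bootstrap_p_value` would suggest that the observed gap is larger than expected from edge density alone. Fixing the labels in advance is a simplification chosen for brevity; a test that also searches over candidate community assignments would need to repeat that search inside each bootstrap replicate.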


Authors who are presenting talks have a * after their name.

Back to the full JSM 2021 program