Abstract:
|
Despite the increasing prominence of network samples arising in nature, the tools available for analyzing them remain limited. In this talk, we consider the problem of network representation learning for samples of networks. We first introduce a technique, Principal Component Analysis for Networks (PCAN), that identifies statistically meaningful low-dimensional representations of a network sample via subgraph count statistics. Despite its utility, the PCAN algorithm is limited by its computational cost on large networks. To address this limitation, we introduce a fast sampling-based procedure, sPCAN, which is not only significantly more efficient than its counterpart but also enjoys the same interpretability advantages. We present a large-sample analysis of both methods when the sample of networks is a collection of kernel-based random graphs. We show that 1) the embeddings identified by sPCAN and PCAN are asymptotically equivalent, and 2) the embeddings of sPCAN enjoy a central limit theorem. PCAN and sPCAN set the stage for a new line of research in interpretable network representation learning.
|