Abstract:
|
Detection of anomalous patterns in web-scale graphs, such as online social networks, is a critical but challenging problem. While existing techniques such as burst detection, topic modeling, clustering, or the non-parametric heterogeneous graph scan can be used to find patterns in online social network data, these techniques do not scale to very large graphs without approximations resulting in substantial loss of accuracy. Our solution, the web-scale subset scan (WSSS), exploits the natural core-periphery structure inherent in many web-scale graphs to detect patterns accurately and efficiently. WSSS combines a novel "subcore-tree" graph decomposition, which separates the graph into a set of dense communities and a low-treewidth component, with an efficient non-parametric subset scan, which exploits the additive linear-time subset scanning property to enable efficient search over connected subgraphs for both the core and tree components. We compare WSSS to existing methods on multiple web-scale datasets and demonstrate improvements in computational efficiency, scalability, and the accuracy of the detected subgraphs. This work was partially supported by NSF grant IIS-0953330.
|