Abstract:
|
Network data with node covariates are common in many fields. In principle, the two sources of information can be combined for community detection. However, most existing methods either lack a statistical interpretation or make strong conditional independence assumptions among the network, the node covariates, and the community memberships. In this paper, we develop a general statistical framework to describe the relationship among the link structure, node covariates, and communities. Further, we propose two families of statistical models that are the most general under this framework, making the fewest conditional independence assumptions among the three components. Mild conditions for model identifiability are established, and variational EM algorithms are developed to estimate community memberships as well as model parameters. The proposed methods are applied to both simulated and real networks, with promising results.
|