Abstract:
|
We study a latent position network model that generalizes random dot product graphs. We show that for both the normalized Laplacian and the adjacency matrix, the vector representations of nodes obtained by spectral embedding, using the largest eigenvalues by magnitude, provide strongly consistent latent position estimates with asymptotically Gaussian error, up to indefinite orthogonal transformation. The mixed membership and standard stochastic block models constitute special cases where the latent positions live respectively inside or on the vertices of a simplex, crucially, without assuming the underlying block connectivity probability matrix is positive-definite. Estimation via spectral embedding can therefore be achieved by respectively estimating this simplicial support, or fitting a Gaussian mixture model. In the latter case, the use of K-means (with Euclidean distance), as has been previously recommended, is suboptimal and for identifiability reasons unsound. Empirical improvements in link prediction and the potential to uncover latent structure are demonstrated in a cyber-security example.
|