Abstract:
Social networks are a prominent source of data for researchers in economics, epidemiology, sociology and many other disciplines. While the social benefits of analyzing these data are significant, and releasing them supports open data access and reproducibility, their release can be devastating to the privacy of individuals and organizations. In this talk, we give a brief overview of the challenges associated with protecting such data, and of the problem of releasing the summary statistics of graphs needed to build statistical models for networks while preserving the privacy of individual relations. We propose a simple yet effective randomized response mechanism to generate synthetic networks under ε-edge differential privacy. We combine ideas and methods from both statistics and computer science, utilizing likelihood-based inference for missing data and Markov chain Monte Carlo (MCMC) techniques to fit exponential-family random graph models (ERGMs) to the generated synthetic networks. We demonstrate the usefulness of the proposed techniques on real data examples.
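As a rough illustration of the randomized response idea mentioned above, the following Python sketch perturbs an undirected network by independently flipping each dyad. It assumes the simplest symmetric variant, in which every dyad is flipped with the same probability 1 / (1 + e^ε); the mechanism presented in the talk may use different, possibly dyad-specific, probabilities. The names `flip_probability` and `randomized_response` are hypothetical and chosen only for this example.

```python
import math
import random


def flip_probability(epsilon):
    """Flip probability for symmetric randomized response (an assumption).

    Keeping a dyad with probability e^eps / (1 + e^eps) and flipping it with
    probability 1 / (1 + e^eps) bounds the likelihood ratio between two
    networks that differ in one edge by e^eps.
    """
    return 1.0 / (1.0 + math.exp(epsilon))


def randomized_response(adj, epsilon, rng=None):
    """Return a synthetic undirected network from a 0/1 adjacency matrix.

    adj is a symmetric list of lists with a zero diagonal. Each dyad (i, j)
    with i < j is flipped independently with probability
    flip_probability(epsilon).
    """
    rng = rng or random.Random()
    n = len(adj)
    p = flip_probability(epsilon)
    out = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            bit = adj[i][j]
            if rng.random() < p:
                bit = 1 - bit  # flip: edge becomes non-edge, or vice versa
            out[i][j] = out[j][i] = bit  # keep the matrix symmetric
    return out


# Example: a path on four nodes, perturbed with epsilon = 1.
path = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
synthetic = randomized_response(path, epsilon=1.0, rng=random.Random(0))
```

Smaller values of ε flip more dyads and so give stronger privacy but a noisier synthetic network. Fitting a model directly to the perturbed network would be biased, which is why the abstract pairs the mechanism with likelihood-based inference that accounts for the known noise process.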