Online Program

Friday, May 18
Data Science
Statistical Analytics for Data Science
Fri, May 18, 1:30 PM - 3:00 PM
Grand Ballroom G

Privacy Analytics via Aggregate Data: Trade-off between Statistical Efficiency and Privacy (304540)

*Anand N. Vidyashankar, George Mason University 

Keywords: Privacy analytics, Statistical efficiency, Aggregation, Symbolic data

Healthcare data are often subject to regulatory and contractual restrictions, and protecting the confidentiality of the patient population is mandated by law. Aggregation is often the method of choice for de-identifying patient records, which in turn leads to histogram data. Statistical analysis of histogram data is complicated by the different methods of aggregation employed by different statistical agencies. In this presentation, we describe a statistically rigorous approach for the analysis of histogram data and bring out the trade-off between statistical efficiency and privacy. To this end, we provide multiple notions of privacy and aggregation strategies that maintain privacy for a pre-determined statistical efficiency. Extensions of this idea to other symbolic data will also be described.