Online Program

Friday, May 18
Survey Data
Fri, May 18, 3:00 PM - 3:45 PM
Regency Ballroom B
 

Secure Distributed Computational Processing for Industry Statistical Data (304658)

David Archer, Galois Inc. 
*Cavan Paul Capps, U.S. Census Bureau 
James Hinkley, U.S. Census Bureau 

Keywords: secure multi-party computing, privacy, big data collection

DARPA and the Census Bureau are developing secure distributed computing technology to process real-time big data that businesses encrypt in such a way that only the providing business has the means to decrypt and read it. The Census Bureau cannot read the data, but will be able to reliably generate various statistics about the state of the entire industry. This is all done using open source software and Intel SGX secure enclave technology. Our goal is to greatly improve industry-wide statistics and their usefulness to industry while significantly reducing respondent burden.

A Shipment Survey has been selected as the use case for the pilot, which uses simulated data from that survey. Machine learning is used to standardize product codes before the data is encrypted for transmission, enabling consistent definitions across a given industry. When received, the data is parsed from standard electronic transaction records. It is then tabulated across multiple company donors at scale in the cloud, and formally private aggregated results are released. We have also explored linking data from different data donors, which simulates the linkage needed to track packages from shippers (such as Walmart) to carriers (such as UPS) in order to provide better transportation data and information on supply chains. Simple business intelligence queries can be run on the final detailed, formally private aggregates to inform business decisions in the shipping and logistics industry as well as national economic policy.
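
As a rough illustration of the tabulate-then-release step, the sketch below sums simulated shipment values per product code across several company donors and adds Laplace noise before release. It assumes that "formally private" refers to a differential-privacy-style guarantee, which the abstract does not state; the record layout, the product codes, the clipping cap, and the function names are all hypothetical, and the enclave and encryption layers are left out entirely.

```python
import random

# Hypothetical simulated records: (company, product_code, shipment_value).
RECORDS = [
    ("company_a", "P100", 120.0),
    ("company_b", "P100", 80.0),
    ("company_a", "P200", 50.0),
    ("company_c", "P200", 70.0),
    ("company_c", "P100", 500.0),
]

# Assumed fixed, public list of standardized product codes, so the set of
# released cells does not itself depend on the confidential data.
PRODUCT_CODES = ["P100", "P200"]


def tabulate(records, codes, cap):
    """Sum shipment values per product code, clipping each record to [0, cap]."""
    totals = {code: 0.0 for code in codes}
    for _company, code, value in records:
        if code in totals:
            totals[code] += min(max(value, 0.0), cap)
    return totals


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_totals(records, codes, cap, epsilon, rng):
    """Release noisy per-code totals.

    Each record affects exactly one total by at most `cap`, so Laplace noise
    with scale cap / epsilon protects individual records. Protecting a whole
    company's contribution would need a larger sensitivity bound.
    """
    exact = tabulate(records, codes, cap)
    scale = cap / epsilon
    return {code: total + laplace_noise(scale, rng) for code, total in exact.items()}


if __name__ == "__main__":
    rng = random.Random(2018)  # seeded only to make the demo repeatable
    print(private_totals(RECORDS, PRODUCT_CODES, cap=200.0, epsilon=1.0, rng=rng))
```

Clipping each record to a cap bounds how much any single shipment can move a total, which is what makes the noise scale well defined; a smaller `epsilon` gives stronger protection at the cost of noisier aggregates.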