Keywords: secure multi-party computing, privacy, big data collection
DARPA and the Census Bureau are developing secure distributed computing technology to process real-time big data that businesses encrypt in such a way that only the providing business can decrypt and read it. Census cannot read the data, but will be able to reliably generate various statistics about the state of the entire industry. This is all done using open source software and Intel SGX secure enclave hardware. Our goal is to greatly improve industry-wide statistics and their usefulness to industry while significantly reducing respondent burden.
A Shipment Survey has been selected as the use case for the pilot, which uses simulated data from that survey. Machine learning is used to standardize product codes before the data is encrypted for transmission, enabling consistent definitions across a given industry. When received, the data is parsed from standard electronic transaction records. It is then tabulated across multiple company donors at scale in the cloud, and formally private aggregate results are released. We have also explored linking data from different data donors, simulating the linkage needed to track packages from shippers (such as Walmart) to carriers (such as UPS) in order to provide better transportation data and information on supply chains. Simple business intelligence queries can be run on the final detailed, formally private aggregates to inform business decisions in the shipping and logistics industry as well as national economic policy.
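
As a rough illustration of what releasing a formally private aggregate can involve, the sketch below computes a noisy industry-wide shipment total. It assumes that "formally private" means differential privacy and uses the Laplace mechanism; the abstract does not state which mechanism the pilot uses. The function name, the record layout (company, shipment value), and the parameters `epsilon` and `cap` are hypothetical, and the sketch ignores the enclave and encryption layers entirely.

```python
import random
from collections import defaultdict


def private_industry_total(shipments, epsilon, cap, rng=None):
    """Return a noisy total of shipment values across all companies.

    shipments: iterable of (company_id, shipment_value) pairs (hypothetical layout).
    epsilon:   privacy-loss parameter; smaller values add more noise.
    cap:       upper bound applied to each company's summed contribution.
    """
    if epsilon <= 0 or cap <= 0:
        raise ValueError("epsilon and cap must be positive")
    rng = rng or random.Random()

    # Sum each company's shipments so that clipping applies per donor.
    per_company = defaultdict(float)
    for company, value in shipments:
        per_company[company] += value

    # Clip every company's total to [0, cap]. Adding or removing one
    # company then changes the sum by at most `cap` (the sensitivity).
    clipped_sum = sum(min(max(v, 0.0), cap) for v in per_company.values())

    # Laplace(0, cap / epsilon) noise, drawn as the difference of two
    # independent exponential variates with the same mean.
    scale = cap / epsilon
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return clipped_sum + noise


if __name__ == "__main__":
    simulated = [("A", 100.0), ("A", 50.0), ("B", 500.0), ("C", 2000.0)]
    print(private_industry_total(simulated, epsilon=1.0, cap=1000.0))
```

Clipping trades bias for bounded sensitivity: a very large donor is under-counted, but no single company can shift the released total by more than `cap` before noise is added. Naive floating-point Laplace sampling of this kind is known to leak information in adversarial settings, so this is a teaching sketch rather than something to deploy.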