Online Program


Thursday, January 11
Thu, Jan 11, 9:00 AM - 10:45 AM
Crystal Ballroom E
Data Integration

Practical Secure Analyses in Vertically Partitioned Biomedical Data (304247)

*Yuji Samizo, Penn State University 
Aleksandra Slavkovic, Penn State University 

Keywords: Statistical disclosure limitation, Distributed databases, Secure multi-party computation, Lasso regression, Data sharing

Integrating databases that are distributed among different data owners can benefit biomedical research in numerous contexts, but the actual sharing of data is often impeded by concerns about data confidentiality. Such situations require tools that can produce correct results while preserving data privacy. In recent years, many "secure" protocols have been proposed to solve specific statistical problems such as linear regression and classification. However, the complexity of these protocols, the inability to assess model fit, and the lack of a platform to handle the necessary data exchange have all prevented them from being used in real-life situations. We present a practical approach for performing statistical analyses securely on data held separately by multiple parties, without actually combining the data. The main focus is on protocols for generalized linear models in the vertically partitioned database setting. Extensions to model-selection algorithms such as the Lasso will be introduced as well. Possible disclosure risks will be discussed so that users can decide whether the approach is "secure" enough for their needs. We are cu