JSM 2016 Online Program

Activity Number:	7
Type:	Invited
Date/Time:	Sunday, July 31, 2016 : 2:00 PM to 3:50 PM
Sponsor:	Health Policy Statistics Section
Abstract #318475
Title:	Model Fitting with Distributed Data
Author(s):	Balasubramanian Narasimhan*
Companies:	Stanford University
Keywords:	distributed computing ; model fitting ; likelihood
Abstract:	Bringing together the information latent in distributed medical databases promises to personalize medical care by enabling reliable, stable modeling of outcomes with rich feature sets. However, there are barriers to aggregation of medical data, due to lack of standardization of ontologies, privacy concerns, proprietary attitudes toward data, and a reluctance to give up control over end use. Aggregation of data is not always necessary for model fitting. In models based on maximizing a likelihood, the computations can be distributed, with aggregation limited to the intermediate results of calculations on local data, rather than raw data. Distributed fitting is also possible for several other iterative algorithms. We present a set of software tools that allow the rapid assembly of a collaborative computational project, based on the flexible and extensible R statistical software and other open source packages, that can work across a heterogeneous collection of database environments, with full transparency to allow local officials concerned with privacy protections to validate the safety of the method.
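
The idea that likelihood maximization needs only intermediate results from each site can be illustrated with a minimal sketch. It is not the software described in the abstract. It assumes a logistic regression with an intercept and one feature, fitted by Newton-Raphson, and the site data and the function names (local_summaries, fit_distributed) are hypothetical. Each site returns only its score vector and information matrix, and the coordinator sums them and updates the coefficients.

```python
import math

def local_summaries(beta, rows):
    """Runs at one site. Returns the local score vector and the local
    information matrix (upper triangle) for a logistic model with an
    intercept and one feature. Raw rows never leave this function."""
    b0, b1 = beta
    g0 = g1 = 0.0
    h00 = h01 = h11 = 0.0
    for x, y in rows:
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
        r = y - p            # residual contribution to the score
        w = p * (1.0 - p)    # weight contribution to the information
        g0 += r
        g1 += r * x
        h00 += w
        h01 += w * x
        h11 += w * x * x
    return (g0, g1), (h00, h01, h11)

def aggregate(beta, sites):
    """Runs at the coordinator. Sums the per-site summaries."""
    g0 = g1 = h00 = h01 = h11 = 0.0
    for rows in sites:
        (a, b), (c, d, e) = local_summaries(beta, rows)
        g0 += a
        g1 += b
        h00 += c
        h01 += d
        h11 += e
    return (g0, g1), (h00, h01, h11)

def fit_distributed(sites, iterations=25, tol=1e-10):
    """Newton-Raphson on the pooled log-likelihood, using only the
    aggregated score and information from each site."""
    beta = (0.0, 0.0)
    for _ in range(iterations):
        (g0, g1), (h00, h01, h11) = aggregate(beta, sites)
        det = h00 * h11 - h01 * h01
        # Solve the 2x2 system (information) * step = score in closed form.
        step0 = (h11 * g0 - h01 * g1) / det
        step1 = (h00 * g1 - h01 * g0) / det
        beta = (beta[0] + step0, beta[1] + step1)
        if abs(step0) + abs(step1) < tol:
            break
    return beta

# Hypothetical (feature, outcome) rows held at three separate sites.
site_a = [(0.5, 0), (1.2, 0), (1.9, 1), (2.4, 0), (3.1, 1)]
site_b = [(0.8, 0), (1.5, 1), (2.2, 0), (2.9, 1), (3.6, 1)]
site_c = [(0.2, 0), (1.0, 1), (2.0, 1), (3.0, 0), (4.0, 1)]
sites = [site_a, site_b, site_c]

beta_hat = fit_distributed(sites)
# Reference fit with all rows pooled in one place, for comparison only.
beta_pooled = fit_distributed([site_a + site_b + site_c])
```

Because the log-likelihood is a sum over observations, the summed site contributions equal the pooled score and information, so beta_hat and beta_pooled agree up to floating-point rounding. A real deployment would also need to consider what the shared summaries themselves might reveal about local data, which this sketch does not address.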

Authors who are presenting talks have a * after their name.