JSM 2016 Online Program

Activity Number:	234
Type:	Topic Contributed
Date/Time:	Monday, August 1, 2016 : 2:00 PM to 3:50 PM
Sponsor:	International Statistical Institute
Abstract #320486
Title:	Causal Inference from Big Data: Theoretical Foundations and the Data-Fusion Problem
Author(s):	Elias Bareinboim*
Companies:	Purdue University
Keywords:	Data-fusion ; Transportability ; External Validity ; Causal Inference ; Meta-analysis ; Generalizability
Abstract:	In this paper, we summarize some of the latest results in the field of causal inference that are related to big data. In particular, we address the problem of data-fusion -- piecing together multiple datasets collected under heterogeneous conditions (i.e., different populations, regimes, and sampling methods) so as to obtain valid answers to queries of interest. The availability of multiple heterogeneous datasets presents new opportunities to big data analysts, since the knowledge that can be acquired from combined data could not be obtained from any individual source alone. However, the biases that emerge in heterogeneous environments require new analytical tools. Some of these biases, including confounding, sampling selection, and cross-population biases, have been addressed in isolation, largely in restricted parametric models. Here we present a general, non-parametric framework for handling these biases and, ultimately, a theoretical solution to the problem of data-fusion in causal inference tasks.

Authors who are presenting talks have a * after their name.