Abstract:
In recent years, there has been increasing interest in analyses that combine data from multiple sources to estimate a causal, predictive, or descriptive parameter of interest. Examples include transportability analyses that estimate causal effects in a target population by combining data from a completed randomized trial with a separately obtained sample from the target population; tailoring of prediction models to a target population in which outcomes cannot be ascertained, along with related work on covariate shift and domain adaptation; and causally interpretable meta-analysis and other forms of data fusion. Using examples from ongoing applied work, we examine the delicate interplay between causal assumptions, study design, and sampling properties when learning by combining data from multiple sources. We argue that study design and sampling properties, though often ignored, critically affect the identification and estimation of the target parameters and therefore need to be considered on par with causal assumptions and estimation methods, which have attracted the most attention to date.