Abstract:
|
Data fusion is a particularly challenging scenario in data integration in which the probability of observing complete data is zero for every subject. The goal is to make inference about a model that regresses an outcome on covariates drawn from two separate sources. The outcome of interest, Y, is collected in one dataset, and a set of variables, L, is collected in another. Both datasets collect a common set of variables, V. Existing semiparametric methods for missing data have been extended to the data fusion setting, but not for time-to-event outcomes. We propose a method for data fusion with a time-to-event outcome. We posit a proportional hazards model and transform the observed datasets so that an equivalent Poisson model applies, from which we derive the appropriate semiparametric estimating equations for data fusion. This class of estimating equations includes a doubly robust (DR) equation, which yields consistent parameter estimates if either the data source process or the distribution of the unobserved covariates is correctly specified. We evaluate the performance of the proposed method, and the DR property in particular, through a simulation study.
|