
Abstract Details

Activity Number: 285 - Probabilistic Record Linkage and Inference with Merged Data
Type: Topic Contributed
Date/Time: Tuesday, July 30, 2019 : 8:30 AM to 10:20 AM
Sponsor: Section on Statistics in Epidemiology
Abstract #301709
Title: Semiparametric Inference for Merged Data from Multiple Overlapping Sources
Author(s): Takumi Saegusa*
Companies: University of Maryland
Keywords: data integration; empirical process; semiparametric model

We study semiparametric inference for merged data from multiple overlapping data sources. In public health data integration, the studies to be combined have different target populations that overlap. For example, subjects in a disease registry may also appear as patients in other clinical studies. The setting we consider is characterized by (1) duplication of the same units in multiple samples, (2) unidentified duplication across samples, and (3) dependence due to finite population sampling. Applications include data synthesis of clinical trials, epidemiological studies, disease registries, and health surveys. Our main results extend empirical process theory to biased and dependent samples with duplication. Specifically, we develop a uniform law of large numbers and a uniform central limit theorem, and apply them to obtain general theorems on consistency, rates of convergence, and asymptotic normality for infinite-dimensional M-estimators. Our method accounts for heterogeneity and bias in multiple data sets and guarantees generalizability of scientific findings from combined data. Our results are illustrated with simulation studies and a real data example using the Cox proportional hazards model.
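
To make the setting concrete, the following minimal simulation sketch draws two samples without replacement from overlapping parts of one finite population and pools them. It illustrates only the three features listed above and the bias of naive pooling, not the estimator proposed in the talk. The function name, population size, source boundaries, sampling fractions, and outcome model are all illustrative assumptions.

```python
import random
from statistics import mean


def draw_merged_sample(pop_size=10_000, seed=0):
    """Simulate two overlapping sources and pool their samples (illustrative only)."""
    rng = random.Random(seed)

    # Finite population: units 5000..9999 have a shifted outcome (assumed model).
    outcome = [rng.gauss(0.0, 1.0) + (2.0 if i >= 5000 else 0.0)
               for i in range(pop_size)]

    # Two hypothetical sources whose target populations overlap on 5000..6999.
    source_a = range(0, 7000)        # e.g. a clinical study
    source_b = range(5000, 10_000)   # e.g. a disease registry

    # Sampling without replacement within each source makes observations
    # dependent (finite population sampling). Sampling fractions differ:
    # 10% of source A versus 50% of source B.
    sample_a = rng.sample(source_a, 700)
    sample_b = rng.sample(source_b, 2500)

    # The analyst sees only the pooled outcomes; unit identifiers are dropped,
    # so units drawn by both sources appear twice without being flagged.
    merged = [outcome[i] for i in sample_a] + [outcome[i] for i in sample_b]

    # Known only to the simulation, not to the analyst.
    n_duplicates = len(set(sample_a) & set(sample_b))
    return outcome, merged, n_duplicates


outcome, merged, n_duplicates = draw_merged_sample()
print("population mean:  ", round(mean(outcome), 3))
print("naive pooled mean:", round(mean(merged), 3))
print("hidden duplicates:", n_duplicates)
```

Under these assumptions the pooled mean overstates the population mean, because the source with the higher outcomes is sampled more heavily and some units are counted twice.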

Authors who are presenting talks have a * after their name.
