All Times EDT

Abstract Details

Activity Number: 144 - Methods for Missing and/or Misclassified Data
Type: Contributed
Date/Time: Monday, August 8, 2022 : 10:30 AM to 12:20 PM
Sponsor: Biometrics Section
Abstract #323541
Title: Missing Data Interpolation Among Cohorts with Disparate Covariate Information in the ECHO-Wide Cohort
Author(s): Amii Kress, Ekaterina Smirnova, Yongqi Zhong, Xuejuan Ning, Rasha Alsaadawi*, Jordan Kuiper, Mingyu Zhang, Lisa Jacobson, and Bryan Lau
Companies: Johns Hopkins University Bloomberg School of Public Health; Virginia Commonwealth University; Johns Hopkins Bloomberg School of Public Health; Johns Hopkins University Bloomberg School of Public Health; Virginia Commonwealth University; Johns Hopkins University Bloomberg School of Public Health; Johns Hopkins University Bloomberg School of Public Health; Johns Hopkins University Bloomberg School of Public Health; Johns Hopkins Bloomberg School of Public Health
Keywords: Missing Data; Meta-analysis; Imputation; Random Forests; Hierarchical Clustering
Abstract:

The Environmental Influences on Child Health Outcomes (ECHO)-wide cohort is a nationwide consortium study that combines 69 pediatric cohorts to study the effects of environmental exposures on child health outcomes. The cohorts differ in study design, primary outcomes of interest, and types of data collected. These differences create methodological problems when harmonizing extant data for meta-analyses. Traditional approaches to such cohort-level missing data either risk residual confounding bias or reduce sample size and precision. Existing methods assume that the confounder of interest is distributed comparably across studies, an assumption that is often violated. We developed a machine learning approach that combines random forests, used to compute distances between observations within and across cohorts, with hierarchical clustering to identify subgroups of cohorts with similar confounder distributions. We tested its performance in simulations, which indicated that the algorithm correctly identifies the cohort subgroups. We then compared results from complete-case analysis, single imputation, and multiple imputation by chained equations (MICE) within cohort subgroups to quantify the bias and change in precision under each missing data approach.
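
The abstract does not say which forest variant, distance definition, or linkage rule the authors used, so the following is only a minimal sketch of the general idea, not their implementation. It assumes totally random trees (a simple unsupervised random-forest variant that needs only the standard library), a distance of one minus the share of trees in which two observations fall in the same leaf, and average linkage over cohort-level mean distances. All function names, parameters, and the simulated cohorts A to D are hypothetical and exist only for illustration.

```python
import random
from itertools import combinations


def grow_tree(rows, idx, depth, rng, leaves):
    """Recursively split observation indices on a random feature and cut point."""
    if depth == 0 or len(idx) < 2:
        leaves.append(idx)
        return
    j = rng.randrange(len(rows[0]))           # random feature
    lo = min(rows[i][j] for i in idx)
    hi = max(rows[i][j] for i in idx)
    cut = rng.uniform(lo, hi)                 # random threshold in the node's range
    left = [i for i in idx if rows[i][j] <= cut]
    right = [i for i in idx if rows[i][j] > cut]
    if not left or not right:                 # degenerate split: stop here
        leaves.append(idx)
        return
    grow_tree(rows, left, depth - 1, rng, leaves)
    grow_tree(rows, right, depth - 1, rng, leaves)


def forest_distance(rows, n_trees=100, depth=4, seed=0):
    """Distance = 1 - fraction of trees in which two observations share a leaf."""
    rng = random.Random(seed)
    n = len(rows)
    together = [[0] * n for _ in range(n)]
    for _ in range(n_trees):
        leaves = []
        grow_tree(rows, list(range(n)), depth, rng, leaves)
        for leaf in leaves:
            for a in leaf:
                for b in leaf:
                    together[a][b] += 1
    return [[1 - together[a][b] / n_trees for b in range(n)] for a in range(n)]


def cohort_distance(dist, labels):
    """Mean observation-level distance for every pair of cohorts."""
    cohorts = sorted(set(labels))
    members = {c: [i for i, lab in enumerate(labels) if lab == c] for c in cohorts}
    d = {}
    for c1, c2 in combinations(cohorts, 2):
        pairs = [dist[a][b] for a in members[c1] for b in members[c2]]
        d[frozenset((c1, c2))] = sum(pairs) / len(pairs)
    return cohorts, d


def average_linkage(cohorts, d, n_clusters):
    """Agglomerative clustering of cohorts using average linkage."""
    clusters = [frozenset([c]) for c in cohorts]

    def between(g1, g2):
        vals = [d[frozenset((a, b))] for a in g1 for b in g2]
        return sum(vals) / len(vals)

    while len(clusters) > n_clusters:
        g1, g2 = min(combinations(clusters, 2), key=lambda p: between(*p))
        clusters = [g for g in clusters if g not in (g1, g2)] + [g1 | g2]
    return set(clusters)


# Simulated example (hypothetical data): cohorts A and B share one confounder
# distribution, cohorts C and D share another.
sim = random.Random(42)
centers = {"A": 0.0, "B": 0.0, "C": 10.0, "D": 10.0}
rows, labels = [], []
for cohort, mu in centers.items():
    for _ in range(30):
        rows.append([sim.gauss(mu, 1.0), sim.gauss(mu, 1.0)])
        labels.append(cohort)

dist = forest_distance(rows)
cohorts, d = cohort_distance(dist, labels)
subgroups = average_linkage(cohorts, d, n_clusters=2)
print(sorted(sorted(g) for g in subgroups))
```

With confounder distributions this far apart, the sketch would be expected to group A with B and C with D. In a real analysis the number of subgroups would not be known in advance, and imputation would then be carried out separately within each identified subgroup.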


Authors who are presenting talks have a * after their name.

Back to the full JSM 2022 program