Abstract:
|
The Environmental Influences on Child Health Outcomes (ECHO)-wide cohort is a nationwide consortium study that combines 69 pediatric cohorts to study the effects of environmental exposures on child health outcomes. The cohorts differ in study design, primary outcomes of interest, and types of data collected. These differences create methodological challenges when harmonizing extant data for meta-analyses. Traditional approaches to the resulting cohort-level missing data either leave potential residual confounding bias or reduce sample size and precision. Existing methods assume that the confounder of interest is distributed comparably across studies, an assumption that is often violated. We developed a machine learning approach that uses random forests to compute distances between observations within and across cohorts, followed by hierarchical clustering, to identify subgroups of cohorts with similar confounder distributions. Performance was tested in simulations, which indicated that the algorithm correctly identifies the cohort subgroups. Results were compared using complete-case analysis, single imputation, and multiple imputation by chained equations (MICE) within cohort subgroups to quantify bias and change in precision under the different missing-data approaches.
|
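The clustering step described in the abstract could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: it assumes an unsupervised random forest (real versus column-permuted data) as the source of proximities, averages observation-level distances into a cohort-by-cohort matrix, and uses average-linkage clustering. The function names `rf_distance` and `cohort_subgroups`, the parameter values, and the simulated data are all hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform
from sklearn.ensemble import RandomForestClassifier


def rf_distance(X, n_trees=200, seed=0):
    """Observation-level distance (1 - proximity) from an unsupervised forest.

    Assumption: the forest is trained to separate the real rows from a
    synthetic copy in which each column is permuted independently.
    """
    rng = np.random.default_rng(seed)
    X_synth = np.column_stack([rng.permutation(col) for col in X.T])
    forest = RandomForestClassifier(
        n_estimators=n_trees, min_samples_leaf=5, random_state=seed
    )
    y = np.r_[np.ones(len(X)), np.zeros(len(X_synth))]
    forest.fit(np.vstack([X, X_synth]), y)
    leaves = forest.apply(X)  # (n_obs, n_trees): leaf index per tree
    # Proximity: share of trees in which two observations share a leaf.
    prox = (leaves[:, None, :] == leaves[None, :, :]).mean(axis=2)
    return 1.0 - prox


def cohort_subgroups(X, cohort, n_groups, seed=0):
    """Cluster cohorts on the mean distance between their observations."""
    dist = rf_distance(X, seed=seed)
    labels = np.unique(cohort)
    k = len(labels)
    D = np.zeros((k, k))
    for i in range(k):
        for j in range(i + 1, k):
            block = dist[np.ix_(cohort == labels[i], cohort == labels[j])]
            D[i, j] = D[j, i] = block.mean()
    tree = linkage(squareform(D), method="average")
    groups = fcluster(tree, t=n_groups, criterion="maxclust")
    return {int(c): int(g) for c, g in zip(labels, groups)}


# Hypothetical simulation: six cohorts of 40 observations; cohorts 0-2 and
# cohorts 3-5 draw the confounder and a covariate from different distributions.
rng = np.random.default_rng(1)
cohort = np.repeat(np.arange(6), 40)
means = np.where(cohort < 3, 0.0, 5.0)
X = np.column_stack([rng.normal(means, 1.0), rng.normal(means, 1.0)])
groups = cohort_subgroups(X, cohort, n_groups=2)
print(groups)
```

With subgroups this well separated, cohorts 0-2 and 3-5 would be expected to land in different clusters. In practice the number of subgroups is not known in advance and would have to be chosen, for example by inspecting the dendrogram.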