Conference Program


All Times ET

Thursday, June 9
Practice and Applications
Data Science in Clinical Contexts
Thu, Jun 9, 10:30 AM - 12:00 PM
Butler
 

Investigating Racial Disparities in Assisted Reproductive Technology Utilization and Outcomes: A Case Report on a Complex Missing Data Problem (310093)

*Katharine Correia, Amherst College 
Katherine Kraschel, Yale Law School 
David Seifer, Yale School of Medicine 

Keywords: multiple imputation by chained equations, missing data, clustered data, interaction, racial disparities

Missing data is inevitable when working with large, observational healthcare data. Multiple imputation by chained equations (MICE) is a popular method for dealing with missing data, but the optimal choices when implementing MICE remain an active research area, particularly with multilevel data and/or when interaction effects are of scientific interest. In this application, we present a study investigating racial disparities in assisted reproductive technology (ART) utilization and outcomes, and whether health insurance mandates requiring coverage for ART treatment can reduce those disparities. A national database with over 100,000 ART cycles from 2016 to 2019 is analyzed; 35% of cycles are missing patient race and ethnicity. Cycles are clustered by clinic state, and the percentage of cycles missing race and ethnicity varies greatly across states, from <5% to >95%. Since an interaction with insurance mandate is of primary interest in the analytic model, this interaction should be allowed for in the imputation model. However, because insurance mandate is fixed within state, mandate and state cannot both be included as predictors in a single model. We explore different approaches to implementing MICE in this context: imputing separately by mandate, passively imputing the interaction, and including state but not mandate in the imputation model. Each of these three approaches is fit using three methods: the default models in the mice package in R, classification and regression trees, and random forests. The approaches yield the same general conclusion: the odds of live birth are significantly lower for non-Hispanic Black and non-Hispanic Asian women than for non-Hispanic white women, and there is no significant interaction between insurance mandate and patient race and ethnicity. Simulation studies to better understand the operating characteristics of these approaches in similar contexts with clustering and interaction are in progress.
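
The collinearity constraint described in the abstract (mandate is fixed within state, so the two cannot enter one model together) can be illustrated with a minimal sketch. Everything in it is hypothetical and is not taken from the study: the four state labels, their mandate flags, and the toy list of cycles are made up, and Python is used only for illustration where the authors worked with the mice package in R. The sketch shows that the mandate column can be rebuilt exactly from state indicator columns, which is why a model containing both is not identifiable.

```python
# Hypothetical toy data: mandate status is determined entirely by clinic state.
# State labels and mandate flags are invented for illustration only.
MANDATE_BY_STATE = {"A": 1, "B": 1, "C": 0, "D": 0}

# Clinic state for each (made-up) ART cycle.
cycles = ["A", "A", "B", "C", "C", "D", "B", "D"]


def state_dummies(states, levels):
    """Return one 0/1 indicator column per state level."""
    return {lvl: [1 if s == lvl else 0 for s in states] for lvl in levels}


def mandate_column(states, mandate_by_state):
    """Return the mandate value for each cycle, looked up from its state."""
    return [mandate_by_state[s] for s in states]


def mandate_from_dummies(dummies, mandate_by_state):
    """Rebuild mandate as the sum of the indicators of the mandate states."""
    n_rows = len(next(iter(dummies.values())))
    mandate_states = [lvl for lvl in dummies if mandate_by_state[lvl] == 1]
    return [sum(dummies[lvl][i] for lvl in mandate_states) for i in range(n_rows)]


levels = sorted(MANDATE_BY_STATE)
dummies = state_dummies(cycles, levels)
mandate = mandate_column(cycles, MANDATE_BY_STATE)
rebuilt = mandate_from_dummies(dummies, MANDATE_BY_STATE)

# If the two columns match, mandate is an exact linear combination of the
# state indicators, i.e. the predictors are perfectly collinear.
is_collinear = mandate == rebuilt
print(is_collinear)
```

Under these assumptions, the mandate column carries no information beyond the state indicators, which is consistent with the three workarounds the abstract lists: stratifying the imputation by mandate, imputing the interaction passively, or keeping state and dropping mandate from the imputation model.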