All Times EDT

Abstract Details

Activity Number: 215 - Contributed Poster Presentations: Section on Statistical Learning and Data Science
Type: Contributed
Date/Time: Tuesday, August 4, 2020 : 10:00 AM to 2:00 PM
Sponsor: Section on Statistical Learning and Data Science
Abstract #313814
Title: SuperMICE: Multiple Imputation by Chained SuperLearners
Author(s): Aaron Shev* and Hannah Laqueur and Rose Kagawa
Companies: University of California, Davis and University of California, Davis and University of California, Davis
Keywords: Imputation; SuperLearner; Ensemble; Stacking; MICE

Multiple Imputation by Chained Equations (MICE) provides a flexible framework in which nearly any data generation scheme can be used to sample imputed values. This flexibility also leaves MICE open to problems arising from model misspecification. We propose an extension of MICE that uses SuperLearner, an ensemble predictive algorithm that weights predictions from several models, to improve the accuracy of MICE while reducing the chance of misspecification. In a simulation study, we evaluate the performance of MICE equipped with SuperLearner against MICE with linear regression imputation and MICE with predictive mean matching. Data consisting of an outcome and two predictors are simulated under two scenarios that might result in a misspecified model. A percentage of the data is set as either missing completely at random or missing at random. We estimate the regression coefficients using the multiply imputed data sets and compare performance across the three methods at five levels of missingness and two sample sizes. Our preliminary results show that SuperLearner can improve coverage probabilities, especially in cases of high missingness and small sample sizes.
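
The two missingness mechanisms named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' simulation code: the data-generating model, the choice of x1 as the variable with missing values, the use of x2 as the variable driving missingness, the logistic form, and all function and parameter names are assumptions made for illustration only.

```python
import math
import random


def simulate_data(n, seed=0):
    """Simulate n rows with an outcome y and two predictors x1, x2.

    The linear model below is an illustrative assumption, not the
    data-generating process used in the abstract's simulation study.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x1 = rng.gauss(0.0, 1.0)
        x2 = rng.gauss(0.0, 1.0)
        y = 1.0 + 0.5 * x1 - 0.5 * x2 + rng.gauss(0.0, 1.0)
        rows.append({"y": y, "x1": x1, "x2": x2})
    return rows


def make_mcar(rows, target="x1", rate=0.3, seed=1):
    """MCAR: every row loses `target` with the same probability `rate`."""
    rng = random.Random(seed)
    out = []
    for row in rows:
        row = dict(row)  # copy so the complete data stay untouched
        if rng.random() < rate:
            row[target] = None
        out.append(row)
    return out


def make_mar(rows, target="x1", driver="x2", rate=0.3, slope=1.5, seed=1):
    """MAR: the chance of losing `target` depends only on the observed `driver`.

    A logistic model is assumed here; the realized missing fraction is
    only approximately `rate`.
    """
    rng = random.Random(seed)
    intercept = math.log(rate / (1.0 - rate))
    out = []
    for row in rows:
        row = dict(row)
        p_missing = 1.0 / (1.0 + math.exp(-(intercept + slope * row[driver])))
        if rng.random() < p_missing:
            row[target] = None
        out.append(row)
    return out


def missing_fraction(rows, target="x1"):
    """Share of rows in which `target` is missing."""
    return sum(row[target] is None for row in rows) / len(rows)


if __name__ == "__main__":
    complete = simulate_data(2000)
    for label, data in (("MCAR", make_mcar(complete)), ("MAR", make_mar(complete))):
        print(label, "fraction of x1 missing:", round(missing_fraction(data), 3))
```

Under MCAR the rows with missing x1 look like a random subsample, whereas under MAR they are concentrated at large values of x2, so an imputation model has to condition on x2 to recover the distribution of x1.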

Authors who are presenting talks have a * after their name.

Back to the full JSM 2020 program