Activity Number: 215 - Contributed Poster Presentations: Section on Statistical Learning and Data Science
Type: Contributed
Date/Time: Tuesday, August 4, 2020 : 10:00 AM to 2:00 PM
Sponsor: Section on Statistical Learning and Data Science
Abstract #313814
Title: SuperMICE: Multiple Imputation by Chained SuperLearners
Author(s): Aaron Shev* and Hannah Laqueur and Rose Kagawa
Companies: University of California, Davis and University of California, Davis and University of California, Davis
Keywords: Imputation; SuperLearner; Ensemble; Stacking; MICE
Abstract:
Multiple Imputation by Chained Equations (MICE) provides a flexible framework in which nearly any data generation scheme can be equipped for sampling imputed values. This flexibility also leaves MICE open to issues arising from model misspecification. We propose an extension of MICE that uses SuperLearner, an ensemble predictive algorithm that weights predictions from several models, to improve upon the accuracy of MICE while simultaneously reducing the chance of misspecification. In a simulation study, we evaluate the performance of MICE equipped with SuperLearner as compared to MICE with linear regression imputation and MICE with predictive mean matching. Data consisting of an outcome and two predictors are simulated under two situations that might result in a misspecified model. A percentage of the data is set as either missing completely at random or missing at random. We estimate the regression coefficients using the multiply imputed data sets and compare performance across the three methods at five levels of missingness and two sample sizes. Our preliminary results show that SuperLearner can improve coverage probabilities, especially in cases of high missingness and small sample sizes.
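
The idea of replacing a single imputation model with a weighted ensemble can be illustrated with a minimal sketch. It is not the authors' implementation: the two candidate learners (a mean-only model and a simple linear regression), the function names, the simulated data, and the way noise is added to the prediction are all assumptions made for illustration. Only one variable is incomplete here, so the chained cycle over variables collapses to a single step, and the sketch does not propagate uncertainty in the fitted models as a full multiple imputation procedure would.

```python
import math
import random
import statistics


def fit_mean(zs, xs):
    """Candidate learner 1 (hypothetical): ignore z and predict the mean of x."""
    mu = statistics.fmean(xs)
    return lambda z: mu


def fit_linear(zs, xs):
    """Candidate learner 2 (hypothetical): simple least-squares regression of x on z."""
    zbar, xbar = statistics.fmean(zs), statistics.fmean(xs)
    szz = sum((z - zbar) ** 2 for z in zs)
    szx = sum((z - zbar) * (x - xbar) for z, x in zip(zs, xs))
    slope = szx / szz if szz > 0 else 0.0
    return lambda z: xbar + slope * (z - zbar)


def cv_predictions(fit, zs, xs, k=5):
    """Out-of-fold predictions: each point is predicted by a model that never saw it."""
    n = len(zs)
    preds = [0.0] * n
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        model = fit([zs[i] for i in train], [xs[i] for i in train])
        for i in range(fold, n, k):
            preds[i] = model(zs[i])
    return preds


def stacking_weight(zs, xs):
    """Weight on the mean learner (linear gets 1 - w) minimizing cross-validated
    squared error, restricted to [0, 1]."""
    a = cv_predictions(fit_mean, zs, xs)
    b = cv_predictions(fit_linear, zs, xs)
    num = sum((x - bi) * (ai - bi) for x, ai, bi in zip(xs, a, b))
    den = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    w = num / den if den > 0 else 0.5
    return min(1.0, max(0.0, w))


def impute_once(zs, xs, rng):
    """Fill missing x values with the ensemble prediction plus Gaussian noise.
    The noise model is an assumption of this sketch."""
    obs = [i for i, x in enumerate(xs) if x is not None]
    oz, ox = [zs[i] for i in obs], [xs[i] for i in obs]
    w = stacking_weight(oz, ox)
    m_mean, m_lin = fit_mean(oz, ox), fit_linear(oz, ox)

    def predict(z):
        return w * m_mean(z) + (1 - w) * m_lin(z)

    resid_sd = math.sqrt(statistics.fmean((x - predict(z)) ** 2 for z, x in zip(oz, ox)))
    completed = [x if x is not None else predict(z) + rng.gauss(0.0, resid_sd)
                 for z, x in zip(zs, xs)]
    return completed, w


# Simulated data (hypothetical): x depends linearly on z, 30% of x missing completely at random.
rng = random.Random(0)
n = 200
zs = [rng.gauss(0.0, 1.0) for _ in range(n)]
xs = [1.0 + 2.0 * z + rng.gauss(0.0, 1.0) for z in zs]
xs = [None if rng.random() < 0.3 else x for x in xs]

# Five imputed data sets; the analysis model would be fit to each and the results pooled.
results = [impute_once(zs, xs, rng) for _ in range(5)]
completed_sets = [completed for completed, _ in results]
weight = results[0][1]
```

In an actual analysis, the model of interest would be fit to each completed data set and the estimates combined across imputations; a richer library of candidate learners would take the place of the two used here.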
Authors who are presenting talks have a * after their name.