Abstract Details

Activity Number: 181 - SPEED: Statistical Learning and Data Science Speed Session 1, Part 2
Type: Contributed
Date/Time: Monday, July 29, 2019 : 10:30 AM to 11:15 AM
Sponsor: Section on Statistical Learning and Data Science
Abstract #307527
Title: Multiple Imputation Versus Machine Learning: Predictive Models to Facilitate Analyses of Association Between Contemporaneous Medicaid/CHIP Enrollment Status and Health Measures
Author(s): Jennifer Rammon* and Yulei He and Jennifer Parker
Companies: National Center for Health Statistics/CDC and CDC and CDC/NCHS/OAE/SPB
Keywords: Multiple Imputation; Machine Learning; Medicaid/CHIP; NHANES; linked data; Prediction Models

Data from the 1999-2012 National Health and Nutrition Examination Survey (NHANES) have been linked to the Centers for Medicare & Medicaid Services (CMS) Medicaid Enrollment and Claims Files by the National Center for Health Statistics’ Data Linkage Program. Assessments of the Medicaid and CHIP programs rely on clear evaluations of the health status of children enrolled in Medicaid and CHIP. While data from 1999-2012 are informative, it is also of interest to evaluate contemporaneous data. However, the linkage process takes time, so a delay between the release of NHANES data and their linkage to the CMS Medicaid files is unavoidable. Previously, we used the 2005-2012 linked data files to examine the feasibility of using multiple imputation (MI) methods to predict the Medicaid enrollment status of children in survey years that have not been linked to the CMS Medicaid files. Results indicated that MI methods showed little improvement over survey response, which is known to underestimate the number of children enrolled in Medicaid. Here, machine learning methods are tested against the MI approach to evaluate which method performs best in terms of prediction accuracy, estimate bias, and estimate precision.
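
The three comparison criteria named above (prediction accuracy, estimate bias, and estimate precision) could be computed roughly as in the following sketch. It is an illustration only, not the authors' procedure: the function name `evaluate`, the toy data, and the use of the between-run standard deviation as a stand-in for precision are all assumptions, and the sketch ignores the survey weights and complex sample design that a real NHANES analysis would need.

```python
from statistics import mean, stdev

def evaluate(true_status, predicted_runs):
    """Summarize predicted enrollment against linked (reference) status.

    true_status    : list of 0/1 flags taken from the linked files
    predicted_runs : list of runs, each a list of 0/1 predictions
                     (e.g. one per imputation or per model fit)
    All names here are hypothetical; the analysis is unweighted.
    """
    true_rate = mean(true_status)
    # Share of children whose predicted status matches the linked status.
    accuracies = [
        mean([int(t == p) for t, p in zip(true_status, run)])
        for run in predicted_runs
    ]
    # Estimated enrollment proportion in each run.
    rates = [mean(run) for run in predicted_runs]
    return {
        "accuracy": mean(accuracies),
        # Negative bias means enrollment is underestimated.
        "bias": mean(rates) - true_rate,
        # Spread of the estimate across runs (assumed proxy for precision).
        "sd_of_estimate": stdev(rates) if len(rates) > 1 else 0.0,
    }

# Toy data, invented purely for illustration.
true_status = [1, 1, 0, 0, 1, 0, 1, 0]
predicted_runs = [
    [1, 0, 0, 0, 1, 0, 1, 0],
    [1, 1, 0, 0, 0, 0, 1, 0],
    [1, 1, 0, 1, 1, 0, 0, 0],
]
result = evaluate(true_status, predicted_runs)
```

Running the same function on the output of each candidate method would give a like-for-like comparison on these three measures, under the simplifying assumptions stated above.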

Authors who are presenting talks have a * after their name.

Back to the full JSM 2019 program