Activity Number: 311 - Does Missing Data Affect Outcomes Examined Using Nationally Representative Survey Databases? A Comparison of Traditional and Data Science Approaches
Type: Invited
Date/Time: Tuesday, August 9, 2022, 2:00 PM to 3:50 PM
Sponsor: Survey Research Methods Section
Abstract #320595
Title: Data-Driven Methods for Missing Data Imputation in Health Disparities Research
Author(s): Yuhao Zhang* and Yuxiao Huang and Yan Ma
Companies: George Washington University and George Washington University and George Washington University
Keywords: missing data; machine learning; class imbalance; health disparities; data augmentation
Abstract:
Traditional multiple imputation (MI) methods are built on parametric imputation models. These models are often not flexible enough to capture complex relationships, such as interactions and nonlinearities, in high-dimensional and large-scale data settings. Unlike parametric models, machine learning techniques (MLTs) are model-free methods and thus provide flexibility for missing data imputation. We propose novel imputation methods based on MLTs under the framework of MI. We further develop a data augmentation approach to address the issue of class imbalance in the imputation of patient race, a key indicator for health disparities research.
Authors who are presenting talks have a * after their name.