Name: 2020 Joint Statistical Meetings
Start: 2020-08-02T07:00:00+00:00
End: 2020-08-06

All Times EDT

Abstract Details

Activity Number:	25 - Modern Techniques in Handling Missing Data with Challenging Data Structures Including Big and Small Data Files
Type:	Topic Contributed
Date/Time:	Monday, August 3, 2020 : 10:00 AM to 11:50 AM
Sponsor:	Survey Research Methods Section
Abstract #312906
Title:	Missing Data Analysis Using Machine Learning
Author(s):	Chao Xu* and Sixia Chen
Companies:	University of Oklahoma Health Sciences Center and University of Oklahoma Health Sciences Center
Keywords:	Missing data; Machine learning; Deep learning; High-dimensional data; Big data; Imputation
Abstract:	Advances in data collection and storage technology produce large volumes of data for clinical and basic science research, such as electronic health and medical records with hundreds of variables or more. Machine-learning methods are commonly used for data imputation and are promising for handling complicated correlations in big data. However, the statistical properties of these methods, deep learning in particular, are not well studied. A practical guide to applying machine-learning methods in missing data analysis is urgently needed. We therefore design a comprehensive simulation study of missing data analysis to evaluate the performance of classical statistical methods, high-dimensional models, classical machine-learning methods, and deep learning. In the simulation, we consider low- and high-dimensional data and both linear and non-linear correlations among variables, and we compare the imputation bias and variance of the different methods. Our study will provide guidance for investigators wishing to use machine-learning methods for data imputation and will promote further machine-learning-based applications and theoretical work.
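
The kind of comparison the abstract describes, imputation bias and variance across methods, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' simulation design: the data-generating model (y = 1 + 2x + noise), the missingness mechanism (y is more likely to be missing when x is large), the two methods compared (mean imputation and linear regression imputation), and the helper names fit_line and compare_imputations.

```python
import math
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar = statistics.fmean(xs)
    y_bar = statistics.fmean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def compare_imputations(n=500, n_reps=200, seed=0):
    """Estimate bias and variance of the imputed mean of y for two methods."""
    rng = random.Random(seed)
    true_mean = 1.0  # E[y] = 1 + 2 * E[x] with x ~ N(0, 1) (assumed model)
    estimates = {"mean_imputation": [], "regression_imputation": []}

    for _ in range(n_reps):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        ys = [1.0 + 2.0 * x + rng.gauss(0.0, 1.0) for x in xs]
        # Assumed missingness: probability that y is missing grows with x.
        missing = [rng.random() < 1.0 / (1.0 + math.exp(-(x - 1.0))) for x in xs]

        obs_x = [x for x, m in zip(xs, missing) if not m]
        obs_y = [y for y, m in zip(ys, missing) if not m]

        # Method 1: replace each missing y with the mean of the observed y.
        obs_mean = statistics.fmean(obs_y)
        filled = [obs_mean if m else y for y, m in zip(ys, missing)]
        estimates["mean_imputation"].append(statistics.fmean(filled))

        # Method 2: replace each missing y with its linear prediction from x.
        intercept, slope = fit_line(obs_x, obs_y)
        filled = [intercept + slope * x if m else y
                  for x, y, m in zip(xs, ys, missing)]
        estimates["regression_imputation"].append(statistics.fmean(filled))

    return {
        name: {
            "bias": statistics.fmean(values) - true_mean,
            "variance": statistics.variance(values),
        }
        for name, values in estimates.items()
    }


results = compare_imputations()
for name, summary in results.items():
    print(name, summary)
```

Under these assumed settings, mean imputation should underestimate the mean of y, because the cases with large x (and therefore large y) are the ones most often missing, while regression imputation uses x to correct for this. A study like the one described would extend the same loop to more methods, higher dimensions, and non-linear relationships.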

Authors who are presenting talks have a * after their name.

American Statistical Association