Abstract:
|
Electronic health records such as the Interagency Registry for Mechanically Assisted Circulatory Support (INTERMACS), which collects records on mechanical circulatory support devices (MCSDs), clinical events, and follow-up evaluations of patients with advanced heart failure, are becoming increasingly granular. To build mortality models from such data, we first identify pre-MCSD therapy variables that are highly predictive of the outcome of interest: postoperative patient survival. Missing data complicate variable selection, chiefly by introducing bias. We apply several variable selection and clustering techniques to both numerical and categorical variables, and then compare the selected subsets of variables to investigate how missing data affect which predictors are identified.
|