Abstract Details

Activity Number: 256 - Contributed Poster Presentations: Section on Statistical Learning and Data Science
Type: Contributed
Date/Time: Monday, July 29, 2019 : 2:00 PM to 3:50 PM
Sponsor: Section on Statistical Learning and Data Science
Abstract #304936
Title: Combining Machine Learning and Statistical Modeling to Identify Risk Factors of Hospital Mortality and Directionality for Patients with Acute Respiratory Distress Syndrome (ARDS)
Author(s): Meng Zhang* and Michael Qiu and Molly Stewart and Jamie Hirsch and Negin Hajizadeh
Companies: Feinstein Institute for Medical Research and Feinstein Institute for Medical Research and Feinstein Institute for Medical Research and Feinstein Institute for Medical Research and Feinstein Institute for Medical Research
Keywords: Machine Learning; Data-Driven; Hypothesis-Driven; Statistical Modelling; Variable Importance Ranking; Electronic Health Record
Abstract:

Mortality among patients with severe ARDS is high. Identifying risk factors associated with hospital mortality, and the directionality of these associations, may help inform new basic science and clinical studies. However, such analyses have usually been done through purely hypothesis-driven variable selection from a hypothesis-constrained dataset using traditional statistical methods. Electronic health records (EHRs) contain large amounts of data that may lead to the discovery of novel risk factors for ARDS mortality, but they are unwieldy to analyze with these methods. We leveraged machine learning techniques to narrow the set of candidate variables associated with mortality through variable importance ranking. These techniques included random forests, support vector machines, gradient boosting, and Lasso and Ridge regression. Variables that ranked, on average, within the top 25 percent were included in a subsequent analysis using logistic regression. A total of 107 variables for 246 patients were extracted from the EHR. Five risk factors were found to be statistically significantly associated with hospital mortality, and the directionality of each association was determined. This data-driven methodology allows new discoveries to be drawn from the entire EHR data for further research.
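
The rank-averaging step described in the abstract could be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how ranks were combined, so a simple arithmetic mean of per-model ranks is assumed here. The function names, the variable names (var_a through var_h), and every importance score are hypothetical toy values, not study data.

```python
from statistics import mean


def rank_descending(scores):
    """Rank variables within one model: 1 = highest importance score.

    Ties are broken by the order in which variables appear in `scores`.
    """
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}


def select_top_fraction(importances_by_model, fraction=0.25):
    """Average each variable's rank across models and keep the best `fraction`.

    Assumes every model reports a score for the same set of variables.
    """
    per_model_ranks = [rank_descending(s) for s in importances_by_model.values()]
    variables = list(next(iter(importances_by_model.values())))
    average_rank = {v: mean(r[v] for r in per_model_ranks) for v in variables}
    keep = max(1, int(fraction * len(variables)))
    ordered = sorted(average_rank, key=average_rank.get)  # lowest mean rank first
    return ordered[:keep], average_rank


# Hypothetical importance scores (made-up numbers for illustration only).
toy_importances = {
    "random_forest": {"var_a": 0.30, "var_b": 0.05, "var_c": 0.22, "var_d": 0.12,
                      "var_e": 0.02, "var_f": 0.15, "var_g": 0.08, "var_h": 0.06},
    "gradient_boosting": {"var_a": 0.25, "var_b": 0.03, "var_c": 0.28, "var_d": 0.10,
                          "var_e": 0.04, "var_f": 0.18, "var_g": 0.05, "var_h": 0.07},
    # For a penalized regression, absolute coefficient size stands in for importance.
    "lasso_abs_coef": {"var_a": 0.90, "var_b": 0.00, "var_c": 0.70, "var_d": 0.10,
                       "var_e": 0.05, "var_f": 0.40, "var_g": 0.20, "var_h": 0.30},
}

selected, average_rank = select_top_fraction(toy_importances, fraction=0.25)
print(selected)  # ['var_a', 'var_c']
```

In a workflow like the one the abstract outlines, the selected variables would then be passed to a logistic regression, where the sign of each fitted coefficient indicates the direction of its association with mortality.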


Authors who are presenting talks have a * after their name.

Back to the full JSM 2019 program