Abstract Details

Activity Number: 341 - SPEED: Classification and Data Science
Type: Contributed
Date/Time: Tuesday, July 31, 2018 : 10:30 AM to 12:20 PM
Sponsor: Section on Statistical Learning and Data Science
Abstract #329580 Presentation
Title: A Machine Learning (ML) Approach to Prognostic and Predictive Covariate Identification for Subgroup Analysis and Hypotheses Generation
Author(s): David A James*
Companies: Novartis
Keywords: machine learning; random survival forests; time-to-event; cognostics

We illustrate the use of ML techniques to explore associations between patient characteristics and various time-to-event clinical endpoints among patients in large morbidity and mortality (M&M) cardiovascular studies. Specifically, we set out to (1) identify prognostic and predictive features that characterize potential subgroups of patients with higher endpoint risks; (2) discriminate differential treatment effects; and (3) generate hypotheses for further investigation. Exploratory analyses include the use of "cognostics" (Tukey 1965; Wilkinson, Anand, and Grossman 2005; Guha et al 2010) to rank-order a large number of visual displays, so that a relatively large number of candidate features and exploratory hypotheses can be explored efficiently; and various recursive partitioning methods, including survival trees (Breiman et al 1984; LeBlanc and Crowley 1992; Therneau and Atkinson 1997), random survival forests (Breiman 2001; Ishwaran et al 2008, 2014, 2016), and model-based partitioning trees (Seibold, Zeileis, and Hothorn 2014). We conclude with lessons learned and highlight the strengths and weaknesses of these ML techniques vis-a-vis traditional statistical techniques for exploratory data analysis (EDA) of M&M trial data.

Authors who are presenting talks have a * after their name.
