Abstract Details

Activity Number: 544
Type: Contributed
Date/Time: Wednesday, August 3, 2016 : 10:30 AM to 12:20 PM
Sponsor: Section on Statistical Learning and Data Science
Abstract #320951
Title: Using Inverse Probability of Censoring Weighted Bagging to Adapt Machine-Learning Techniques to Censored Data
Author(s): Ales Kotalik* and Julian Wolfson and David Vock and Gediminas Adomavicius and Sunayan Bandyopadhyay
Companies: University of Minnesota and University of Minnesota and University of Minnesota School of Public Health and University of Minnesota and University of Minnesota
Keywords: censored data ; bagging ; ipcw

There is currently great interest in developing tools that use patients' electronic health data (EHD) to predict an individual's future risk of experiencing an adverse health event. However, EHD do not guarantee that every subject is followed for the entire timeframe over which we want to make predictions, so it may be unknown whether the event occurred within that timeframe. Given the size and complexity of EHD, machine learning (ML) techniques are an appealing alternative to less flexible time-to-event regression methods, but most ML methods assume fully observed outcomes and therefore yield biased predictions when outcomes are censored. We propose a universal and easy-to-implement technique that allows any ML technique to handle censored data by averaging predictions across a set of weighted bootstrap samples. The bootstrap sampling weights are computed using inverse probability of censoring weighting (IPCW). We demonstrate this method using EHD from a large Midwestern health insurance company, in which over 50% of the observations are censored, and we employ several ML methods to predict cardiovascular risk in these data.
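
The abstract does not give implementation details, so the following is only a minimal sketch of how IPCW bagging might look, not the authors' code. It assumes a single prediction horizon `tau`, a Kaplan-Meier estimate of the censoring distribution, and the convention that events precede censoring at tied times. All names (`km_censoring`, `ipcw_weights`, `ipcw_bagging`, `fit_mean`) and the toy data are hypothetical. Subjects censored before `tau` have unknown status and receive weight zero. The remaining subjects are weighted by the inverse of their estimated probability of remaining uncensored.

```python
import random

def km_censoring(times, events):
    """Kaplan-Meier steps for G(t) = P(C > t), treating censoring (event == 0) as the outcome."""
    steps, surv = [], 1.0
    for t in sorted(set(times)):
        at_risk = sum(1 for u in times if u >= t)
        censored = sum(1 for u, e in zip(times, events) if u == t and e == 0)
        surv *= 1.0 - censored / at_risk
        steps.append((t, surv))
    return steps

def step_value(steps, t, inclusive):
    """G(t) if inclusive, otherwise the left limit G(t-)."""
    value = 1.0
    for s, g in steps:
        if s < t or (inclusive and s == t):
            value = g
    return value

def ipcw_weights(times, events, tau):
    """Binary labels (event by tau) and IPCW weights; assumed weighting scheme."""
    steps = km_censoring(times, events)
    labels, weights = [], []
    for t, e in zip(times, events):
        if e == 1 and t <= tau:      # event observed within the horizon
            labels.append(1)
            weights.append(1.0 / step_value(steps, t, inclusive=False))
        elif t > tau:                # followed past the horizon, event-free at tau
            labels.append(0)
            weights.append(1.0 / step_value(steps, tau, inclusive=True))
        else:                        # censored before tau: status unknown
            labels.append(0)
            weights.append(0.0)
    return labels, weights

def ipcw_bagging(features, times, events, tau, fit, n_bags=50, seed=0):
    """Fit `fit` on weighted bootstrap samples and average the predictions.

    `fit(rows, labels)` is a hypothetical interface: it must return a
    callable that maps one feature row to a predicted risk.
    """
    labels, weights = ipcw_weights(times, events, tau)
    rng = random.Random(seed)
    indices = list(range(len(times)))
    models = []
    for _ in range(n_bags):
        # Sample with replacement, with probability proportional to the IPCW weight.
        sample = rng.choices(indices, weights=weights, k=len(indices))
        models.append(fit([features[i] for i in sample],
                          [labels[i] for i in sample]))
    return lambda row: sum(m(row) for m in models) / len(models)

def fit_mean(rows, labels):
    """Stand-in learner that ignores features and predicts the sample event rate."""
    rate = sum(labels) / len(labels)
    return lambda row: rate

# Toy data (made up for illustration): follow-up times and event indicators.
times = [1, 2, 3, 4, 5, 6]
events = [1, 0, 1, 0, 1, 0]
features = [[float(t)] for t in times]
labels, weights = ipcw_weights(times, events, tau=4.5)
predict = ipcw_bagging(features, times, events, tau=4.5, fit=fit_mean, n_bags=200)
risk = predict([0.0])
```

Because the censoring adjustment enters only through the resampling probabilities, the base learner never sees a weight, so `fit_mean` could be replaced by any classifier that returns a risk for a feature row. In practice the censoring distribution might also be modeled conditionally on covariates; the sketch uses the simplest marginal estimate.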

Authors who are presenting talks have a * after their name.

Back to the full JSM 2016 program

Copyright © American Statistical Association