Abstract Details

Activity Number: 303 - Big Data
Type: Contributed
Date/Time: Tuesday, August 1, 2017 : 8:30 AM to 10:20 AM
Sponsor: Section on Statistical Computing
Abstract #324497
Title: A Divide-And-Conquer Approach for Survival Analysis in Big Data
Author(s): Shou-En Lu* and Jerry Q. Cheng and Minge Xie
Companies: Rutgers Department of Biostatistics and Rutgers Cardiovascular Institute of New Jersey and Rutgers University
Keywords: Divide and conquer ; Proportional hazards model ; Big data

Divide and conquer (D&C) is an efficient approach to analysis and inference for big data. Its principle is to divide a big dataset into K subsets, process each subset separately, and combine the individual solutions into a final solution for the original full data. In this paper, we propose a D&C approach for analysis using the Cox proportional hazards model. Specifically, we randomly divide the data into K subsets and propose a weighted method to combine the K partial maximum likelihood estimators (PMLEs), each obtained from an individual subset. Under some mild conditions, we show that the proposed final estimator is asymptotically equivalent to the PMLE obtained by analyzing the full data all at once. We next extend our approach to the variable selection problem and propose an estimator that combines the K maximum penalized partial likelihood estimators, each obtained from an individual subset. Statistical properties of the resulting estimators are developed. The performance of the proposed methods, including savings in computation time, is investigated using simulation studies. A data example is provided to illustrate the proposed methods.
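
The divide, fit, and combine steps described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the weights, so this sketch assumes inverse-variance weighting by each subset's observed information, which is one common choice and not necessarily the authors' method. It is further simplified to a single covariate with no tied event times, and the function names (simulate, cox_fit_1d, dc_cox_1d) are hypothetical.

```python
import math
import random


def simulate(n, beta, seed=1):
    """Toy data: exponential event times with hazard exp(beta * x), random censoring."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    t_event = [rng.expovariate(math.exp(beta * xi)) for xi in x]
    t_cens = [rng.expovariate(0.5) for _ in range(n)]
    times = [min(a, b) for a, b in zip(t_event, t_cens)]
    events = [a <= b for a, b in zip(t_event, t_cens)]
    return times, events, x


def cox_fit_1d(times, events, x, max_iter=50, tol=1e-10):
    """Newton-Raphson on the Cox partial likelihood with one covariate.

    Returns (beta_hat, observed_information). Assumes no tied event times.
    """
    # Visit subjects from latest to earliest time, so the running sums
    # always cover the risk set of the subject being visited.
    order = sorted(range(len(times)), key=lambda i: times[i], reverse=True)
    beta = 0.0
    info = 0.0
    for _ in range(max_iter):
        s0 = s1 = s2 = 0.0
        score = info = 0.0
        for i in order:
            w = math.exp(beta * x[i])
            s0 += w
            s1 += w * x[i]
            s2 += w * x[i] * x[i]
            if events[i]:
                mean = s1 / s0
                score += x[i] - mean
                info += s2 / s0 - mean * mean
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    return beta, info


def dc_cox_1d(times, events, x, K, seed=0):
    """Randomly split into K subsets, fit each, combine with information weights.

    The weighting scheme is an assumption of this sketch.
    Returns (combined_beta, approximate_variance).
    """
    idx = list(range(len(times)))
    random.Random(seed).shuffle(idx)
    num = den = 0.0
    for k in range(K):
        part = idx[k::K]  # every K-th shuffled index forms one subset
        b, info = cox_fit_1d(
            [times[i] for i in part],
            [events[i] for i in part],
            [x[i] for i in part],
        )
        num += info * b
        den += info
    return num / den, 1.0 / den
```

As a usage example, calling simulate(2000, 0.5), then cox_fit_1d on the full data and dc_cox_1d with K=4, allows the combined estimate to be compared against the full-data PMLE. In practice each subset fit could run on a separate worker, which is where the savings in computation time would come from.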

Authors who are presenting talks have a * after their name.

Back to the full JSM 2017 program
