Abstract:
|
Multivariate failure time data are frequently analyzed using the marginal proportional hazards (PH) model and the frailty model. When the sample size is extraordinarily large, either approach can become computationally challenging. In this paper, we propose a divide-and-conquer (DC) approach to analyzing multivariate failure time data under both the marginal PH model and the frailty model. Specifically, we randomly divide the full data into S subsets and propose a weighted method to combine the S estimators, each obtained from one subset. Under mild conditions, we show that the combined estimators are asymptotically equivalent to the estimators that would be obtained by analyzing the full data all at once. In addition, we propose a confidence distribution approach to perform variable selection. We study theoretical properties of the proposed methods, including consistency, the oracle property, and the asymptotic equivalence between the full data analysis and the DC approach. The performance of the proposed methods, including savings in computation time, is investigated in simulation studies. A data example is provided to illustrate the proposed methods.
|
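The divide-and-combine step described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's estimator: the abstract does not state the weights, so the sketch assumes inverse-covariance weighting, a common choice for combining subset estimators. The function names `random_split` and `combine_estimates` are hypothetical. Fitting the marginal PH or frailty model on each subset is left out; the sketch assumes each subset has already produced a coefficient estimate and an estimated covariance matrix.

```python
import numpy as np


def random_split(n, S, seed=0):
    """Randomly partition the indices 0..n-1 into S nearly equal subsets."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n), S)


def combine_estimates(betas, covs):
    """Combine S subset estimates using inverse-covariance weights.

    ASSUMPTION: the weight for subset s is the inverse of its estimated
    covariance matrix; the abstract does not specify the weighting scheme.
    """
    # One weight matrix per subset.
    weights = [np.linalg.inv(np.asarray(c, dtype=float)) for c in covs]
    total = sum(weights)
    # Weighted sum of the subset estimates.
    weighted_sum = sum(
        w @ np.asarray(b, dtype=float) for w, b in zip(weights, betas)
    )
    # Solve total @ beta_dc = weighted_sum for the combined estimate.
    beta_dc = np.linalg.solve(total, weighted_sum)
    # Covariance of the combined estimate under the assumed weighting.
    cov_dc = np.linalg.inv(total)
    return beta_dc, cov_dc
```

In use, one would call `random_split(n, S)` to obtain the index sets, fit the chosen model on each subset with any survival analysis routine, and pass the resulting estimates and covariance matrices to `combine_estimates`. Only the S small summaries need to be held in memory at the combination step, which is where the savings in computation come from.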