Abstract:
|
The case-cohort (CCH) design is an efficient way to analyze survival data from a cohort with a low event rate. The design consists of a randomly selected subcohort augmented with all incident cases, so that covariate data, such as high-throughput gene expression data, are needed only for the cases and the subcohort controls, which can dramatically reduce the cost of genomic experiments. However, the most popular permutation-based approaches for estimating the false discovery rate (FDR) cannot be applied directly to the CCH design because of the sampling. Given that the survival data are available for the full cohort, we developed a procedure that uses missing-data imputation and permutation to estimate the FDR. To examine the performance of the proposed procedure, we used a leukemia dataset as the full cohort and drew a large number of CCH samples with different sampling fractions. We applied our approach to each CCH sample and compared the results with those from the full-cohort analysis. There was excellent agreement between the set of genes identified using our approach under the CCH design and that identified using the full cohort, even when the sampling fraction was as low as 0.2.
|
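
The sampling scheme described in the abstract (a random subcohort augmented with all incident cases) can be illustrated with a minimal sketch. It assumes simple random sampling of the subcohort without stratification; the function name `draw_case_cohort`, its parameters, and the toy cohort are hypothetical and are not taken from the study.

```python
import random


def draw_case_cohort(event, sampling_fraction, seed=0):
    """Draw a case-cohort sample from a full cohort (illustrative sketch).

    event: list of 0/1 event indicators, one per subject in the full cohort.
    sampling_fraction: fraction of the full cohort drawn as the subcohort.
    Returns (sample, subcohort) as sorted lists of subject indices.
    """
    rng = random.Random(seed)
    n = len(event)
    subcohort_size = round(sampling_fraction * n)

    # Subcohort: simple random sample of the full cohort,
    # drawn without regard to case status (an assumption of this sketch).
    subcohort = set(rng.sample(range(n), subcohort_size))

    # Augment the subcohort with every incident case.
    cases = {i for i, e in enumerate(event) if e == 1}

    return sorted(subcohort | cases), sorted(subcohort)


# Hypothetical cohort: 100 subjects, 10 of whom experience the event.
event = [1 if i % 10 == 0 else 0 for i in range(100)]
sample, subcohort = draw_case_cohort(event, sampling_fraction=0.2)
```

In this sketch, covariate data would be needed only for the subjects in `sample`; the remaining subjects contribute survival data alone.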