Abstract:
|
Group variable selection is necessary when some of the potential explanatory variables are correlated. In genetics, thousands of single nucleotide polymorphism (SNP) variants must be tested for association with survival outcomes. Controlling the familywise error rate (FWER) is too stringent in this setting, so we relax the group selection criterion by controlling the false discovery rate (FDR) instead. We use generative adversarial networks (GANs) to construct a model-free knockoff filter for group selection with FDR control. Simulation studies show that the proposed model-free knockoff filter yields comparatively robust group selection. For the real data analysis, we apply the proposed knockoff filter to 1000 Genomes Project data to select SNPs for predicting the genotype association between human leukocyte antigen (HLA) haplotypes and SNPs among HLA class alleles.
|