Abstract:
|
In this talk, we develop a novel p-value-free variable selection algorithm for high-dimensional settings that maintains high power while controlling the associated False Discovery Rate (FDR). This methodological work is motivated by the need to uncover the true sparsity pattern buried in high-dimensional data, where the response of interest is quantitative and the set of potential covariates is of high to ultra-high dimension, as arises in genetic and imaging studies. A useful way to assess the performance of a variable selection method is to examine its associated FDR. Most state-of-the-art methods for controlling FDR rely on p-values, which depend on specific assumptions about the data distribution. Consequently, p-value-based approaches for controlling FDR may suffer a loss of sensitivity. In the spirit of Wasserman & Roeder (2009), we propose a ‘screening & cleaning’ strategy that assigns importance scores to the predictors and then constructs an estimate of the FDR. We study the theoretical properties of the proposed method and illustrate it with simulations and a real data analysis of brain imaging data in the context of diffusion tensor imaging.
|