Abstract:
|
We develop a novel model-free variable selection procedure for ultrahigh dimensional data based on a recently proposed independence measure. Compared with sure independence screening methods that only consider marginal dependence between the response and each predictor, our approach inherits the advantages of the new measure and additionally incorporates joint information to achieve sufficient variable selection. As a result, our method is better able to select all the truly active variables, especially those that are marginally independent of the response and those involving interactions or nonlinear structures. In addition, the method can handle either continuous or discrete responses with mixed-type predictors. The sure screening property is established under mild conditions, and the superiority of our procedure over existing methods is demonstrated in various simulation studies and an application to real data.
|