Abstract:
We propose Ranking-Based Subset Selection (RBSS), a new technique for identifying the variables that affect the response in high-dimensional data. The RBSS algorithm uses subsampling to identify the set of covariates that non-spuriously appears at the top of a chosen variable ranking. We study the conditions under which such a set is unique and show that our procedure can successfully recover it from the data. Unlike the majority of existing high-dimensional variable selection techniques, RBSS does not depend on any thresholding or regularization parameters. Moreover, RBSS imposes no model restrictions on the relationship between the response and the covariates; it is therefore widely applicable in both parametric and non-parametric contexts. We illustrate its good practical performance in a comparative simulation study and on two data examples. The RBSS algorithm is implemented in the publicly available R package rbss.
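
To make the subsampling idea concrete, the following is a minimal Python sketch, not the rbss package and not the authors' exact procedure. It assumes absolute marginal correlation as the variable ranking, subsamples of half the data drawn without replacement, and a hypothetical helper name, top_set_frequencies. For each set size k it reports the set of covariates that most often occupies the top k places of the ranking, and how often it does so. How RBSS itself chooses the size of the selected set is not described in this abstract and is not reproduced here.

```python
from collections import Counter

import numpy as np


def top_set_frequencies(X, y, n_subsamples=200, subsample_size=None,
                        k_max=5, seed=0):
    """For k = 1..k_max, return the most frequent top-k set and its frequency.

    Assumptions (illustrative only): covariates are ranked by absolute
    sample correlation with the response, and each subsample is drawn
    without replacement.
    """
    rng = np.random.default_rng(seed)
    n, _ = X.shape
    m = subsample_size or n // 2
    counters = [Counter() for _ in range(k_max)]

    for _ in range(n_subsamples):
        idx = rng.choice(n, size=m, replace=False)
        Xs = X[idx] - X[idx].mean(axis=0)   # centre covariates
        ys = y[idx] - y[idx].mean()         # centre response
        denom = np.linalg.norm(Xs, axis=0) * np.linalg.norm(ys)
        score = np.abs(Xs.T @ ys) / denom   # absolute correlations
        order = np.argsort(-score)          # best-ranked covariate first
        for k in range(1, k_max + 1):
            counters[k - 1][frozenset(order[:k].tolist())] += 1

    result = {}
    for k in range(1, k_max + 1):
        best_set, count = counters[k - 1].most_common(1)[0]
        result[k] = (sorted(best_set), count / n_subsamples)
    return result


# Simulated example: only covariates 0 and 1 affect the response.
demo_rng = np.random.default_rng(1)
X_demo = demo_rng.standard_normal((300, 50))
y_demo = 3 * X_demo[:, 0] - 3 * X_demo[:, 1] + demo_rng.standard_normal(300)
freqs = top_set_frequencies(X_demo, y_demo)
for k, (members, freq) in freqs.items():
    print(k, members, freq)
```

In a setting like this simulated one, a set that appears at the top of the ranking in nearly every subsample is a candidate for the non-spurious set, while larger sets padded with noise covariates tend to recur much less often.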
ASA Meetings Department
732 North Washington Street, Alexandria, VA 22314
(703) 684-1221 • meetings@amstat.org
Copyright © American Statistical Association.