Abstract:
|
In many contemporary scientific fields, regression with ultrahigh-dimensional covariates (p >> n) involves sparse signals: only a small fraction of the p covariates is truly associated with the response. Because ultrahigh dimensionality raises issues such as computational burden, we propose to substantially reduce the number of covariates before applying any standard statistical procedure, using a model-free screening procedure called Covariate Information Screening (CIS). CIS uses a Fisher information-based marginal utility that we call the Covariate Information Number. The screening step is designed to minimize false negatives, so that only redundant covariates are eliminated. In simulations, CIS performs competitively with popular screening procedures such as Sure Independence Screening (Fan and Lv, 2008, JRSS-B) and Sure Independent Ranking and Screening (Zhu et al., 2011, JASA). The iterative version of CIS (ICIS) improves upon the performance of CIS. We are currently investigating the theoretical properties of CIS.
|
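For readers unfamiliar with marginal screening, the general idea can be sketched with Sure Independence Screening, one of the comparison methods named above. This is not CIS: the Covariate Information Number is not defined in this abstract, so the utility below is the absolute marginal correlation used by SIS. The function name `sis_screen`, the default cutoff d = n / log(n), and the simulated data are illustrative assumptions.

```python
import numpy as np


def sis_screen(X, y, d=None):
    """Keep the d covariates with the largest absolute marginal correlation.

    Illustrative SIS-style screen, not the CIS utility (assumption).
    """
    n, p = X.shape
    if d is None:
        d = int(n / np.log(n))  # a commonly used cutoff; an assumption here
    d = min(d, p)

    # Center so that inner products are proportional to correlations.
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()

    # Constant columns get zero utility instead of a division by zero.
    x_norm = np.linalg.norm(Xc, axis=0)
    x_norm[x_norm == 0] = np.inf

    utility = np.abs(Xc.T @ yc) / (x_norm * np.linalg.norm(yc))

    # Indices of the d largest utilities, returned in ascending index order.
    keep = np.sort(np.argsort(utility)[::-1][:d])
    return keep, utility


# Simulated sparse setting with p >> n: only the first three covariates matter.
rng = np.random.default_rng(0)
n, p = 100, 2000
X = rng.standard_normal((n, p))
active = [0, 1, 2]
y = 3 * X[:, 0] + 3 * X[:, 1] - 3 * X[:, 2] + rng.standard_normal(n)

kept, utility = sis_screen(X, y)
print(f"kept {kept.size} of {p} covariates")
print("active covariates retained:", set(active) <= set(kept.tolist()))
```

A screen of this kind would be followed by a standard procedure fitted on the retained covariates only. A false negative in this setting is an active covariate that the screen drops.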