Abstract:
|
Knockoff-based approaches have become popular for variable selection with false discovery rate control. Implementing the knockoff algorithm requires knowing the distribution of the covariates (X) in order to generate model-X knockoffs; in practice, however, this distribution is unknown. In this paper, we propose a new model-free approach to generating model-X knockoffs. In particular, we use a Bayesian nonparametric approach, Dirichlet process mixtures, to estimate the density of the covariates flexibly. Furthermore, to speed up computation, we use a variational Bayes approach for estimation. We call the proposed approach Variational Bayes Model-X (VBM-X) knockoffs. Using VBM-X, knockoffs can be generated easily and efficiently. Moreover, the algorithm is applicable to mixed datasets (i.e., those with both discrete and continuous covariates). We demonstrate the efficacy of VBM-X knockoffs in various simulations and compare them with existing methods.
|