Abstract:
|
We introduce a novel approach of Bayesian variable selection in the normal linear regression model with high dimensional data with both theoretical justification and fast parallel computation. An explicit posterior probability for including a covariate is obtained. The method is sequential but not order dependent, one deals with each covariate one by one, and a spike and slab prior is only assigned to the coefficient under investigation. We adopt the well-known spike and slab Gaussian priors with a sample size dependent variance, which achieves strong selection consistency for marginal posterior probabilities even when the number of covariates grows almost exponentially with sample size. We obtain the same results via the direct calculation of posterior probabilities, compared to a stochastic search over the entire model space. Our procedure also controls the False Discovery Rate under finite sample using posterior predictive p-value, loss functions and the approach of Benjamini and Hochberg.
|