Abstract:
|
Variable selection in ultra-high dimensional linear regression is challenging. Often, a variable screening method such as sure independence screening is applied first to substantially reduce the dimension before a refined selection method is used. However, sure independence screening preserves the true model with large probability only under the assumption of high marginal correlation between the response and the important variables, and this assumption rarely holds in practice. We develop the first Bayesian variable screening method, called BIS. The proposed method can integrate prior knowledge, if any, on effect sizes and the number of true variables into the analysis. BIS iteratively includes potential variables with the highest posterior probability, accounting for the variables already selected. The procedure is implemented using a fast Cholesky update method. We prove that BIS has the screening consistency property without the marginal correlation assumption. Simulation studies and real data examples demonstrate the fine screening performance and the scalability of BIS relative to other screening methods.
|