Abstract:
|
Most existing methods for variable selection in partially linear models (PLM) with ultrahigh-dimensional covariates are based on partial residuals, which require a two-step estimation procedure. The estimation error produced in the first step may affect the final selection, and multicollinearity among covariates further complicates the model selection procedure. In this paper, we propose a new Bayesian variable selection approach for PLM that addresses both issues: (1) it is a one-step method, and (2) it outperforms existing methods when predictors are highly correlated. Unlike existing approaches, our procedure uses the difference-based method to reduce the impact of estimating the nonparametric component, and incorporates Bayesian subset modeling with diffusing prior (BAsub-DP) to shrink the parameters in the linear component. Estimation is implemented by Gibbs sampling, and we prove that the posterior probability of selecting the true model converges to one asymptotically. Simulation studies support the theory and demonstrate the efficiency of our method relative to existing ones, and we illustrate the approach with an application to supermarket data.
|