Abstract:
|
Data sets with large numbers of observations and candidate predictors pose substantial challenges for statistical practitioners who seek meaningful predictions of future responses of interest. Ideally, a good model would discard the "noise" among the candidate variables and evaluate how strongly the true "signal" variables relate to the response. Efficient variable selection tools are desired not only to identify these important variables but also to improve the prediction of future responses by focusing on the relationship between the response and the meaningful "signal" variables, while avoiding a model over-fitted to "noise" variables. In the present study, we compare several variable selection methods, including the frequentist LASSO, the Bayesian LASSO, and a Bayesian shrinkage priors approach. A simulation study is conducted to compare these methods and thereby provide guidance on the choice of variable selection tool under various sample sizes and data structures.
|