Abstract:
|
A method is introduced for variable selection and prediction in linear regression problems where the number of predictors can be much larger than the number of observations. The method minimizes a penalized Euclidean distance, where the penalty is the geometric mean of the ℓ1 and ℓ2 norms of the regression coefficients. This formulation exhibits a grouping effect, which is useful for model selection in high-dimensional problems. An important result is a model consistency theorem that does not require an estimate of the noise standard deviation. An estimation algorithm is described that uses thresholding to obtain a sparse solution. Variable selection and prediction performance are evaluated through simulation studies and the analysis of real datasets.
|
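The objective described above can be sketched numerically. The following is a minimal illustration, assuming the criterion is the (unsquared) Euclidean residual norm plus a multiplier λ times the geometric mean sqrt(‖β‖₁·‖β‖₂) of the coefficient norms; the value of λ, the hard threshold of 0.1, the synthetic data, and the use of a generic derivative-free optimizer are all illustrative assumptions, not the paper's actual algorithm.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Small synthetic problem: n = 40 observations, p = 6 predictors,
# only the first two true coefficients are nonzero.
n, p = 40, 6
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:2] = [3.0, -2.0]
y = X @ beta_true + 0.1 * rng.standard_normal(n)

lam = 0.5    # penalty multiplier (illustrative choice)
eps = 1e-12  # guards against a zero norm at the origin

def objective(beta):
    # Euclidean (not squared) residual norm plus lam times the
    # geometric mean of the l1 and l2 norms of the coefficients.
    resid = np.linalg.norm(y - X @ beta)
    pen = np.sqrt(np.abs(beta).sum() * np.linalg.norm(beta) + eps)
    return resid + lam * pen

# Powell's method is derivative-free, so the nonsmooth penalty
# at zero coefficients poses no problem for it.
res = minimize(objective, np.zeros(p), method="Powell",
               options={"maxiter": 20000, "xtol": 1e-8, "ftol": 1e-8})

# Threshold small coefficients to obtain a sparse solution,
# as the abstract's estimation algorithm suggests.
beta_hat = np.where(np.abs(res.x) > 0.1, res.x, 0.0)
print(beta_hat)
```

On this easy example the thresholded minimizer recovers the two active predictors; in the regime the abstract targets (p much larger than n) a general-purpose optimizer would not scale, which is why a dedicated algorithm is needed.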