Abstract:
|
We consider variable selection for high-dimensional data with correlated predictors. When predictor variables are correlated, many traditional variable selection methods perform poorly. To improve selection accuracy in this setting, group variable selection methods are widely used. However, the group structure among predictors is usually unknown, especially for high-dimensional data. We therefore propose a cluster group variable selection method. The method first discovers the group structure from the relationships among the predictor variables and then applies a group variable selection method. In addition, we introduce an algorithm for analyzing datasets with much larger numbers of predictors and observations. We compare the performance of the proposed method with that of other group variable selection methods to demonstrate its advantages.
|
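As a rough illustration of the two-stage idea summarized in the abstract (first discover groups among the predictors, then run a group variable selection method on those groups), the following Python sketch may help. It is not the authors' algorithm. It assumes average-linkage hierarchical clustering on the distance 1 - |correlation| for the grouping stage and a plain group lasso, fitted by proximal gradient descent, for the selection stage. The function names `cluster_predictors` and `group_lasso`, the threshold 0.5, the penalty `lam`, and the simulated data are all hypothetical choices made for this example.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform


def cluster_predictors(X, threshold=0.5):
    """Stage 1 (assumed): group columns of X by average-linkage
    clustering on the distance 1 - |correlation|."""
    corr = np.corrcoef(X, rowvar=False)
    dist = 1.0 - np.abs(corr)
    dist = np.clip((dist + dist.T) / 2.0, 0.0, None)  # symmetric, non-negative
    np.fill_diagonal(dist, 0.0)
    Z = linkage(squareform(dist, checks=False), method="average")
    # Cut the tree at the given distance; labels start at 1.
    return fcluster(Z, t=threshold, criterion="distance")


def group_lasso(X, y, groups, lam=0.1, n_iter=2000):
    """Stage 2 (assumed): proximal gradient descent for
    (1/2n)||y - Xb||^2 + lam * sum_g sqrt(p_g) * ||b_g||_2."""
    n, p = X.shape
    beta = np.zeros(p)
    step = n / np.linalg.norm(X, 2) ** 2  # 1 / Lipschitz constant of the gradient
    for _ in range(n_iter):
        z = beta - step * (X.T @ (X @ beta - y)) / n
        for g in np.unique(groups):
            idx = groups == g
            norm = np.linalg.norm(z[idx])
            thresh = step * lam * np.sqrt(idx.sum())
            # Block soft-thresholding: shrink the whole group, or zero it out.
            z[idx] = 0.0 if norm <= thresh else (1.0 - thresh / norm) * z[idx]
        beta = z
    return beta


# Simulated data (hypothetical): two blocks of three highly correlated
# predictors each; only the first block affects the response.
rng = np.random.default_rng(0)
n = 200
f1, f2 = rng.normal(size=(2, n))
cols = [f1 + 0.1 * rng.normal(size=n) for _ in range(3)]
cols += [f2 + 0.1 * rng.normal(size=n) for _ in range(3)]
X = np.column_stack(cols)
X = (X - X.mean(axis=0)) / X.std(axis=0)
y = X[:, :3].sum(axis=1) + 0.1 * rng.normal(size=n)
y = y - y.mean()

groups = cluster_predictors(X)
beta = group_lasso(X, y, groups, lam=0.3)
print("cluster labels:", groups)
print("coefficients:", np.round(beta, 3))
```

In this sketch the clustering threshold controls how many groups are formed, and the penalty `lam` controls how many of those groups are kept. Both would need tuning, for example by cross-validation, on real data.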