| Activity Number: | 215 |
| Type: | Contributed |
| Date/Time: | Monday, August 3, 2009 : 2:00 PM to 3:50 PM |
| Sponsor: | Section on Statistical Learning and Data Mining |
| Abstract - #304539 |
| Title: | Penalization Methods for Simultaneous Supervised Clustering and Variable Selection |
| Author(s): | Dhruv Sharma*+ and Howard D. Bondell and Hao (Helen) Zhang |
| Companies: | North Carolina State University and North Carolina State University and North Carolina State University |
| Address: | 2005 Blackwood Drive, Raleigh, NC 27612 |
| Keywords: | Variable selection; Penalization; Coefficient shrinkage; Correlation; Supervised clustering; Oracle properties |
| Abstract: | Statistical procedures for variable selection have become integral to analyses involving large data sets. Successful procedures combine high predictive accuracy, interpretable models, and computational efficiency. Penalized methods that perform coefficient shrinkage have been shown to be successful in many cases. Models that exhibit high multicollinearity are particularly challenging. We propose a penalization procedure that performs variable selection while clustering groups of correlated variables by setting their coefficients equal. The properties of the procedure, including its oracle properties, are studied in both regression and classification problems. |