Abstract:
|
Variable selection is commonly used to arrive at a parsimonious model. Because of inherent structural constraints among the candidate variables, it is often desirable to impose a selection rule that prescribes which combinations of variables are permissible in the final model. Penalized regression methods can incorporate such restrictions ("selection rules") by assigning the covariates to groups and then applying different penalties to those groups. However, no general framework has yet been proposed to formalize selection rules and their application. In this work, we develop a mathematical language for constructing selection rules in variable selection, in which the resulting collection of permissible sets of selected covariates, called a "selection dictionary", is formally defined. We show that every selection rule can be represented as a combination of operations on constructs, and that this representation can be used to identify the corresponding selection dictionary. One may then apply a model selection criterion to choose the best model among the permissible ones. We also present a necessary and sufficient condition for a grouping structure used with the overlapping group Lasso to carry out variable selection under an arbitrary selection rule.
|
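As an informal illustration of the idea of a selection dictionary, the following minimal sketch enumerates every subset of a small candidate set and keeps those that satisfy a selection rule. The variable names (`A`, `B`, `A:B`), the helper `selection_dictionary`, and the heredity-style rule are assumptions chosen for illustration; they are not taken from the paper, and brute-force enumeration is used only because the example is tiny.

```python
from itertools import combinations


def selection_dictionary(candidates, rule):
    """Return every subset of `candidates` for which `rule` holds.

    Hypothetical helper: brute-force enumeration over all 2^p subsets,
    which is only practical for a handful of candidate variables.
    """
    subsets = (
        frozenset(combo)
        for size in range(len(candidates) + 1)
        for combo in combinations(candidates, size)
    )
    return {s for s in subsets if rule(s)}


def strong_heredity(selected):
    # Illustrative rule (an assumption, not the paper's notation):
    # the interaction "A:B" may be selected only with both main effects.
    return "A:B" not in selected or {"A", "B"} <= selected


dictionary = selection_dictionary(["A", "B", "A:B"], strong_heredity)

# Print the permissible sets, smallest first.
for subset in sorted(dictionary, key=lambda s: (len(s), sorted(s))):
    print(sorted(subset))
```

Under this illustrative rule, five of the eight possible subsets are permissible: the empty set, each main effect alone, both main effects together, and the full set with the interaction.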