Abstract:
|
Feature selection is routinely required in many contemporary statistical modeling tasks. To tackle the problem, there has been a revival of interest in the forward stagewise estimation methodology, where the main idea is to build up a model through a sequence of simple learning steps that gradually increase the model complexity. Under the framework of generalized estimating equations (GEE), we study stagewise estimation approaches that handle clustered data using a variable grouping structure. In practice, however, important groups may contain irrelevant variables; the key is thus to select at both the group and individual levels. We first propose a bi-level stagewise estimating equations (BiSEE) approach and establish its correspondence to the sparse group lasso. We also propose a hierarchical stagewise estimating equations (HiSEE) approach, in which each estimation step is executed as a hierarchical selection process. Simulation studies show that BiSEE and HiSEE improve model selection and predictive performance compared to existing approaches. A study with Connecticut teen hospitalization data further showcases the efficacy of these approaches.
|