Abstract Details

Activity Number: 38 - Advances in Variable Selection
Type: Contributed
Date/Time: Sunday, July 28, 2019 : 2:00 PM to 3:50 PM
Sponsor: Section on Statistical Learning and Data Science
Abstract #307031
Title: Feature Selection in Large Data with Heteroscedastic Errors
Author(s): Yiying Fan*
Companies: Cleveland State University
Keywords: Subsampling; homoscedastic; heteroscedastic; feature selection; big data; regression
Abstract:

Feature selection from big data in a regression analysis remains a challenge. A common assumption in feature selection for large regression data is that the random errors are homoscedastic. In this study, we present a Subsampling Winner Algorithm (SWA) for feature selection in large regression data when the errors are heteroscedastic and the variances must be estimated. The idea behind SWA is analogous to the selection of national merit scholars, and in principle the algorithm can handle linear regression data of any dimension. Parametric and nonparametric methods are used to estimate the weights. We also compare our procedure with benchmark procedures such as Elastic Net, SCAD, MCP, and Random Forest.
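
The abstract does not spell out the steps of SWA, so the following is only an illustrative sketch of the general idea it describes: fit many small sub-models on random subsets of features, keep the best-performing ones, and let features accumulate votes from those winners. It is not the author's procedure. Everything specific here is an assumption: the function names (`fgls_weights`, `wls_mse`, `subsample_vote`, `make_demo_data`), the tuning values `s`, `m`, and `q`, the parametric variance model (log variance linear in the predictors), and the choice to rank sub-models by unweighted mean squared residual so that scores are comparable across sub-models with different weights.

```python
import numpy as np


def fgls_weights(X, y):
    """Estimate inverse-variance weights from a parametric log-variance model."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    # Assumed variance model: log(sigma_i^2) is linear in the predictors.
    gamma, *_ = np.linalg.lstsq(A, np.log(resid**2 + 1e-12), rcond=None)
    return 1.0 / np.exp(A @ gamma)


def wls_mse(X, y, w):
    """Fit weighted least squares and return the unweighted mean squared residual."""
    A = np.column_stack([np.ones(len(y)), X])
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return float(np.mean((y - A @ beta) ** 2))


def subsample_vote(X, y, s=5, m=500, q=60, seed=0):
    """Hypothetical subsample-and-vote scheme (not the published SWA).

    Draw m random feature subsets of size s, fit a weighted sub-model on each,
    keep the q sub-models with the smallest error, and count how often each
    feature appears among those winners.
    """
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    subsets, scores = [], []
    for _ in range(m):
        idx = rng.choice(p, size=s, replace=False)
        w = fgls_weights(X[:, idx], y)
        subsets.append(idx)
        scores.append(wls_mse(X[:, idx], y, w))
    votes = np.zeros(p, dtype=int)
    for k in np.argsort(scores)[:q]:
        votes[subsets[k]] += 1
    return votes


def make_demo_data(n=400, p=20, seed=1):
    """Simulated data: features 0, 1, 2 matter; error s.d. grows with feature 0."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))
    sd = np.exp(0.5 * X[:, 0])
    y = 3.0 * (X[:, 0] + X[:, 1] + X[:, 2]) + sd * rng.standard_normal(n)
    return X, y


if __name__ == "__main__":
    X, y = make_demo_data()
    votes = subsample_vote(X, y)
    print("features ranked by votes:", np.argsort(-votes)[:5])
```

In this sketch the heteroscedasticity enters only through the estimated weights used to fit each sub-model. A nonparametric alternative would replace the log-linear variance regression in `fgls_weights` with a smoother of the squared residuals.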


Authors who are presenting talks have a * after their name.
