Name: 2019 Joint Statistical Meetings
Start: 2019-07-27T07:00:00+00:00
End: 2019-08-01
Location: Colorado Convention Center

Activity Number:	38 - Advances in Variable Selection
Type:	Contributed
Date/Time:	Sunday, July 28, 2019 : 2:00 PM to 3:50 PM
Sponsor:	Section on Statistical Learning and Data Science
Abstract #307031	Presentation
Title:	Feature Selection in Large Data with Heteroscedastic Errors
Author(s):	Yiying Fan*
Companies:	Cleveland State University
Keywords:	Subsampling; homoscedastic; heteroscedastic; feature selection; big data; regression
Abstract:	Feature selection from big data in a regression analysis remains challenging. A common assumption in feature selection for large regression data is that the random errors are homoscedastic, that is, that they have constant variance. In this study, we present a Subsampling Winner Algorithm (SWA) for feature selection in large regression data when the errors are heteroscedastic and the variances need to be estimated. The idea of SWA is analogous to the selection of national merit scholars, and in principle it can handle linear regression data of any dimension. Parametric and nonparametric methods are used to estimate the weights. We also compare our procedure with benchmark procedures such as Elastic Net, SCAD, MCP and Random Forest.
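
The abstract gives no detail on how the weights are estimated, so the following is only a minimal illustrative sketch of one common parametric approach (feasible weighted least squares), not the SWA itself and not the author's procedure. It assumes a single predictor and an error variance proportional to x ** gamma. The function names, the simulated data, and the variance model are all hypothetical choices made for illustration.

```python
import math
import random


def weighted_line_fit(x, y, w):
    """Weighted least squares for y = b0 + b1 * x; returns (b0, b1)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1


def estimate_weights(x, y):
    """Parametric weights, ASSUMING Var(error | x) is proportional to x ** gamma."""
    ones = [1.0] * len(x)
    # Step 1: pilot fit that ignores heteroscedasticity (ordinary least squares).
    b0, b1 = weighted_line_fit(x, y, ones)
    # Step 2: regress log squared residuals on log x to estimate the variance model.
    # The small constant guards against log(0) for an exactly zero residual.
    log_r2 = [math.log((yi - b0 - b1 * xi) ** 2 + 1e-12) for xi, yi in zip(x, y)]
    log_x = [math.log(xi) for xi in x]
    a, gamma = weighted_line_fit(log_x, log_r2, ones)
    # Step 3: weights are the reciprocals of the fitted variances.
    weights = [math.exp(-(a + gamma * lx)) for lx in log_x]
    return weights, gamma


# Simulated data (hypothetical): the error standard deviation grows linearly
# with x, so the true variance exponent is gamma = 2.
random.seed(0)
n = 2000
x = [random.uniform(1.0, 10.0) for _ in range(n)]
y = [1.0 + 2.0 * xi + random.gauss(0.0, 0.5 * xi) for xi in x]

weights, gamma_hat = estimate_weights(x, y)
b0_hat, b1_hat = weighted_line_fit(x, y, weights)
print(f"gamma_hat={gamma_hat:.2f}, intercept={b0_hat:.2f}, slope={b1_hat:.2f}")
```

Only the relative size of the weights matters for the weighted fit, so a constant bias in the fitted log-variance intercept does not affect the coefficient estimates. A nonparametric alternative would replace the log-linear variance model in step 2 with a smoother applied to the squared residuals.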

Authors who are presenting talks have a * after their name.

JSM 2019 Online Program