JSM 2015 Online Program

Abstract Details

Activity Number: 312
Type: Contributed
Date/Time: Tuesday, August 11, 2015 : 8:30 AM to 10:20 AM
Sponsor: Section for Statistical Programmers and Analysts
Abstract #316737
Title: Variable Selection Methods for Big Data: A Comparative Study
Author(s): Jun Liu* and Xuejing Mao
Companies: and AT&T
Keywords: big data ; logistic regression ; variable selection

Variable selection is an important step in statistical analysis. When the number of potential predictors is small, this step is straightforward. But as more and more predictors become available, it becomes both more critical and more complicated. Logistic regression has many applications in business. One area where it is widely used is risk management, for example, to predict the likelihood that a customer will be delinquent. In this paper, we compare the performance of three commonly used variable selection methods in logistic regression using a large data set. This data set is typical "Big" data, as both the number of records and the number of variables are very large.

Authors who are presenting talks have a * after their name.

The views expressed here are those of the individual authors and not necessarily those of the JSM sponsors, their officers, or their staff.
