Abstract Details
Activity Number: 509
Type: Contributed
Date/Time: Wednesday, August 6, 2014, 10:30 AM to 12:20 PM
Sponsor: Section on Statistical Learning and Data Mining
Abstract #312032
Title: The Art of Balancing
Author(s): Zhen Zhang*+, Justin Croft, and Kendell Churchwell
Companies: C Spire Wireless (all authors)
Keywords: skewed; algorithm; modeling; classifier; class imbalance; target accuracy
Abstract:

Data mining is widely used in strategic marketing to uncover actionable information for a broad range of critical marketing decisions. A common problem in many data mining applications is that the data are skewed, and skewed data often lead to degenerate algorithms that assign most or all cases to the most common outcome. For modeling projects with extremely skewed targets, such as churn prediction or fraud detection, applying data balancing techniques before the modeling process is a crucial step toward a useful model. Because modeling cases are domain- and algorithm-sensitive, there is no one-size-fits-all balancing strategy. In this paper we present empirical guidelines on balancing strategies for extremely skewed data with a binary outcome. Best practices are suggested for decision trees, logistic regression, and neural network models.
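As a rough illustration of the problem the abstract describes, the sketch below builds a synthetic binary target with about 2% positives, shows how accurate a degenerate "always predict the majority class" rule appears, and then applies random undersampling of the majority class. The abstract does not say which balancing techniques the authors evaluated, so random undersampling, the helper name `undersample_majority`, the `target_ratio` parameter, and the synthetic data are all assumptions made for illustration only.

```python
import numpy as np


def undersample_majority(X, y, target_ratio=1.0, seed=0):
    """Randomly drop majority-class rows (hypothetical helper).

    Keeps every minority row and at most target_ratio * n_minority
    majority rows. Labels are assumed to be coded 0/1.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X)
    y = np.asarray(y)

    idx_pos = np.flatnonzero(y == 1)
    idx_neg = np.flatnonzero(y == 0)

    # Decide which label is the rare one.
    if len(idx_pos) <= len(idx_neg):
        minority, majority = idx_pos, idx_neg
    else:
        minority, majority = idx_neg, idx_pos

    # Number of majority rows to keep, capped at what is available.
    n_keep = min(len(majority), int(round(target_ratio * len(minority))))
    kept_majority = rng.choice(majority, size=n_keep, replace=False)

    # Sort so the retained rows stay in their original order.
    keep = np.sort(np.concatenate([minority, kept_majority]))
    return X[keep], y[keep]


# Synthetic, extremely skewed binary target (about 2% positives).
rng = np.random.default_rng(42)
n = 10_000
y = (rng.random(n) < 0.02).astype(int)
X = rng.normal(size=(n, 3))

# Accuracy of a degenerate rule that always predicts the common outcome.
baseline_accuracy = max(y.mean(), 1 - y.mean())

# Balance to a 1:1 class ratio before any model is fit.
X_bal, y_bal = undersample_majority(X, y, target_ratio=1.0)

print(f"majority-only accuracy on raw data: {baseline_accuracy:.3f}")
print(f"positive rate after balancing: {y_bal.mean():.3f}")
```

The high baseline accuracy on the raw data is why plain accuracy can be misleading for skewed targets. Whether 1:1 is the right ratio, or whether oversampling or class weighting would serve better, depends on the domain and the algorithm, which is the question the paper addresses.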
Authors who are presenting talks have a * after their name.
2014 JSM Online Program Home
The views expressed here are those of the individual authors and not necessarily those of the JSM sponsors, their officers, or their staff.
Copyright © American Statistical Association.