CE_31T Wed, 8/2/2017, 8:00 AM - 9:45 AM H-Holiday Ballroom 3
Introduction to Data Mining with CART Classification and Regression Trees (ADDED FEE) — Professional Development Computer Technology Workshop
ASA, Salford Systems
This tutorial is intended for the applied statistician who wants to understand and apply the CART (Classification and Regression Trees) methodology. The emphasis will be on practical data analysis and data mining involving classification and regression. All concepts will be illustrated with real-world, step-by-step examples. The course will begin with an intuitive introduction to tree-structured analysis: what it is, why it works, why it is non-parametric and model-free, and its advantages in handling all types of data, including categorical variables and missing values. Working through examples, we will review how to read CART tree output and how to set up basic analyses. This session will include performance evaluation of CART trees and will cover ways to search for possible improvements to the results. Once a basic working knowledge of CART has been established, the tutorial will focus on critical details essential for advanced CART applications, including: choice of splitting criteria, choosing the best split, using prior probabilities to shape results, refining results with differential misclassification costs, the meaning of cross-validation, tree growing, and tree pruning. The course will conclude with a discussion of the comparative performance of CART versus other computer-intensive methods, such as artificial neural networks, and versus statistician-generated parametric models.
Instructor(s): Dan Steinberg, Salford Systems; Mikhail Golovnya, Salford Systems; Charles Harrison, Salford Systems
