Abstract Details

Activity Number: 424 - SPEED: Statistical Education
Type: Contributed
Date/Time: Tuesday, August 1, 2017 : 3:05 PM to 3:50 PM
Sponsor: Section on Statistical Learning and Data Science
Abstract #325080
Title: Using Data Mining to Identify At-Risk Freshmen
Author(s): Nora Galambos*
Companies: Stony Brook University
Keywords: data mining ; CHAID ; CART ; learning management system ; K-fold cross validation

Data mining is used to develop decision tree models, deployed from orientation through week six of the semester, to identify low-GPA freshmen at a top public research university. Our decision tree modeling has been successful in the early identification of high-risk students and has demonstrated a strong association between learning management system (LMS) logins and GPA outcomes, along with more traditional measures. At our institution, fewer than 25% of first-time full-time freshmen with a GPA below 2.5 in their first term graduate within 4 years, and fewer than half graduate within 6 years. Identifying students before they earn their first low GPA will allow them to be assigned to interventions appropriate to their needs before they face probation or suspension. SAS Enterprise Miner is used to develop and compare CART, CHAID, gradient boosting, and linear regression models. The goal is to develop robust predictive models that are available the moment students arrive on campus in the fall. Customized dashboards enable users to segment, filter, and list students to assign them to the appropriate plans.
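
As a rough illustration of the kind of workflow the abstract describes, the sketch below fits a one-split, CART-style decision tree (a "stump" chosen by Gini impurity) and scores it with K-fold cross validation. It is not the author's model: the work above was done in SAS Enterprise Miner, whereas this is plain Python run on synthetic data. The single predictor (a count of LMS logins), the login cutoff, the class probabilities, and all function names are assumptions made only for this example.

```python
import random


def gini(labels):
    """Gini impurity of a list of 0/1 labels (1 = low GPA in this sketch)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def fit_stump(xs, ys):
    """Pick the threshold on one feature that minimizes weighted Gini impurity.

    Returns (threshold, label_if_x_le_threshold, label_if_x_gt_threshold).
    """
    best = None
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        if not left or not right:
            continue  # a split must leave observations on both sides
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if best is None or score < best[0]:
            # Each side predicts its majority class (ties go to class 1).
            left_label = int(2 * sum(left) >= len(left))
            right_label = int(2 * sum(right) >= len(right))
            best = (score, t, left_label, right_label)
    if best is None:
        # Constant feature: no valid split, so always predict the majority class.
        majority = int(2 * sum(ys) >= len(ys))
        return (float("inf"), majority, majority)
    return best[1:]


def predict(stump, x):
    """Classify one observation with a fitted stump."""
    threshold, left_label, right_label = stump
    return left_label if x <= threshold else right_label


def k_fold_accuracy(xs, ys, k=5, seed=0):
    """Return the held-out accuracy of the stump on each of k folds.

    Assumes len(xs) >= k so that no fold is empty.
    """
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    accuracies = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        stump = fit_stump([xs[i] for i in train], [ys[i] for i in train])
        correct = sum(predict(stump, xs[i]) == ys[i] for i in held_out)
        accuracies.append(correct / len(held_out))
    return accuracies


def make_synthetic(n=400, seed=1):
    """Synthetic students: few LMS logins makes a low GPA more likely.

    The cutoff (15 logins) and probabilities are invented for illustration.
    """
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        logins = rng.randint(0, 60)
        p_low = 0.8 if logins < 15 else 0.15
        xs.append(logins)
        ys.append(1 if rng.random() < p_low else 0)
    return xs, ys


if __name__ == "__main__":
    logins, low_gpa = make_synthetic()
    accs = k_fold_accuracy(logins, low_gpa, k=5)
    print("fold accuracies:", [round(a, 3) for a in accs])
    print("mean accuracy:", round(sum(accs) / len(accs), 3))
```

Holding out each fold in turn gives an estimate of how the split would perform on students not used to fit it, which is the usual reason to pair tree models with K-fold cross validation. A real comparison of CART, CHAID, gradient boosting, and regression models would involve many predictors and deeper trees than this single-split sketch.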

Authors who are presenting talks have a * after their name.

Back to the full JSM 2017 program

Copyright © American Statistical Association