Online Program

Saturday, October 21
Sat, Oct 21, 7:30 AM - 8:30 AM
Aventine Ballroom G
Continental Breakfast and Speed Poster 4 sponsored by Bank of America

Comparison of Machine learning algorithms for imbalanced multi-level classification of Alzheimer’s disease using MRI data (304085)

*Poulami Barman, Division of Biomedical Statistics and Informatics, Mayo Clinic, MN 

The growing complexity of big data has led us to develop more advanced data mining algorithms to cater to specific data structures. In this paper, we compare the performance of commonly used algorithms in an imbalanced (multiple minority groups) multi-level classification problem where 'n' is small. We also evaluate hybrid methodologies. The predictors were brain MRI data (cortical thickness) in 86 Regions of Interest, used to predict three distinct Alzheimer's disease subtypes in 121 clinical AD subjects: HpSp (11), Limbic (22), and Typical (88). We tested the following models: a) Lasso, b) Elastic Net, c) Classification and Regression Tree, d) Random Forest, e) Gradient Boosting Machines (GBM), f) Synthetic Minority Over-sampling Technique (SMOTE) with GBM, g) AdaBoost, and h) SMOTE with AdaBoost. Five-fold cross-validation showed that a hybrid two-level SMOTE-GBM outperformed all other algorithms, with an overall accuracy of 81% [80% (Typical), 82% (Limbic), and 85% (HpSp)]. Even though multi-class algorithms have been developed to outperform multiple two-class problems, hybrid or problem-specific customized tools may still be needed.
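For readers unfamiliar with SMOTE, the oversampling idea can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the function name `smote`, the neighbour count `k`, the toy three-feature data, and the target of 77 synthetic samples (enough to bring an 11-subject group up to 88) are all assumptions made for the example. Each synthetic point is placed at a random position on the line segment between a minority sample and one of its nearest minority-class neighbours.

```python
import random
from math import dist


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than samples
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest neighbours of `base` (Euclidean distance),
        # excluding the base sample itself.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Hypothetical toy data: 11 minority subjects with 3 features each
# (the study itself used 86 cortical-thickness features).
data_rng = random.Random(42)
minority = [[data_rng.gauss(2.5, 0.3) for _ in range(3)] for _ in range(11)]

# Generate 77 synthetic subjects so the group size matches a majority of 88.
synthetic = smote(minority, n_new=77, k=5)
print(len(minority) + len(synthetic))
```

In a cross-validated comparison such as the one described, oversampling would normally be applied only to the training portion of each fold, so that synthetic points derived from held-out subjects do not leak into model fitting. The abstract does not state how this was handled in the study.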