
All Times EDT

Abstract Details

Activity Number: 205 - Applications of Machine Learning
Type: Contributed
Date/Time: Tuesday, August 4, 2020 : 10:00 AM to 2:00 PM
Sponsor: Section on Statistical Learning and Data Science
Abstract #313037
Title: Comparing Machine Learning and Penalized Regression for Predicting Diabetic Kidney Disease Progression: Evidence from the Chronic Renal Insufficiency Cohort (CRIC) Study
Author(s): Jing Zhang* and Tobias Fuhrer and Brian Kwan and Daniel Montemayor and Kumar Sharma and Loki Natarajan
Companies: Moores Cancer Center, University of California, San Diego and Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland and University of California, San Diego and Department of Medicine, University of Texas Health Science Center at San Antonio and Department of Medicine, University of Texas Health Science Center at San Antonio and University of California, San Diego
Keywords: diabetic kidney disease; penalized regression; lasso; random forest
Abstract:

There is an urgent need to identify novel biomarkers that can predict diabetic kidney disease (DKD) and elucidate its mechanisms. Urine samples from 995 patients in the CRIC study were assayed for relative metabolite abundance, yielding 1899 annotated features. Stringent filtering criteria were applied to eliminate noisy features, leaving 698 reliably measured features. We fit prognostic models for kidney function decline (defined as the slope of estimated glomerular filtration rate, eGFR) using penalized regression (lasso) and random forest (RF) models, with metabolites and clinical predictors (demographics, BMI, blood pressure, HbA1c, eGFR, albuminuria) as inputs. The models with the lowest prediction error were further evaluated on the time to end-stage renal disease (ESRD) outcome via c-statistics. Five-fold cross-validation was used to obtain the c-statistics, reported as median (95% CI). The eGFR slope models selected between 9 and 122 features, depending on the lasso penalty and the RF variable importance. The best ESRD model, with 20 metabolites and 9 clinical factors, had a c-statistic of 0.85 (0.85, 0.86). Pathway analysis revealed that the prognostic metabolites were involved in cell signaling and energy storage, among other processes. Modern statistical methods applied to untargeted metabolomics can reveal novel insights into DKD.
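
For readers unfamiliar with the c-statistic used above for the time-to-ESRD outcome, the following is a minimal sketch of one common definition, Harrell's concordance index for right-censored data. The abstract does not state which estimator or software the authors used, so the function name `harrell_c`, the simplified handling of tied times, and the five-subject toy data are all assumptions made for illustration, not the study's method or data.

```python
from itertools import combinations


def harrell_c(times, events, risk_scores):
    """Harrell's concordance index for right-censored data (illustrative sketch).

    times:       follow-up time for each subject
    events:      1 if the event (e.g. ESRD) was observed, 0 if censored
    risk_scores: model output; a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that subject a has the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Simplification: skip pairs with tied times, and pairs where the
        # shorter time is censored (the true ordering is then unknown).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # earlier event had the higher predicted risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data, not from the CRIC study.
times = [2.0, 4.0, 5.0, 7.0, 9.0]    # years of follow-up
events = [1, 1, 0, 1, 0]             # 1 = event observed, 0 = censored
scores = [0.9, 0.4, 0.6, 0.5, 0.1]   # predicted risk

c_index = harrell_c(times, events, scores)
print(c_index)  # 6 of the 8 comparable pairs are concordant, so 0.75
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking. Under five-fold cross-validation, a statistic like this would be computed on each held-out fold and then summarized across folds.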


Authors who are presenting talks have a * after their name.

Back to the full JSM 2020 program