Nutrition affects cardiovascular disease risk but is not yet incorporated into the risk prediction models used to guide patient-physician discussions. We sought to determine whether nutrition data, analyzed with machine learning, could improve the prediction of cardiovascular mortality risk.
We derived and tested risk prediction models with nutrition data and without it (traditional risk factors only), and with machine learning and without it (a standard Cox model as the referent). NHANES data (1999-2010) linked to the National Death Index through 2011 were split into derivation (70%, N=29390) and validation (30%, N=12600) subsamples. The primary outcome was time to cardiovascular mortality from myocardial infarction or stroke.
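A 70/30 derivation/validation partition of this kind can be sketched as follows. This is a minimal illustration that assumes a simple random split; the text does not say whether the split was random, stratified, or otherwise constrained, and the function name, the `seed` parameter, and the integer participant IDs are all hypothetical.

```python
import random


def split_derivation_validation(ids, derivation_fraction=0.7, seed=0):
    """Randomly partition participant IDs into derivation and validation sets.

    Assumption: a plain random split; the study's actual procedure is not
    described in the text.
    """
    shuffled = list(ids)
    # A seeded generator keeps the partition reproducible across runs.
    random.Random(seed).shuffle(shuffled)
    cut = round(derivation_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical usage with 1000 placeholder participant IDs.
derivation, validation = split_derivation_validation(range(1000))
```

Fitting every model on the derivation set only, and reporting performance on the validation set only, keeps the reported metrics from being inflated by overfitting.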
A standard Cox model with non-nutrition variables had a C-statistic of 0.87 (0.85, 0.88) and a calibration slope of 0.53 (0.49, 0.57). Both measures improved only minimally when nutrition data were added to the Cox model or when non-nutrition risk factors were entered into a random forest algorithm. However, when nutrition variables and the random forest algorithm were used together, the C-statistic increased to 0.90 (0.89, 0.92) and the calibration slope to 0.89 (0.69, 1.09).
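For readers unfamiliar with the C-statistic, the sketch below shows one common way to compute it for time-to-event data, namely Harrell's concordance index. This is an illustration only: the text does not state which estimator was used, and the function name and toy data are hypothetical. A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking of who dies earlier.

```python
def harrell_c(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    times:       follow-up time for each subject
    events:      1 if the event (death) was observed, 0 if censored
    risk_scores: model output, where a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            # A pair is comparable only when the shorter time is an observed event.
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # earlier death was ranked as higher risk
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): four subjects, the third one censored.
c = harrell_c(times=[2, 4, 6, 8], events=[1, 1, 0, 1],
              risk_scores=[0.9, 0.4, 0.5, 0.1])
print(c)  # 0.8
```

The C-statistic measures discrimination only. The calibration slope reported alongside it captures a separate property, how closely predicted risks match observed risks, so a model can rank patients well and still be poorly calibrated.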