Using Dimension-Reduction Methods to Identify Interpretable Dietary Patterns Related to Body Mass Index (BMI) in the Multi-Ethnic Study of Atherosclerosis (MESA) (306422)
*Natalie Gasca, University of Washington
Robyn McClelland, University of Washington
Keywords: diet, BMI, partial least squares, variable selection, supervised methods
Most studies of diet-disease associations use unsupervised methods, such as principal component analysis (PCA), to construct diet patterns. Supervised methods, such as partial least squares (PLS) and sparse PLS (SPLS), use the outcome when forming patterns and could therefore reduce the data more concisely. This study identifies which of these methods best constructs interpretable diet patterns to predict BMI, a heart disease risk factor. The Multi-Ethnic Study of Atherosclerosis (MESA) includes 6814 participants aged 45-84, whose typical diet was assessed with a Food Frequency Questionnaire. BMI was pre-adjusted for age, gender, race/ethnicity, energy consumption, and exercise. PCA created 23 patterns from the 120 diet variables, with a cross-validated root mean squared error (RMSE) of 4.93. PLS selected only 2 diet patterns and had a slightly lower RMSE (4.91). SPLS also selected 2 patterns but used only 11 variables (RMSE = 4.94). SPLS’s first pattern described hamburger and diet soda intake (positive associations with BMI), and the second featured unfried potato and wine intake (negative associations). By using fewer patterns and foods, SPLS created more interpretable diet patterns while largely preserving predictive ability, as measured by RMSE.
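To make the comparison criterion concrete, the following is a minimal sketch of how a cross-validated RMSE could be computed for a supervised diet pattern. It is not the authors' implementation and does not use MESA data: the data are synthetic, the model is a single-component PLS1 fit written in plain Python, and all function names (`fit_pls1_one_component`, `fit_mean_only`, `cv_rmse`) are hypothetical. The number of folds and the random seeds are likewise assumptions.

```python
import math
import random

def column_means(rows):
    """Return the mean of each column for a list of equal-length rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def fit_pls1_one_component(X, y):
    """Fit a one-component PLS1 model; returns (x_means, y_mean, weights, slope)."""
    x_means = column_means(X)
    y_mean = sum(y) / len(y)
    Xc = [[v - m for v, m in zip(row, x_means)] for row in X]
    yc = [v - y_mean for v in y]
    # Weights are proportional to the covariance of each predictor with the outcome.
    w = [sum(row[j] * yi for row, yi in zip(Xc, yc)) for j in range(len(x_means))]
    norm = math.sqrt(sum(v * v for v in w)) or 1.0
    w = [v / norm for v in w]
    # Scores: projection of each observation onto the weighted pattern.
    t = [sum(a * b for a, b in zip(row, w)) for row in Xc]
    tt = sum(v * v for v in t)
    slope = sum(ti * yi for ti, yi in zip(t, yc)) / tt if tt else 0.0
    return x_means, y_mean, w, slope

def fit_mean_only(X, y):
    """Baseline that ignores the predictors and predicts the training mean."""
    return column_means(X), sum(y) / len(y), [0.0] * len(X[0]), 0.0

def predict(model, X):
    """Predict the outcome from a fitted (x_means, y_mean, weights, slope) tuple."""
    x_means, y_mean, w, slope = model
    return [
        y_mean + slope * sum((v - m) * wj for v, m, wj in zip(row, x_means, w))
        for row in X
    ]

def cv_rmse(X, y, fit, k=5, seed=0):
    """K-fold cross-validated RMSE, pooling squared errors over all held-out rows."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    sq_err = 0.0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = fit([X[i] for i in train], [y[i] for i in train])
        preds = predict(model, [X[i] for i in fold])
        sq_err += sum((p - y[i]) ** 2 for p, i in zip(preds, fold))
    return math.sqrt(sq_err / len(y))

# Synthetic stand-in data (assumption): 200 "participants", 6 "food" variables,
# with the outcome driven by the first two variables plus noise.
rng = random.Random(1)
X = [[rng.gauss(0, 1) for _ in range(6)] for _ in range(200)]
y = [27.0 + 1.5 * row[0] - 1.0 * row[1] + rng.gauss(0, 1) for row in X]

pls_rmse = cv_rmse(X, y, fit_pls1_one_component)
mean_rmse = cv_rmse(X, y, fit_mean_only)
print(f"PLS (1 component) CV RMSE: {pls_rmse:.2f}")
print(f"Mean-only baseline CV RMSE: {mean_rmse:.2f}")
```

Because the weights are refit inside each fold, the held-out rows never influence the pattern they are scored against, which is what makes RMSE values from different methods comparable. A real analysis would more likely rely on an established PLS or sparse PLS implementation and would tune the number of components by the same cross-validation.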