All Times EDT

Abstract Details

Activity Number: 442 - Disease Prediction, Statistical Methods for Genetic Epidemiology and Mis
Type: Contributed
Date/Time: Thursday, August 12, 2021 : 4:00 PM to 5:50 PM
Sponsor: Section on Statistics in Epidemiology
Abstract #317922
Title: Using Machine Learning to Identify Patterns of County-Level Trends in Diabetes Prevalence in the United States, 2011-2019
Author(s): Hui Xie* and Deborah B Rolka and Yu B Chen
Companies: CDC and CDC and CDC
Keywords: EM-GMM; County-level; Diabetes Prevalence; Trends

While increases in national diabetes prevalence have slowed over the past decade, county-level trends vary widely. We aim to identify patterns of county-level trends in diagnosed diabetes prevalence in the U.S. from 2011 to 2019. Using data from the Behavioral Risk Factor Surveillance System and small area estimation methods, we first estimated county-level diagnosed diabetes prevalence. We then applied four machine learning methods: K-means, Random Forest, Support Vector Machine, and the Expectation-Maximization Gaussian Mixture Model (EM-GMM). The Bayesian Information Criterion was used to select the best algorithm for pattern recognition, and EM-GMM outperformed the others. EM-GMM identified three distinct patterns: prevalence consistently increased, first increased and then decreased, or remained roughly unchanged in 38.3%, 41.6%, and 12.8% of counties, respectively. Counties in the first pattern had high proportions of elderly, low-income, and uninsured residents, and counties in the third pattern had a high proportion of non-Hispanic White residents. Identifying patterns of diabetes trends can help researchers and policymakers better target control and prevention efforts.
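
As a rough illustration of the EM-GMM and BIC steps named in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes each county's 2011-2019 trajectory has been reduced to a single hypothetical feature (a linear slope), fits one-dimensional Gaussian mixtures by EM, and uses BIC to compare numbers of components rather than to compare algorithms. The data are synthetic, and the names `fit_gmm_1d`, `bic`, and `slopes` are illustrative only.

```python
import math
import random


def log_normal_pdf(x, mean, var):
    """Log density of N(mean, var) at x."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def fit_gmm_1d(xs, k, n_iter=200):
    """Fit a k-component 1-D Gaussian mixture by EM.

    Returns (weights, means, variances, log_likelihood).
    """
    n = len(xs)
    ordered = sorted(xs)
    grand_mean = sum(xs) / n
    grand_var = sum((x - grand_mean) ** 2 for x in xs) / n
    # Deterministic start: means at evenly spaced quantiles, shared variance.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    variances = [grand_var] * k
    weights = [1.0 / k] * k
    var_floor = 1e-6 * grand_var  # keeps a component from collapsing

    def point_logs(x):
        # Log of (weight * density) for every component at observation x.
        return [math.log(w) + log_normal_pdf(x, m, v)
                for w, m, v in zip(weights, means, variances)]

    for _ in range(n_iter):
        # E-step: responsibility of each component for each observation.
        resp = []
        for x in xs:
            logs = point_logs(x)
            norm = log_sum_exp(logs)
            resp.append([math.exp(l - norm) for l in logs])
        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)  # guard an empty component
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2
                        for r, x in zip(resp, xs)) / nj
            variances[j] = max(var_j, var_floor)

    log_lik = sum(log_sum_exp(point_logs(x)) for x in xs)
    return weights, means, variances, log_lik


def bic(log_lik, k, n):
    """BIC = -2 log L + p ln n, with p = 3k - 1 free parameters in 1-D."""
    return -2.0 * log_lik + (3 * k - 1) * math.log(n)


# Synthetic stand-in data (NOT the study data): made-up slopes for three
# groups of 100 hypothetical counties (rising, flat, falling).
rng = random.Random(0)
slopes = ([rng.gauss(0.4, 0.05) for _ in range(100)]
          + [rng.gauss(0.0, 0.05) for _ in range(100)]
          + [rng.gauss(-0.3, 0.05) for _ in range(100)])

# Compare mixtures with 1 to 4 components; lower BIC is preferred.
bics = {k: bic(fit_gmm_1d(slopes, k)[3], k, len(slopes)) for k in range(1, 5)}
best_k = min(bics, key=bics.get)
print("BIC by number of components:", bics)
print("Lowest-BIC number of components:", best_k)
```

In practice a multivariate mixture over the full yearly prevalence series would be closer to clustering whole trajectories; the one-dimensional version is used here only to keep the EM updates and the BIC penalty easy to read.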

Authors who are presenting talks have a * after their name.

Back to the full JSM 2021 program