Abstract:
|
Although the increase in national diabetes prevalence has slowed over the past decade, county-level trends vary widely. We aim to identify patterns of county-level trends in diagnosed diabetes in the U.S. from 2011 to 2019. Using data from the Behavioral Risk Factor Surveillance System and small area estimation methods, we first estimated county-level diagnosed diabetes prevalence. We then applied four machine learning methods: K-means, Random Forest, Support Vector Machine, and Expectation Maximization Gaussian Mixture Model (EM-GMM). The Bayesian Information Criterion was used to select the best algorithm for pattern recognition, and EM-GMM outperformed the others. EM-GMM identified three distinct patterns: rates consistently increased, first increased and then decreased, and remained roughly unchanged in 38.3%, 41.6%, and 12.8% of counties, respectively. Counties in the first pattern had high proportions of elderly, low-income, and uninsured residents, and counties in the third pattern had a high proportion of non-Hispanic White residents. Identifying patterns of diabetes trends can help researchers and policymakers better target control and prevention efforts.
|