All Times EDT

Abstract Details

Activity Number: 286 - Small-Area Estimation and Survey Methods Sampler
Type: Contributed
Date/Time: Tuesday, August 9, 2022 : 10:30 AM to 12:20 PM
Sponsor: Survey Research Methods Section
Abstract #322332
Title: Borrowing Strength from Nearest Neighbors to Improve County-Level Estimates When Very Few or Zero Individuals Are Sampled
Author(s): Hui Xie* and Deborah B Rolka and Yu B Chen
Companies: CDC and CDC and CDC
Keywords: Small area estimates (SAE); county-level; zero or very small sample size; machine learning
Abstract:

In studying county-level variation in disease prevalence, small area estimation (SAE) techniques are used when survey sample sizes are too small to provide adequate direct estimates. For some small counties, the sample size may even be zero. SAE models borrow strength from neighboring counties using individual-level information on the outcomes of interest and auxiliary variables. However, heterogeneity of auxiliary variables across counties can introduce large variation and bias into small area estimates and thus diminish their accuracy. To address this issue, we developed a two-stage SAE approach. First, we used a Gaussian mixture model fitted with the Expectation-Maximization algorithm (a machine learning technique) to cluster nearest neighbors among U.S. counties based on county-level population characteristics (e.g., age, sex, and race) and socioeconomic factors (e.g., income and unemployment). Then we applied Bayesian hierarchical models to estimate county-level prevalence by borrowing strength within each nearest-neighbor cluster. The new approach was evaluated with both empirical and simulated data. On average, the approach reduced the mean squared error by 23.8% and bias by over 1.5 times.
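The two-stage idea can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`gmm_cluster`, `shrink_within_cluster`), the diagonal-covariance mixture, the `prior_strength` parameter, and the toy data are all assumptions made for illustration. In particular, stage two uses a simple conjugate Beta-binomial shrinkage toward the cluster's pooled prevalence as a stand-in for a full Bayesian hierarchical model.

```python
import math

def log_gauss(x, mean, var):
    # Log density of a Gaussian with diagonal covariance.
    return sum(-0.5 * (math.log(2 * math.pi * v) + (a - m) ** 2 / v)
               for a, m, v in zip(x, mean, var))

def gmm_cluster(X, k, iters=100, min_var=1e-3):
    """Stage 1 (sketch): EM for a k-component diagonal Gaussian mixture.
    Returns the most probable component for each county."""
    n, d = len(X), len(X[0])
    rows = sorted(X)
    # Deterministic start: means taken at evenly spaced positions of the sorted rows.
    means = [list(rows[(2 * j + 1) * n // (2 * k)]) for j in range(k)]
    gmean = [sum(x[j] for x in X) / n for j in range(d)]
    gvar = [max(min_var, sum((x[j] - gmean[j]) ** 2 for x in X) / n)
            for j in range(d)]
    variances = [list(gvar) for _ in range(k)]
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: posterior responsibility of each component for each county.
        resp = []
        for x in X:
            logs = [math.log(weights[c]) + log_gauss(x, means[c], variances[c])
                    for c in range(k)]
            top = max(logs)
            probs = [math.exp(l - top) for l in logs]
            total = sum(probs)
            resp.append([p / total for p in probs])
        # M-step: update weights, means, and (floored) variances.
        for c in range(k):
            nc = sum(r[c] for r in resp) + 1e-12
            weights[c] = nc / n
            means[c] = [sum(r[c] * x[j] for r, x in zip(resp, X)) / nc
                        for j in range(d)]
            variances[c] = [
                max(min_var,
                    sum(r[c] * (x[j] - means[c][j]) ** 2
                        for r, x in zip(resp, X)) / nc)
                for j in range(d)]
    return [max(range(k), key=lambda c: r[c]) for r in resp]

def shrink_within_cluster(cases, sampled, labels, prior_strength=10.0):
    """Stage 2 (simplified stand-in): posterior mean of a binomial proportion
    under a Beta prior centered on the cluster's pooled prevalence."""
    estimates = []
    for y, n, c in zip(cases, sampled, labels):
        members = [i for i, l in enumerate(labels) if l == c]
        total_n = sum(sampled[i] for i in members)
        pooled = (sum(cases[i] for i in members) / total_n
                  if total_n else float("nan"))
        estimates.append((y + prior_strength * pooled) / (n + prior_strength))
    return estimates

# Toy data (made up for illustration): two covariates per county,
# plus the number of sampled individuals and observed cases.
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
cases = [2, 0, 3, 20, 0, 25]
sampled = [20, 0, 30, 50, 0, 60]

labels = gmm_cluster(X, k=2)
estimates = shrink_within_cluster(cases, sampled, labels)
print(labels)
print([round(e, 3) for e in estimates])
```

In this sketch a county with no sampled individuals receives exactly its cluster's pooled prevalence, while counties with larger samples stay closer to their direct estimates. A real analysis would replace both stages with properly specified models and would need to choose the number of clusters from the data.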


Authors who are presenting talks have a * after their name.

Back to the full JSM 2022 program