All Times EDT

Abstract Details

Activity Number: 352 - Small Area Estimation, Analysis of Complex Sample Survey Data, and New Advances for Health Surveys
Type: Contributed
Date/Time: Thursday, August 12, 2021 : 10:00 AM to 11:50 AM
Sponsor: Survey Research Methods Section
Abstract #317910
Title: Machine Learning Methods: A Case Study Using Online Web-Based Panel Surveys
Author(s): Yulei He* and Guangyu Zhang and Van Parsons
Companies: US Centers for Disease Control and Prevention and CDC and CDC
Keywords: probability panel; online survey; machine learning; prediction; performance; statistical estimates
Abstract:

In the era of data science, machine learning (ML) methods play a prominent role because they can handle large volumes of data with complex structure. Complex sample surveys are a major approach for collecting information from target populations, with the aim of producing reliable estimates for scientific research. Over the past decade or so, web-based panel surveys have become an important data collection tool because of their advantages in timeliness and cost over traditional survey data collection methods. In this project, we demonstrate the use of established ML methods and investigate their utility using web-based panel survey data. We illustrate this using the first two surveys from the Research and Development Survey, a series of health surveys based on probability-sampled web-based panels and conducted by the U.S. National Center for Health Statistics. Specifically, we evaluate the performance of a variety of ML methods (e.g., regularized regression, tree-based methods, deep learning) for predicting health outcomes in the survey (e.g., body mass index). Our results and experiences may be helpful to others applying ML methods to their own data.


Authors who are presenting talks have a * after their name.

Back to the full JSM 2021 program