Online Program

Estimating the Size of Hard-to-Reach Populations using Respondent-Driven Sampling Data

*Mark Stephen Handcock, University of California - Los Angeles 
Krista Jennifer Gile, University of Massachusetts 
Corinne M. Mar, University of Washington 

Keywords: link-tracing, chain referral, survey sampling, model-based, Bayesian statistics

The study of hard-to-reach or otherwise "hidden" populations presents many challenges to existing survey methodologies. These populations are characterized by the difficulty in sampling from them using standard probability methods. Typically, a sampling frame for the target population is not available, and its members are rare or stigmatized in the larger population so that it is prohibitively expensive to contact them through the available frames. Hard-to-reach populations in the US and elsewhere are under-served by current sampling methodologies mainly due to the lack of practical alternatives to address these methodological difficulties.

Most analysis of respondent-driven sampling (RDS) data has focused on estimating aggregate characteristics of the target population, such as disease prevalence. However, RDS is often conducted in settings where the population size is unknown and of great independent interest. In this paper, we present an approach to estimating the size of a target population based on the data collected through RDS. This strategy uses the successive sampling approximation to RDS introduced in Gile (2009) to leverage the information in the ordered sequence of observed personal network sizes. We develop inference within a Bayesian framework, which allows prior knowledge of the population size to be incorporated. We show via a simulation study and application to real data that these approaches also improve estimation of aggregate characteristics based on RDS data.
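
To give some intuition for why the ordered sequence of personal network sizes can carry information about population size, the following is a minimal Python sketch of successive sampling, read here as drawing units without replacement with probability proportional to network size (degree). It is not the estimator described in the abstract. The function names, the uniform degree distribution on 1 to 30, and the population and sample sizes are all illustrative assumptions.

```python
import random


def successive_sample(degrees, n, rng):
    """Draw n units without replacement; each draw is proportional to degree."""
    remaining = list(degrees)
    sample = []
    for _ in range(n):
        # Choose one remaining unit with probability proportional to its degree.
        idx = rng.choices(range(len(remaining)), weights=remaining, k=1)[0]
        sample.append(remaining.pop(idx))
    return sample


def mean_early_minus_late(population_size, n=150, reps=100, seed=0):
    """Average (over simulated populations) of the mean degree in the first
    half of the sample minus the mean degree in the second half."""
    rng = random.Random(seed)
    half = n // 2
    total = 0.0
    for _ in range(reps):
        # Hypothetical degree distribution: uniform on 1..30 (an assumption).
        degrees = [rng.randint(1, 30) for _ in range(population_size)]
        s = successive_sample(degrees, n, rng)
        total += sum(s[:half]) / half - sum(s[half:]) / (n - half)
    return total / reps


# Same sample size, very different population sizes.
small = mean_early_minus_late(population_size=200)
large = mean_early_minus_late(population_size=5000)
print(f"early-minus-late mean degree, N=200:  {small:.2f}")
print(f"early-minus-late mean degree, N=5000: {large:.2f}")
```

Under these assumptions, high-degree units tend to be drawn early, so when the sample is a large share of the population the observed degrees decline noticeably over the sampling order. When the population is much larger than the sample, the decline is slight. That contrast is the kind of signal a model-based approach could exploit.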

ASA Meetings Department · 732 North Washington Street, Alexandria, VA 22314 · (703) 684-1221

Copyright © American Statistical Association.