The homophily/design effect relationship in respondent-driven sampling data – results from the National HIV Behavioral Surveillance System among injecting drug users
Cyprian Wejnert, Centers for Disease Control and Prevention
Keywords: Respondent-driven sampling, injecting drug users, HIV
Respondent-driven sampling (RDS) is used to study injecting drug users (IDU). Recent work suggests RDS design effects (DEs) may be large. RDS DEs vary across variables and have been associated with homophily. We analyzed data from 43 RDS studies of IDU (n=21,676) conducted in over 20 U.S. cities during two cycles of CDC’s National HIV Behavioral Surveillance System (NHBS). Using RDSAT software, we calculated population estimates, confidence intervals (CIs), and homophily for gender, race, age, HIV status, and syringe sharing. DE was calculated by comparing RDSAT CIs to those expected under simple random sampling (SRS). We used these data to explore the association between RDS DEs and homophily. Homophily and DEs varied by analysis variable; most DEs were between 2 and 4, and DEs were highest for high-homophily variables. DE was exponentially correlated with homophily (R^2=0.48). When homophily=0 (the ideal case), DE=2.6. Current practice assumes DE=2, which may lead to underpowered samples; researchers may instead use DE=4 when calculating power. Formative research should identify variables of high homophily. In extreme cases, separate RDS samples within homophilous groups can be used.
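The CI comparison and the sample-size implication described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure: it assumes a symmetric 95% CI under a normal approximation (RDSAT intervals are bootstrap-based and need not be symmetric), takes the SRS variance of a proportion as p(1-p)/n, and uses hypothetical function names (`design_effect`, `required_sample_size`) and made-up input values.

```python
import math

Z95 = 1.96  # normal critical value for a 95% CI (assumption: normal approximation)


def srs_variance(p, n):
    """Variance of a proportion p under simple random sampling of size n."""
    return p * (1.0 - p) / n


def design_effect(p, ci_lower, ci_upper, n, z=Z95):
    """Ratio of the variance implied by an observed CI to the SRS variance.

    Assumes the CI is symmetric, so its half-width equals z * standard error.
    """
    rds_se = (ci_upper - ci_lower) / (2.0 * z)
    return rds_se ** 2 / srs_variance(p, n)


def required_sample_size(p, margin, de, z=Z95):
    """SRS sample size for a given margin of error, inflated by the design effect."""
    n_srs = z ** 2 * p * (1.0 - p) / margin ** 2
    return math.ceil(de * n_srs)


# Hypothetical example: an estimate of 0.5 from n=400 with CI (0.402, 0.598)
de = design_effect(0.5, 0.402, 0.598, 400)

# Sample sizes for a +/-5 percentage point margin under DE=2 versus DE=4
n_de2 = required_sample_size(0.5, 0.05, 2)
n_de4 = required_sample_size(0.5, 0.05, 4)
```

In this illustration the interval is twice as wide as the SRS interval, so the design effect is 4, and planning with DE=4 rather than DE=2 roughly doubles the required sample size for the same precision.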