Data Quality and Measurement Error
Alternative Ways of Thinking About and Measuring Validity and Reliability in Surveys

Implementing a Test-Retest Strategy to Evaluate the Quality of a Translated Survey Instrument (303162)

Robert P Agans, The University of North Carolina 
Marcella H Boynton, The University of North Carolina 
*Quirina Vallejos, The University of North Carolina 

Keywords: Spanish, translations, reliability, validity

The dramatic rise of the Latino population in the United States has led policy-makers to focus on this important but understudied segment of the population. One of the chief barriers to conducting health-policy-relevant research with the Latino population, however, is the scarcity of well-validated Spanish-language instruments that allow direct comparability with other racial/ethnic groups. Though several techniques are available for translating existing questionnaires into other languages, many studies fail to adequately test the reliability of those translations in the field. In this study, a test-retest strategy was implemented to assess the reliability of an English instrument translated into Spanish. Specifically, over a 14-day test-retest period, 45 bilinguals completed both English and Spanish versions of a national telephone survey measuring public awareness, knowledge, risk perceptions, and use of tobacco products. Our analysis compared not only the means and standard deviations of within-subject scores, but also the coefficient of correlation (i.e., the scores on all parallel tests should be correlated with one another to exactly the same degree, and that correlation should equal the ratio of the variance of true scores to the variance of observed scores) and the reliability coefficient (i.e., the scores on all parallel tests should be correlated to exactly the same degree with the scores on any other variable). If the analysis satisfies these criteria, the two versions can be treated as parallel forms. To the degree that it does not, the target form is less equivalent to the source form and the resulting subgroup comparisons will be suspect. Such assessments, we argue, are necessary to establish the reliability of translated instruments and are critical if we are to produce meaningful ethnic estimates and bi-cultural comparisons when multi-language survey instruments are used in the field.
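As a rough illustration of the within-subject comparison described above, the following Python sketch summarizes paired scores from two language versions of one item: the mean and standard deviation of each form, the mean within-subject difference, and the Pearson correlation between forms. It is not the authors' analysis code. The function names (`pearson_r`, `compare_forms`) and the six score pairs are hypothetical and chosen only to show the shape of the calculation.

```python
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def compare_forms(english, spanish):
    """Summarize paired scores from the same respondents on two forms.

    Each position in `english` and `spanish` is assumed to belong to
    the same bilingual respondent.
    """
    if len(english) != len(spanish):
        raise ValueError("Each respondent needs a score on both forms.")
    diffs = [s - e for e, s in zip(english, spanish)]
    return {
        "mean_english": mean(english),
        "mean_spanish": mean(spanish),
        "sd_english": stdev(english),
        "sd_spanish": stdev(spanish),
        "mean_diff": mean(diffs),          # within-subject shift between forms
        "r": pearson_r(english, spanish),  # cross-form correlation
    }


# Hypothetical item scores for six respondents (illustration only,
# not data from the study).
english = [3, 4, 2, 5, 4, 3]
spanish = [3, 4, 3, 5, 4, 2]

summary = compare_forms(english, spanish)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Similar means and standard deviations together with a high cross-form correlation would be consistent with the two versions behaving as parallel forms. The second criterion in the abstract, equal correlation with any other variable, could be checked by calling `pearson_r` for each form against a third set of scores and comparing the two results.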