Thursday, November 10
Data Quality and Measurement Error
Thu, Nov 10, 3:30 PM - 4:55 PM
Hibiscus B
Sample Design and Incentive Considerations in Pretesting and Development

Sample Adequacy Criteria: A Quality Assessment of Community Involvement in Core Activities Related to Certification and Maintenance of Certification Examination Development (303524)

*Gerald K. Arnold, PhD, MPH, American Board of Internal Medicine
Robin A. Guille, PhD, American Board of Internal Medicine  
Rebecca S. Lipner, PhD, American Board of Internal Medicine 

Keywords: sample adequacy, representativeness indices, bias, sample validation

We devised a set of sample adequacy criteria (SAC) for assessing the sample quality of physician volunteers in community engagement activities related to developing certification examinations at the American Board of Internal Medicine (ABIM). SAC aids judgments about whether volunteer exam reviewers actually represent the expected views of the eligible physician pool invited to review, or whether additional resources should be put into data collection.

SAC uses three benchmarks for judging whether a sample is fit for use. Benchmark 1 is that the number of volunteers meets or exceeds 220 and that 12 or more volunteers review each section of the exam material. This benchmark serves the power and reliability requirements for the tasks. Benchmark 2 compares sample auxiliary measures to those of the pool of eligible reviewers using standardized differences. Auxiliary variables include geographic region, sex, age, and percentage of practice time in patient care.
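
As an illustration of Benchmark 2, the following is a minimal sketch of one common form of the standardized difference: the difference in means divided by the square root of the average of the two variances. The abstract does not state which variant was used, so this formula, the function name `standardized_difference`, and the age values are all assumptions made for illustration.

```python
import math
from statistics import mean, variance


def standardized_difference(sample, pool):
    """Difference in means divided by the root of the averaged variances.

    This is one common two-group form; the abstract does not say which
    variant the authors used, so treat it as an assumption.
    """
    pooled_sd = math.sqrt((variance(sample) + variance(pool)) / 2)
    return (mean(sample) - mean(pool)) / pooled_sd


# Hypothetical ages (not data from the study).
sample_age = [45, 50, 55, 60, 65]        # volunteers
pool_age = [35, 40, 45, 50, 55, 60, 65]  # eligible reviewers

d = standardized_difference(sample_age, pool_age)
print(f"standardized difference in age: {d:.3f}")
```

An absolute value near zero suggests the volunteers resemble the eligible pool on that variable. A cutoff such as 0.1 is a common convention in balance checking, not a threshold stated in this abstract.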

Benchmark 3 uses sample representativeness indices, including the “representativeness” R index, the “balance” B index, the propensity score coefficient of variation, and propensity score correlations with auxiliary variables. Other metrics come from comparing the volunteer sample with bootstrap samples of nonrespondents of equal size to assess balance among auxiliary variables. If there is sufficient bias, it is corrected by weighting. SAC assumes that if sample estimates of the population auxiliary variables are unbiased, then so are the sample ratings used in the exam activities. This assumption is later verified by examining correlations between the auxiliary variables and the actual exam-related ratings.
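
Two of the Benchmark 3 indices can be sketched briefly. The code below assumes the standard definition of the R index, R = 1 - 2·S(ρ), where S(ρ) is the standard deviation of the estimated response propensities, and takes the coefficient of variation as S(ρ) divided by the mean propensity. The propensity values are hypothetical; in practice they would come from a response model (for example, a logistic regression on the auxiliary variables), which is not shown. The B index and the bootstrap comparison are not sketched because the abstract does not define them.

```python
from statistics import mean, stdev


def r_index(propensities):
    """R index: 1 - 2 * standard deviation of response propensities.

    Values near 1 mean propensities vary little across the pool,
    which is read as a more representative response.
    """
    return 1 - 2 * stdev(propensities)


def propensity_cv(propensities):
    """Coefficient of variation of the response propensities."""
    return stdev(propensities) / mean(propensities)


# Hypothetical estimated propensities for three eligible physicians.
rho = [0.2, 0.3, 0.4]

r = r_index(rho)
cv = propensity_cv(rho)
print(f"R index: {r:.3f}, propensity CV: {cv:.3f}")
```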

SAC is used to make one of three determinations: (1) use the sample without weighting, (2) use the sample with weighting, or (3) collect further data because the sample is not adequate. SAC has been used in volunteer reviews of seven Maintenance of Certification (MOC) exams for medical subspecialties. Exemplar SAC results from these exams and validating information for the process are presented.