All Times EDT

Abstract Details

Activity Number: 90 - Dealing with Error-Prone Electronic Health Record Data via Validation Sampling
Type: Invited
Date/Time: Monday, August 8, 2022 : 8:30 AM to 10:20 AM
Sponsor: Biometrics Section
Abstract #320543
Title: Implementing an Optimal Multi-Wave Validation Design in a Multi-National HIV Research Cohort
Author(s): Gustavo Amorim* and Bryan Shepherd
Companies: Vanderbilt University Medical Center and Vanderbilt University Medical Center
Keywords: Design-based estimator; Electronic Medical Records; HIV; Measurement error; Multi-wave study; Validation study

People with HIV (PWH) are at greater risk of developing and dying from Kaposi’s sarcoma (KS), particularly those with low CD4 counts. The uptake of antiretroviral therapy (ART) has decreased the incidence of KS among PWH, and incidence is expected to have decreased further in the Treat-All era. This hypothesis can be evaluated using Electronic Medical Record (EMR) data from sites in East Africa and Latin America, which include nearly 300,000 PWH. However, EMR data are known to be error-prone, and a naïve analysis that ignores measurement error may be biased. To overcome this issue, we designed an optimal sampling strategy that selects 1,000 patient records for validation, in which key variables are validated through extensive chart review. The strategy is optimal in the sense that it minimizes the variance of design-based estimators. It is implemented in two waves; each wave uses cluster sampling and a Neyman allocation procedure to select and validate the records of 500 patients. Both the EMR and validated datasets are then used to estimate KS incidence among PWH. We describe our experience designing, implementing, and estimating KS incidence with this validation study.
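
For readers unfamiliar with Neyman allocation, the following is a minimal sketch of the textbook single-wave form, in which a fixed sample size n is split across strata in proportion to N_h * S_h (stratum size times within-stratum standard deviation). It is not the authors' implementation: the function name, the strata, and the standard deviations are hypothetical, and the sketch ignores the cluster sampling and multi-wave updating described in the abstract.

```python
def neyman_allocation(stratum_sizes, stratum_sds, total_n):
    """Split total_n across strata in proportion to N_h * S_h.

    stratum_sizes: population size N_h of each stratum (hypothetical values)
    stratum_sds:   assumed within-stratum standard deviation S_h
    total_n:       total number of records to validate
    Note: this sketch does not cap allocations at the stratum size.
    """
    weights = [n_h * s_h for n_h, s_h in zip(stratum_sizes, stratum_sds)]
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("at least one stratum needs positive size and SD")

    # Ideal (fractional) allocation for each stratum.
    raw = [total_n * w / total_weight for w in weights]

    # Largest-remainder rounding so the integer allocations sum to total_n.
    alloc = [int(r) for r in raw]
    leftover = total_n - sum(alloc)
    by_fraction = sorted(
        range(len(raw)), key=lambda i: raw[i] - alloc[i], reverse=True
    )
    for i in by_fraction[:leftover]:
        alloc[i] += 1
    return alloc


# Hypothetical example: three strata, 500 records in one wave.
sizes = [1000, 3000, 6000]
sds = [3.0, 2.0, 1.0]
print(neyman_allocation(sizes, sds, 500))  # [100, 200, 200]
```

In this illustration the smallest stratum receives a larger share than proportional allocation would give it (100 rather than 50 records) because its assumed variability is highest; in a multi-wave design, the standard deviations for the second wave would typically be re-estimated from the first wave's validated data.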

Authors who are presenting talks have a * after their name.

Back to the full JSM 2022 program