All Times EDT

Abstract Details

Activity Number: 76 - Contributed Poster Presentations: Section on Statistics in Epidemiology
Type: Contributed
Date/Time: Monday, August 3, 2020 : 10:00 AM to 2:00 PM
Sponsor: Section on Statistics in Epidemiology
Abstract #313154
Title: Case Contamination in Electronic Health Records-Based Case-Control Studies
Author(s): Jill Schnall* and Lu Wang and Michael Levin and Scott Damrauer and Jinbo Chen
Companies: University of Pennsylvania and University of Pennsylvania and University of Pennsylvania and University of Pennsylvania and University of Pennsylvania
Keywords: EHR; EHR phenotyping; case contamination; bias
Abstract:

One challenge in using electronic health records (EHRs) for research is that an individual's true phenotype status must be derived from the information available in the EHR. Phenotyping algorithms are often used to define cases and controls, but it is difficult to balance the accuracy of phenotype classification against sample size: stricter definitions classify more accurately but leave fewer subjects. We explore an estimating equation (EE) approach that allows for more relaxed phenotype definitions and corrects the bias introduced by case contamination. Our approach relies on drawing a validation subset from a contaminated case pool and training a phenotyping model to distinguish cases from non-cases. Through simulation studies, we assess the performance of the EE method for bias correction, evaluate its robustness to specification of the phenotyping model, and evaluate its performance when the phenotyping model is fit using high-dimensional data methods. Finally, we apply the method to an EHR-based study of dilated cardiomyopathy. We find that our method outperforms other methods used for bias correction and can also perform well when high-dimensional data methods are necessary to fit the phenotyping model.
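
To make the notion of bias from case contamination concrete, the following is a minimal sketch, not the EE method described in the abstract. It assumes a single binary exposure, a known contamination fraction (in practice this might be estimated from a validation subset), and that contaminating non-cases have the same exposure prevalence as controls. All function names and numeric values are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: shows how non-cases mixed into a case pool pull the
# odds ratio toward 1, and how a simple mixture correction undoes it when the
# contamination fraction is known. This is NOT the abstract's EE approach.

def odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)


def odds_ratio(p_case, p_control):
    """Odds ratio comparing exposure prevalence in cases versus controls."""
    return odds(p_case) / odds(p_control)


def contaminated_case_prevalence(p_true_case, p_noncase, contamination):
    """Exposure prevalence in a case pool where a fraction are really non-cases."""
    return (1.0 - contamination) * p_true_case + contamination * p_noncase


def corrected_case_prevalence(p_nominal_case, p_noncase, contamination):
    """Invert the mixture to recover exposure prevalence among true cases."""
    return (p_nominal_case - contamination * p_noncase) / (1.0 - contamination)


# Hypothetical values, assumed for illustration.
p_true_case = 0.40    # exposure prevalence among true cases
p_noncase = 0.20      # exposure prevalence among non-cases (controls)
contamination = 0.25  # fraction of the nominal case pool that are non-cases

true_or = odds_ratio(p_true_case, p_noncase)
p_nominal = contaminated_case_prevalence(p_true_case, p_noncase, contamination)
naive_or = odds_ratio(p_nominal, p_noncase)
corrected_or = odds_ratio(
    corrected_case_prevalence(p_nominal, p_noncase, contamination), p_noncase
)

print(f"true OR:      {true_or:.3f}")
print(f"naive OR:     {naive_or:.3f}")
print(f"corrected OR: {corrected_or:.3f}")
```

Under these assumptions, the naive odds ratio falls between 1 and the true odds ratio, which is the attenuation that a bias-correction method needs to remove.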


Authors who are presenting talks have a * after their name.

Back to the full JSM 2020 program