Abstract Details

Activity Number: 72 - Semiparametric Modeling
Type: Contributed
Date/Time: Sunday, July 28, 2019 : 4:00 PM to 5:50 PM
Sponsor: Biometrics Section
Abstract #306554 Presentation
Title: Semiparametric Maximum Likelihood for Logistic Regression with Misclassified Response and Covariate Measurement Error
Author(s): Sarah Lotspeich* and Bryan E Shepherd and Pamela Shaw and Ran Tao
Companies: Vanderbilt University and Vanderbilt University School of Medicine and University of Pennsylvania and Vanderbilt University Medical Center
Keywords: Measurement error; Semiparametric; Misclassification; Logistic regression; EHR; Maximum likelihood

Modern electronic health records systems routinely collect variables of clinical interest. However, both responses and predictors can be recorded with error, and these errors can be correlated. A cost-effective alternative to a complete data audit is the two-phase design. In Phase I, error-prone variables are observed for all subjects; this information is then used to select a validation subsample in Phase II. Previous corrections are limited to misclassified binary predictors, make distributional assumptions about the error mechanisms, or rely on a validation subsample that is a simple or stratified random sample. We propose a semiparametric approach to two-phase designs with a misclassified binary outcome and error-prone predictors, allowing for dependent errors and arbitrary second-phase selection. We devise a computationally efficient and numerically stable EM algorithm to maximize the nonparametric likelihood function. The resulting estimators possess desirable statistical properties. We compare the performance of the proposed method with that of existing approaches through extensive simulation studies and illustrate its use in an observational HIV study.
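
To make the data structure of a two-phase design concrete, the following is a minimal simulation sketch in Python. It is not the authors' method or code: it only generates data of the kind the abstract describes and does not implement the semiparametric estimator or the EM algorithm. The function name, the logistic coefficients (-1.0, 0.5), the 10% misclassification rate, the error variances, and the outcome-balanced Phase II selection rule are all illustrative assumptions.

```python
import math
import random


def simulate_two_phase(n=1000, n_validate=200, seed=0):
    """Simulate a two-phase design (hypothetical parameters throughout)."""
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        # True covariate and true binary outcome from a logistic model.
        x = rng.gauss(0.0, 1.0)
        p = 1.0 / (1.0 + math.exp(-(-1.0 + 0.5 * x)))
        y = 1 if rng.random() < p else 0

        # Error-prone versions. The covariate error is shifted whenever the
        # outcome is misclassified, so the two errors are dependent.
        flipped = rng.random() < 0.10
        y_star = 1 - y if flipped else y
        x_star = x + rng.gauss(0.0, 0.5) + (0.5 if flipped else 0.0)

        records.append({"x": x, "y": y, "x_star": x_star,
                        "y_star": y_star, "validated": False})

    # Phase II: choose the validation subsample using Phase I data only.
    # Assumed rule: balance the subsample on the error-prone outcome.
    per_stratum = n_validate // 2
    for level in (0, 1):
        stratum = [i for i, r in enumerate(records) if r["y_star"] == level]
        for i in rng.sample(stratum, min(per_stratum, len(stratum))):
            records[i]["validated"] = True

    # True values are observed only for validated subjects.
    for r in records:
        if not r["validated"]:
            r["x"] = None
            r["y"] = None
    return records


if __name__ == "__main__":
    data = simulate_two_phase()
    print(sum(r["validated"] for r in data), "of", len(data), "subjects validated")
```

Every subject carries the error-prone pair (x_star, y_star), while the true pair (x, y) is available only in the validation subsample. The balanced selection rule used here is just one possible Phase II design; the abstract states that the proposed approach allows arbitrary second-phase selection.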

Authors who are presenting talks have a * after their name.

Back to the full JSM 2019 program