Applying a Logistic Regression Model to Predict the Accuracy of Administrative Healthcare Claims in Identifying Patients with Chronic Kidney Disease

*Zongqiang Liao, Blue Cross Blue Shield of Michigan 
Chelsea Wellman, Blue Cross Blue Shield of Michigan 

Keywords: case definition, claims data, accuracy, health care, logistic regression model, predictors, chronic kidney disease, CKD

Background: Quality improvement and cost control for chronic kidney disease (CKD) rely on accurate identification of CKD patients, which could be done economically with a validated claims-based case definition.

Methods: CKD patients from eighteen 2011-2012 Michigan CKD registries (n=9,735) served as the “gold standard”. Cases were identified in Blue Cross Blue Shield of Michigan administrative data, a claims-based case definition was applied, and a logistic regression model was used to estimate the predictors of agreement between the registries and the case definition about the presence of CKD.

Results: CKD case definition sensitivity was 14.4%. The logistic regression model fit well, and estimates for all independent variables were statistically significant. The main predictors were age < 65 vs. >= 65 years (OR=1.94), male vs. female (OR=2.02), primary vs. complementary insurance coverage (OR=1.82), CKD Stages 3A vs. 3B, 4 and 5 (OR=5.17, 15.68 and 16.63, respectively), the interaction between age and insurance coverage (OR=0.42), physician participation in a pilot registry (OR=1.18), and the percentage of billing for diabetes (OR=0.56) and hypertension (OR=0.09).
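
The two kinds of quantities reported here, sensitivity and odds ratios, can be illustrated with a minimal Python sketch. It is not the authors' code. The function names and the counts passed to `sensitivity` are hypothetical, and the combined odds ratio assumes that age, insurance coverage, and their interaction enter the model as 0/1 indicators with patients aged >= 65 on complementary coverage as the reference group, which the abstract does not state.

```python
import math


def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Share of gold-standard (registry) cases that the claims definition also flags."""
    return true_positives / (true_positives + false_negatives)


def log_odds_coefficient(odds_ratio: float) -> float:
    """Logistic regression coefficient that corresponds to a reported odds ratio."""
    return math.log(odds_ratio)


def combined_odds_ratio(*odds_ratios: float) -> float:
    """Coefficients add on the log-odds scale, so odds ratios multiply."""
    return math.exp(sum(math.log(value) for value in odds_ratios))


# Hypothetical counts chosen only to show the arithmetic (not the study data):
# 144 registry patients flagged by claims, 856 missed.
example_sensitivity = sensitivity(true_positives=144, false_negatives=856)

# Odds ratios as reported in the abstract.
OR_AGE_UNDER_65 = 1.94
OR_PRIMARY_COVERAGE = 1.82
OR_AGE_BY_COVERAGE = 0.42  # interaction term

# Assumption: indicator coding with the reference group described above.
# Under that coding, a patient under 65 with primary coverage carries both
# main effects and the interaction.
under_65_primary_vs_reference = combined_odds_ratio(
    OR_AGE_UNDER_65, OR_PRIMARY_COVERAGE, OR_AGE_BY_COVERAGE
)

if __name__ == "__main__":
    print(f"example sensitivity: {example_sensitivity:.1%}")
    print(f"coefficient for OR=1.94: {log_odds_coefficient(OR_AGE_UNDER_65):.3f}")
    print(f"combined OR: {under_65_primary_vs_reference:.2f}")
```

Under these assumptions, an interaction odds ratio below 1 means the age and coverage effects are less than multiplicative when both are present: the combined odds ratio is smaller than the product of the two main effects alone.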

Conclusions: Claims capture a modest proportion of registry cases, but case definition sensitivity could be improved by focusing on particular sample attributes. Diabetes and hypertension billing patterns suggest that more consistent billing of CKD diagnoses may also improve sensitivity.