All Times EDT

Abstract Details

Activity Number: 171 - SPAAC Poster Competition
Type: Topic Contributed
Date/Time: Tuesday, August 4, 2020 : 10:00 AM to 2:00 PM
Sponsor: Survey Research Methods Section
Abstract #312755
Title: Connecting Population-Level AUC and Latent Scale-Invariant R-Square via Semiparametric Gaussian Copula and Rank Correlations
Author(s): Debangan Dey* and Vadim Zipunnikov
Companies: Johns Hopkins Bloomberg School of Public Health and Johns Hopkins University, Bloomberg School of Public Health
Keywords: Classification; AUC; Rank Statistics; Variance explained; Copula; Complex Surveys
Abstract:

We employ the Semiparametric Gaussian Copula (SGC) to model the joint dependence between an observed binary outcome and an observed continuous predictor via the correlation of latent standard normal random variables. Under SGC, we show how both the population-level AUC and the latent scale-invariant R-square, defined as the squared latent correlation, can be estimated using any of four rank statistics calculated on binary-continuous pairs: the Wilcoxon rank-sum statistic and the Kendall's Tau, Spearman, and Quadrant rank correlations. We then focus on three implications and applications: (i) we explicitly show that, under SGC, the population-level AUC and the population-level latent R-square are related via a monotone function that depends on the population-level prevalence rate; (ii) we propose the Quadrant correlation as a robust semiparametric version of AUC; (iii) we demonstrate how, under complex-survey designs, Wilcoxon rank-sum statistics and the Spearman and Quadrant rank correlations provide estimators of the population-level AUC using only single-participant survey weights. We illustrate these applications using 5-year mortality and continuous predictors from the 2003-2006 National Health and Nutrition Examination Survey.
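
The abstract does not state the monotone function linking AUC and the latent correlation, so the following is only a sketch of one way such a link can be written, under assumptions that are ours rather than the authors'. We assume the binary outcome is Y = 1{Z1 > delta} and the predictor is a strictly increasing transform of Z2, where (Z1, Z2) is standard bivariate normal with correlation r (so the latent R-square is r squared). We then combine two standard facts: for a binary-continuous pair, Kendall's tau-a equals 2p(1-p)(2·AUC - 1) with prevalence p, and under a latent Gaussian model tau equals 4·Phi2(delta, 0; r/sqrt(2)) - 2·Phi(delta). The function names `bvn_cdf` and `auc_from_latent_r` are hypothetical.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm


def bvn_cdf(a, b, rho):
    """P(Z1 <= a, Z2 <= b) for a standard bivariate normal, |rho| < 1.

    Computed by integrating the conditional normal CDF of Z2 given Z1 = z
    against the standard normal density of Z1.
    """
    s = np.sqrt(1.0 - rho**2)
    integrand = lambda z: norm.pdf(z) * norm.cdf((b - rho * z) / s)
    value, _ = quad(integrand, -np.inf, a)
    return value


def auc_from_latent_r(r, prevalence):
    """Population-level AUC implied by latent correlation r and prevalence.

    Assumption (ours): Y = 1{Z1 > delta} with P(Y = 1) = prevalence.
    """
    delta = norm.ppf(1.0 - prevalence)  # latent threshold
    # Kendall's tau-a for the binary-continuous pair under the latent model
    tau = 4.0 * bvn_cdf(delta, 0.0, r / np.sqrt(2.0)) - 2.0 * norm.cdf(delta)
    # Invert tau = 2 p (1 - p) (2 AUC - 1)
    return 0.5 + tau / (4.0 * prevalence * (1.0 - prevalence))


if __name__ == "__main__":
    for r in (0.0, 0.3, 0.6, 0.9):
        print(r, round(auc_from_latent_r(r, prevalence=0.2), 4))
```

Under these assumptions, r = 0 gives an AUC of 0.5, the AUC increases with r for a fixed prevalence, and the same r maps to different AUC values at different prevalences, which is consistent with the dependence on prevalence described above.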


Authors who are presenting talks have a * after their name.

Back to the full JSM 2020 program