Abstract:
|
We employ the Semiparametric Gaussian Copula (SGC) to model the joint dependence between an observed binary outcome and an observed continuous predictor via the correlation of latent standard normal random variables. Under SGC, we show how both the population-level AUC and the latent scale-invariant R-square, defined as the squared latent correlation, can be estimated using any of four rank statistics calculated on binary-continuous pairs: the Wilcoxon rank-sum statistic and the Kendall's Tau, Spearman, and Quadrant rank correlations. We then focus on three implications and applications: i) we explicitly show that, under SGC, the population-level AUC and the population-level latent R-square are related via a monotone function that depends on the population-level prevalence rate; ii) we propose the Quadrant rank correlation as a robust semiparametric version of AUC; iii) we demonstrate how, under complex survey designs, the Wilcoxon rank-sum statistic and the Spearman and Quadrant rank correlations provide estimators of the population-level AUC using only single-participant survey weights. We illustrate these applications using 5-year mortality and continuous predictors from the 2003-2006 National Health and Nutrition Examination Survey.
|