A Comparison of Variable Importance Measures for Patient-Reported Outcomes
Bolanle M. Dansu, Department of Statistics, University of Agriculture, Abeokuta, Nigeria
Lisa M. Lix, School of Public Health, University of Saskatchewan 
*Tolulope T. Sajobi, School of Public Health, University of Saskatchewan 

Keywords: patient-reported outcomes, discriminant analysis, logistic regression, multivariate analysis of variance, variable importance

Background: Descriptive discriminant analysis, logistic regression analysis, and stepwise multivariate analysis of variance procedures have been used to describe the relative importance of a set of correlated variables for discriminating between two independent groups. Clinical studies of patient-reported outcomes, such as health-related quality of life, can use these procedures to identify the importance of outcomes for discriminating between treatment and control groups. The purpose of this study was to compare six measures of variable importance for rank ordering a set of correlated variables.
Methods: The investigated measures were standardized discriminant function coefficients (SDFCs), discriminant ratio coefficients (DRCs), total discriminant ratio coefficients (TDRCs), F-to-remove statistics (FTR), standardized logistic regression coefficients (SLRCs), and Pratt’s index for logistic regression (LPI). The measures were investigated using Monte Carlo techniques. The manipulated conditions were sample size, number of outcomes, degree and pattern of group mean separation, magnitude of correlation, covariance structure, degree of covariance heterogeneity, and population distribution. Kendall’s concordance statistic (KCS) and the percentages of any-variable, all-variable, and average per-variable correct rankings were used to evaluate the procedures.
Results: The KCS decreased as the magnitude of correlation among the variables increased, but the decrease was smallest for LPI when the variables had a compound symmetric covariance structure. Percentages of all-variable and per-variable correct rankings decreased as the separation between group means increased, but the percent change was smallest for DRCs and LPIs. For each correlation structure, the percentages of all-variable and per-variable correct rankings for DRCs and LPIs were largest when data were sampled from a normal distribution and smallest when data were sampled from a heavy-tailed distribution.
Conclusions: The DRCs and LPIs showed the highest percentages of correct rankings among the investigated measures. The investigated measures can be used to prioritize patient-reported outcomes for clinical investigations or to develop parsimonious models for further statistical analyses.
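
The evaluation criteria named in the Methods can be illustrated with a minimal sketch. The abstract does not give the exact definitions used in the study, so the following are assumptions: Kendall's concordance statistic is computed with the standard untied-ranks formula, "any-variable" counts a replication as correct if at least one variable receives its true rank, "all-variable" requires every variable to receive its true rank, and "per-variable" averages the hit rate over variables. The function names `kendalls_w` and `correct_ranking_rates` are hypothetical and chosen only for illustration.

```python
import numpy as np


def kendalls_w(ranks):
    """Kendall's coefficient of concordance for an (m, n) array of untied ranks.

    Each of the m rows ranks the same n variables from 1 to n.
    Returns a value in [0, 1]; 1 means all rows agree exactly.
    """
    ranks = np.asarray(ranks, dtype=float)
    m, n = ranks.shape
    rank_sums = ranks.sum(axis=0)
    # Sum of squared deviations of the column rank sums from their mean.
    s = np.sum((rank_sums - rank_sums.mean()) ** 2)
    return 12.0 * s / (m ** 2 * (n ** 3 - n))


def correct_ranking_rates(estimated, true):
    """Percentages of correct rankings over simulated replications.

    estimated: (replications, variables) array of estimated ranks.
    true: length-variables array of true ranks.
    The three definitions below are assumptions, not taken from the study.
    """
    hits = np.asarray(estimated) == np.asarray(true)
    return {
        "any_variable": 100.0 * hits.any(axis=1).mean(),
        "all_variable": 100.0 * hits.all(axis=1).mean(),
        "per_variable": 100.0 * hits.mean(),
    }


# Hypothetical example: three replications ranking three outcomes.
true_ranks = [1, 2, 3]
estimated_ranks = [[1, 2, 3], [1, 3, 2], [2, 3, 1]]
rates = correct_ranking_rates(estimated_ranks, true_ranks)
w = kendalls_w(estimated_ranks)
```

In a simulation of this kind, each row of `estimated_ranks` would come from ranking one importance measure (for example, the absolute SDFCs) within a single Monte Carlo replication; the values shown here are made up for illustration only.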