Abstract:
|
Distinct models of the same data should yield different predictions. So should the same model applied to distinct sets of genetic data. We have developed a program that uses simple kappa (K), weighted kappa (Kw), and extensions thereof to assess subject-by-subject agreement and to reveal pairwise discordant predictions. We used 142 independent subjects from the GAW18 WGS data set to predict smoking status, systolic blood pressure (SBP), and diastolic blood pressure (DBP) using two nested models. Results for smoking showed high agreement between nested models (K = 0.93 for dA, K = 0.83 for dB), poor agreement with true values (K = 0.15 for dA and K = 0.19 for dB, with 78% of pairs agreeing), and modest agreement between distinct datasets (K = 0.31 for model1 and K = 0.33 for model2, with 91% of pairs agreeing). Results for DBP and SBP were similar. K statistics discriminate more sharply than the percentage of pairs that agree. The general form of K that we have encoded accounts for clustering of subjects and repeated measures by extending from simple regression to mixed models with random effects. We offer a data-driven definition of agreement for continuous outcomes. Forming a global K for composite outcomes creates overall comparison diagnostics.
|
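
The contrast between K and the percentage of agreeing pairs can be illustrated with a minimal sketch of simple and weighted kappa for two sets of per-subject predictions. This is not the program described in the abstract, which extends K to mixed models with random effects; the function name `cohen_kappa`, its `weights` argument, and the toy predictions are assumptions made for illustration only and are not GAW18 data.

```python
from collections import Counter


def cohen_kappa(a, b, weights=None):
    """Cohen's kappa for two equal-length sequences of category labels.

    weights=None gives simple kappa; "linear" or "quadratic" gives
    weighted kappa, with categories ordered by sorted label value.
    (Hypothetical helper, not the authors' implementation.)
    """
    if weights not in (None, "linear", "quadratic"):
        raise ValueError("weights must be None, 'linear' or 'quadratic'")
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("need two non-empty sequences of equal length")

    cats = sorted(set(a) | set(b))
    k = len(cats)
    idx = {c: i for i, c in enumerate(cats)}
    n = len(a)

    # Observed joint counts and the marginal counts of each prediction set.
    joint = Counter((idx[x], idx[y]) for x, y in zip(a, b))
    row = Counter(idx[x] for x in a)
    col = Counter(idx[y] for y in b)

    def disagreement(i, j):
        # 0/1 penalty for simple kappa, distance-based penalty otherwise.
        if weights is None:
            return 0.0 if i == j else 1.0
        d = abs(i - j) / (k - 1) if k > 1 else 0.0
        return d if weights == "linear" else d * d

    observed = sum(disagreement(i, j) * c / n for (i, j), c in joint.items())
    expected = sum(
        disagreement(i, j) * row[i] * col[j] / (n * n)
        for i in range(k)
        for j in range(k)
    )
    if expected == 0:
        return 1.0  # both sequences use one and the same category
    return 1.0 - observed / expected


# Toy binary predictions for 100 subjects (made up, not study data):
# 85 pairs agree on 0, 5 pairs agree on 1, and 10 pairs disagree.
pred_a = [0] * 85 + [0] * 5 + [1] * 5 + [1] * 5
pred_b = [0] * 85 + [1] * 5 + [0] * 5 + [1] * 5

pct_agree = sum(x == y for x, y in zip(pred_a, pred_b)) / len(pred_a)
kappa = cohen_kappa(pred_a, pred_b)
print(f"percent agreement = {pct_agree:.2f}")  # 0.90
print(f"kappa = {kappa:.3f}")                  # 0.444
```

In this toy case 90% of pairs agree, yet K is only about 0.44, because most of the agreement is what two predictors with a 90/10 split would reach by chance. For ordered outcomes with more than two categories, passing `weights="linear"` or `weights="quadratic"` penalizes distant disagreements more than adjacent ones.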