Abstract:
|
We consider a situation in which rich historical data from a large study are available in the form of the coefficients and their standard errors for an established regression model describing Pr(Y = 1|X). We would like to use this summary information to improve estimation and prediction in an expanded model of interest, Y|X, B. The additional variable B is a new biomarker, measured on a small number of subjects in a new dataset. We develop and evaluate several approaches for translating the external information into constraints on the regression coefficients in Y|X, B. Borrowing from the measurement error literature, we establish an approximate relationship between the regression coefficients in the models Pr(Y = 1|X, β), Pr(Y = 1|X, B, γ) and E(B|X, θ) for a Gaussian distribution of B. For binary B we propose an alternative expression. Simulation results comparing these methods indicate that historical information on Y|X can improve the efficiency of estimation and enhance the predictive power of the model Y|X, B. We illustrate our methodology by enhancing the High-grade Prostate Cancer Prevention Trial Risk Calculator with two biomarkers: prostate cancer antigen 3 and TMPRSS2:ERG.
|