Abstract:
|
Many investigators are now interested in combining biomarkers to predict an outcome of interest or detect underlying disease. Many biomarker studies pool data from multiple centers, which can increase power and improve generalizability but also complicates this endeavor. Depending on the relationships among center, the biomarkers, and the target of prediction, care must be taken when developing and evaluating biomarker combinations. We introduce a taxonomy to describe the role of center and consider how the biomarker combination should be estimated and evaluated. We show that ignoring center, as clinical researchers frequently do, is often not appropriate. The limited statistical literature proposes using random intercept logistic models, but we have found that this approach is often inadequate or misleading. One solution is fixed effects logistic regression, which leads to appropriate estimates in a variety of situations. After developing the biomarker combination, we recommend evaluating it with performance measures that account for the multicenter nature of the data, namely the center-adjusted area under the receiver operating characteristic curve.
|