Abstract:
|
When analyzed appropriately, a study design in which a biomarker is measured in pooled samples from multiple individuals can yield valid estimates of individual-level regression parameters, often while reducing costs and/or improving efficiency. We consider maximum likelihood estimation for linear and logistic regression with a continuous covariate that is measured in pools and subject to measurement and/or processing error. We assume (1) a linear model with homoscedastic normal errors for the biomarker given other covariates; (2) normal additive measurement errors and, in pooled measurements, normal additive processing errors irrespective of pool size; and (3) that the two error types are independent. When both error types are present, a hybrid design with multiple pool sizes is sufficient to estimate the regression coefficients consistently. However, a small number of replicate individual measurements substantially improves stability. Using motivating data from the Collaborative Perinatal Project, we apply the proposed approach to assess whether monocyte chemotactic protein 1 is associated with the log-odds of spontaneous abortion, controlling for race and smoking status.
|