411 – Case Control and Case Cohort Studies
Decomposition, Gradient, and Reduction Colinearity in High-Dimension Data from a Case-Control Study
Yuanzhang Li
WRAIR
Tianqing Liu
Walter Reed Army Institute of Research
David Niebuhr
Walter Reed Army Institute of Research
The estimated effects of individual predictors in a multiple regression model may be biased by multicollinearity. Multicollinearity often occurs in longitudinal studies, especially when the objective is to study the association between disease and biomarkers. In this study, we proposed a space decomposition to group potential biomarkers according to their association and used the gradient-nuisance vector approach to reduce the number of biomarkers included in the high-dimensional regression. This method dramatically reduced the collinearity among the biomarkers. We applied the approach to a US military case-control data set to evaluate the association of biomarkers with the risk of schizophrenia. The linear correlation among biomarkers was as high as 0.8; after decomposition, the correlation coefficients among the vectors were less than 0.3. The predictive power of the model as a whole was not reduced. The proposed approach can help investigators select biomarkers for identifying populations at high risk for diseases, including schizophrenia, and can be extended to other case-control studies. This work was funded by the Walter Reed Army Institute of Research, Independent Laboratory In-House Research Program W8XWH-11-C-0082.
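The abstract does not specify the space decomposition or the gradient-nuisance vector construction, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes synthetic data (two hypothetical latent factors driving six simulated biomarkers) and substitutes an ordinary principal component projection as a stand-in decomposition, to illustrate how pairwise correlation among predictors can be measured before and after replacing them with derived vectors. The function names `max_offdiag_corr` and `orthogonal_components` are illustrative.

```python
import numpy as np


def max_offdiag_corr(X):
    """Largest absolute pairwise correlation among the columns of X."""
    C = np.corrcoef(X, rowvar=False)
    off = C - np.diag(np.diag(C))  # zero out the diagonal of ones
    return float(np.max(np.abs(off)))


def orthogonal_components(X, k):
    """Project standardized columns of X onto the top-k principal directions.

    Assumption: PCA is used here only as a stand-in for the unspecified
    space decomposition described in the abstract.
    """
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ Vt[:k].T


# Synthetic data (an assumption, not the study data): six simulated
# biomarkers in two groups, each group driven by one shared latent factor.
rng = np.random.default_rng(0)
n = 500
latent = rng.normal(size=(n, 2))
loadings = np.array(
    [[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]], dtype=float
)
X = latent @ loadings.T + 0.5 * rng.normal(size=(n, 6))

before = max_offdiag_corr(X)
after = max_offdiag_corr(orthogonal_components(X, 3))
print(f"max |r| among raw biomarkers:   {before:.3f}")
print(f"max |r| among derived vectors:  {after:.3f}")
```

In this sketch the derived vectors are uncorrelated in-sample by construction, which is a stronger property than the below-0.3 correlation reported in the abstract; a grouping-based decomposition would typically retain some residual correlation in exchange for vectors that are easier to interpret.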