Abstract:
|
It has long been known, and is amply documented in the literature, that multicollinearity in a dataset can, and often does, impair one's ability to determine which model predictors actually contribute to the variation in the observed response (Montgomery, Peck, & Vining, 2001; Pedhazur, 1982). There are also some indications that multicollinearity does not, or at least may not, affect one's ability to accurately predict the value of the response variable for a specific set of observations on the predictors (Kutner, Nachtsheim, & Neter, 2004; Weiss, 2012). This idea, although seemingly logical, is not widely discussed in regression textbooks, nor is there an abundance of research literature that supports it. The purpose of this study was to examine this relationship, or lack thereof, across situations that vary in the number of predictors, the strength of the association between the predictors and the response, the sample size, and the level of multicollinearity among the predictors.
|