Keywords: predictive modeling, random forest, principal component analysis, variable selection, water chemistry, salinity, specific conductivity
Salt concentrations in streams (measured as specific conductivity [SC] in µS/cm) are essential for assessing the aquatic condition of our national river systems. SC values > 3000 µS/cm indicate salt pollution, which degrades environmental conditions and infrastructure. Our two objectives were to (1) evaluate how changes in watershed SC are related to natural landscape features and human disturbance across the contiguous U.S. using an empirical model and (2) assess the effects of drought on SC. We calculated human alteration of SC by subtracting previously modeled estimates of naturally occurring SC from monthly SC observations (n = 1,082,181) from January 2000 to December 2015. Each observed alteration was then matched to 131 predictors characterizing upstream natural and human environmental factors. We modeled the association between alteration and environment using a random forest model (ntrees = 500) in R. Because of limited computational power, we used only 10% of the processed data (n = 68,513, p = 131, chosen by a spatially stratified random sample) in our initial model. We used a principal component analysis (PCA) to reduce the number of variables retained in the final model. The PCA axes were rotated using the varimax rotation, and the variables with the highest loading and the strongest univariate relationship on each axis were then selected. We then examined partial dependence plots and selected the strongest predictors for the model (p = 43). The chosen predictive model had reasonable performance (R^2 = 0.664, RMSE = 0.666). We validated this test model using external validation data. We can now map predicted alteration, and we have identified the human activities that had the greatest impact on salinity across the nation. Current limitations lie in computational power, which may be addressed with high-end computers or with R packages optimized for big data. These are the authors' views and do not necessarily represent views or policies of U.S. EPA.
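As a rough illustration of three steps described above (computing alteration, drawing a stratified 10% sample, and picking one variable per rotated PCA axis), a minimal sketch follows. It is written in Python for illustration only, whereas the authors worked in R. The function names, the `region` stratum key, and the example variable names are all hypothetical, and the variable-selection step is simplified to use loadings alone rather than loadings together with univariate relationships.

```python
import random
from collections import defaultdict


def sc_alteration(observed_sc, natural_sc):
    """Alteration = observed SC minus modeled natural SC (both in µS/cm)."""
    return observed_sc - natural_sc


def stratified_sample(records, stratum_key, fraction=0.10, seed=0):
    """Draw about `fraction` of the records from each stratum (at least one).

    `stratum_key` is an assumed field naming the spatial stratum of a record;
    the abstract does not say how the strata were defined.
    """
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for rec in records:
        by_stratum[rec[stratum_key]].append(rec)
    sample = []
    for stratum in sorted(by_stratum):
        group = by_stratum[stratum]
        k = max(1, round(fraction * len(group)))
        sample.extend(rng.sample(group, k))
    return sample


def top_variable_per_axis(loadings):
    """Return, for each rotated axis, the variable with the largest |loading|.

    `loadings` maps a variable name to its list of loadings, one per axis.
    Simplification: the univariate-relationship criterion is not modeled here.
    """
    n_axes = len(next(iter(loadings.values())))
    return [
        max(loadings, key=lambda name: abs(loadings[name][axis]))
        for axis in range(n_axes)
    ]
```

In use, each sampled record would carry its alteration value as the response and its predictors as features, and the per-axis selections would form the candidate list that is then screened further (in the abstract, with partial dependence plots).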