Abstract Details

Activity Number: 190 - Contributed Poster Presentations: Section on Statistics and the Environment
Type: Contributed
Date/Time: Monday, July 29, 2019 : 10:30 AM to 12:20 PM
Sponsor: Section on Statistics and the Environment
Abstract #307354
Title: Random Forest Models for the Probable Biological Condition of Streams and Rivers in the USA
Author(s): Eric Fox*
Companies: Cal State East Bay, Department of Statistics
Keywords: National Rivers and Streams Assessment; Random Forests; StreamCat; Spatial Prediction

The National Rivers and Streams Assessment (NRSA) is a probability-based survey conducted by the US Environmental Protection Agency. It provides information on the ecological condition of the rivers and streams in the conterminous USA, and the extent to which they support healthy biological condition. An important problem is the prediction of stream condition at new, unsampled locations. Using random forests (Breiman, 2001), we develop a model to predict the probability that a stream is in good (or, conversely, poor) biological condition. The model is fit to categorical response data consisting of 1365 NRSA survey sites and their designation as being in good or poor condition according to an aquatic health index. The predictor data consist of 212 landscape features from the EPA's Stream-Catchment Dataset (Hill et al., 2015). The out-of-bag performance of the random forest classifier is evaluated with classification rates, the area under the curve, and other graphical summaries. We find that the random forest model performs remarkably well according to these metrics. We also address issues with variable selection and model stability.
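
To illustrate what out-of-bag evaluation involves, here is a minimal, standard-library-only sketch. It is not the author's model or data. The data are synthetic (200 made-up sites with 5 made-up features), one-split decision stumps stand in for full trees to keep the example short, and the names fit_stump, oob_probabilities, and auc are hypothetical helpers introduced only for this illustration. Each site's predicted probability of "good" condition is averaged over only those stumps whose bootstrap sample did not contain that site.

```python
import random


def fit_stump(X, y, rows, features):
    """Return (feature, threshold, p_left, p_right) minimising squared error on rows."""
    best = None
    for j in features:
        for t in sorted({X[i][j] for i in rows}):
            left = [y[i] for i in rows if X[i][j] <= t]
            right = [y[i] for i in rows if X[i][j] > t]
            p_left = sum(left) / len(left)  # leaf estimate of P(good)
            p_right = sum(right) / len(right) if right else p_left
            err = (sum((v - p_left) ** 2 for v in left)
                   + sum((v - p_right) ** 2 for v in right))
            if best is None or err < best[0]:
                best = (err, j, t, p_left, p_right)
    return best[1:]


def oob_probabilities(X, y, n_trees=50, n_try=2, seed=0):
    """Average each site's predicted P(good) over stumps that did not train on it."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    total, count = [0.0] * n, [0] * n
    for _ in range(n_trees):
        boot = [rng.randrange(n) for _ in range(n)]  # bootstrap sample of sites
        in_bag = set(boot)
        feats = rng.sample(range(p), n_try)          # random feature subset
        j, t, p_left, p_right = fit_stump(X, y, boot, feats)
        for i in range(n):
            if i not in in_bag:                      # out-of-bag sites only
                total[i] += p_left if X[i][j] <= t else p_right
                count[i] += 1
    return [total[i] / count[i] if count[i] else None for i in range(n)]


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count one half)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((sp > sn) + 0.5 * (sp == sn) for sp in pos for sn in neg)
    return wins / (len(pos) * len(neg))


# Synthetic stand-in data: 1 = "good" condition, 0 = "poor" (assumed coding).
rng = random.Random(1)
X = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(200)]
y = [1 if row[0] + 0.5 * row[1] + rng.gauss(0, 0.5) > 0 else 0 for row in X]

probs = oob_probabilities(X, y)
scored = [(p, l) for p, l in zip(probs, y) if p is not None]
oob_auc = auc([p for p, _ in scored], [l for _, l in scored])
# Classification rate at an assumed 0.5 probability cutoff.
oob_accuracy = sum((p >= 0.5) == (l == 1) for p, l in scored) / len(scored)
print(f"OOB AUC: {oob_auc:.3f}  OOB classification rate: {oob_accuracy:.3f}")
```

Because every prediction comes from stumps that never saw the site in question, these summaries estimate performance at unsampled locations without a separate hold-out set. In practice an established random forest implementation would replace the stump ensemble.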

Authors who are presenting talks have a * after their name.

Back to the full JSM 2019 program