Keywords: data analysis, data science, decision making, ecological system, GAM, generalized additive models
Over the last decade, researchers, government agencies and stewardship groups have collected and stored a remarkable volume of environmental and ecological data. The corresponding challenge is to capitalize on these resources for the benefit of both stewards and stakeholders. These trends highlight data science as a critical component of the future of data-driven environmental management. Most critical are models of how data scientists can collaborate with policy makers and stewards to offer tools that leverage data and facilitate decisions. Our project aims to show how a successful collaboration between a management group, the Susquehanna River Basin Commission (SRBC), and an academic group of data scientists produced such clarifying insight. The mandate of SRBC is to manage stakeholder requirements while sustaining a healthy ecosystem. The challenge was to differentiate signal events in water quality measurement data from the noisy dynamics of a monitored complex system, in a manner that could be applied to other ecosystems. By applying generalized additive models (GAMs), we were able to clarify the relationship between environmental dynamics and two critical biological communities (macroinvertebrates and fish) that live within the watershed. The GAMs were sensitive enough to distinguish signal from noise, and flexible enough to operate across the spatial extent of the ecosystem. By identifying signal events, environmental stewards and policy makers will be able to define thresholds that need to be monitored to reduce pollution and raise diversity in the ecosystem.