Online Program

All Times EDT

Thursday, June 4
Practice and Applications
Practice and Applications 5
Thu, Jun 4, 11:40 AM - 12:45 PM
TBD
 

A Paradigm for Managing Computational Reproducibility in a Changing Software Package Landscape (308383)

Heike Hofmann, Iowa State University 
*Kiegan Rice, Iowa State University 

Keywords: reproducibility, data pipelines, R

Achieving computational reproducibility within data science pipelines is a dynamic, shifting task. Package development for data science is proceeding very rapidly in both R and Python, the two main scripting languages for data science. This means that an implemented data pipeline might produce different results because of a change in its underlying dependencies. Focusing on R, we propose a paradigm for managing computational reproducibility that helps users identify not only when a package's functionality has changed, but also whether that change will affect the results of their project code.