Abstract:
|
Data scientists stand at the forefront of efforts to promote reproducible research. Grounded in both statistics and computer science, data science pairs the statistician's concern for reproducibility with sophisticated computer science tools for version control and internal checks, a combination that can yield highly reproducible work. Yet challenges to reproducibility remain: data accessibility, computing constraints, the often rapid-fire pace of data science work, and conflicting incentives from collaborators and stakeholders. In this roundtable, attendees will discuss what reproducible and replicable research mean in their own work; consider whether reproducibility is being achieved; survey the broader state of reproducible research in data science; and identify impediments to and supports for advancing reproducible data science research. We will share best practices, preferred tools, and educational resources for making our data science work more reproducible.
|