All Times EDT

Abstract Details

Activity Number: 246 - Data Science
Type: Contributed
Date/Time: Wednesday, August 11, 2021 : 10:00 AM to 11:50 AM
Sponsor: Section on Statistical Computing
Abstract #318053
Title: From Research to Deployment and Back: A Computational Framework for Reproducibility and Replicability from Industry
Author(s): Sergiy O Nesterko*
Companies: Fidelity Investments
Keywords: Replicability; Reproducibility; Computation; Data Science; Production; Deployment
Abstract:

Fueled by the abundance of computational power and data sources, data science is becoming better defined and is gaining wider adoption (Meng, 2020; Jordan, 2019). Industry data science teams thus have a growing platform for incorporating academic research results into the models we put in production, by which we mean models that are repeatedly (or continuously) executed and maintained and that aim to derive value from data.

In industry, one of the most common reasons for "taking a deployed model offline" is to research whether it replicates the achievement of its intended goals. In this talk, we introduce a computational framework for model development and deployment, centered around a directed acyclic graph (DAG) architecture of data processing and modeling tasks that exchange outputs with one another. The framework enables more seamless iteration between the model deployment and research phases because it improves the computational reproducibility of the model in development.
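
As a rough illustration of the kind of DAG architecture described above, the following minimal Python sketch runs tasks in dependency order, passes each task the outputs of its upstream tasks, and fingerprints the results so that two runs can be compared. It is not the framework presented in the talk: the names `run_dag`, `fingerprint`, and the three-task `load`/`clean`/`fit` pipeline are hypothetical and chosen only for illustration, and the sketch assumes task outputs are JSON-serializable.

```python
import hashlib
import json
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_dag(tasks, deps):
    """Run tasks in dependency order; each task receives its upstream outputs."""
    outputs = {}
    # static_order() yields each node only after all of its predecessors.
    for name in TopologicalSorter(deps).static_order():
        upstream = {d: outputs[d] for d in deps.get(name, ())}
        outputs[name] = tasks[name](upstream)
    return outputs


def fingerprint(outputs):
    """Hash all task outputs so two runs can be compared for reproducibility."""
    payload = json.dumps(outputs, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Hypothetical three-task pipeline: load -> clean -> fit.
tasks = {
    "load": lambda up: [1.0, 2.0, None, 4.0],
    "clean": lambda up: [x for x in up["load"] if x is not None],
    "fit": lambda up: {"mean": sum(up["clean"]) / len(up["clean"])},
}
# Each key maps a task to the set of tasks whose outputs it consumes.
deps = {"load": set(), "clean": {"load"}, "fit": {"clean"}}

first = run_dag(tasks, deps)
second = run_dag(tasks, deps)

# Identical fingerprints indicate the pipeline reproduced its outputs exactly.
reproducible = fingerprint(first) == fingerprint(second)
```

Because every task reads only from its declared upstream outputs, the same graph can in principle be executed in a research setting and in a deployed setting, and the fingerprints compared across the two.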

Finally, we discuss the computational replicability considerations that arise when using this DAG framework in industry.


Authors who are presenting talks have a * after their name.

Back to the full JSM 2021 program