Online Program

Thursday, May 30
Data Science Technologies
Shared Infrastructure for Data Science
Thu, May 30, 4:00 PM - 5:35 PM
Regency Ballroom AB
 

The Machine Learning Lifecycle with MLflow (305033)

*Siddharth Murching, Databricks, Inc. 

Keywords: Machine Learning, ML, AI, Infrastructure, Workflow, Productionize, Serving, Scoring, Training, Metrics, Model, Deployment, MLflow

ML development brings many new complexities beyond the traditional software development lifecycle. Unlike traditional software developers, ML developers want to try multiple algorithms, tools, and parameters to get the best results, and they need to track this information to reproduce their work. In addition, developers need to use many distinct systems to productionize models. To address these problems, many companies are building custom “ML platforms” that automate this lifecycle, but even these platforms are limited to a few supported algorithms and to each company’s internal infrastructure. In this session, we introduce MLflow, an open-source project from Databricks that aims to design an open ML platform where organizations can use any ML library and development tool of their choice to reliably build and share ML applications. MLflow introduces simple abstractions to package reproducible projects, track results, and encapsulate models that can be used with many existing tools, accelerating the ML lifecycle for organizations of any size.