Online Program

All Times EDT

Thursday, June 4
Software & Data Science Technologies
Data Science Using R
Thu, Jun 4, 10:00 AM - 11:35 AM
TBD

Training Large Deep Learning Models Using Spark, TensorFlow, and R (308337)

Presentation

*Javier Luraschi, RStudio 

In recent years, state-of-the-art deep learning models have required increasingly large hardware budgets to train. For instance, OpenAI's GPT model required hundreds of GPUs, and other groundbreaking systems like AlphaGo, AlphaZero, OpenAI's Dota 2 1v1 bot, and AlphaStar also required significant training resources to optimize over their complex problem spaces. Combining deep learning with distributed computing used to require teams of engineers to maintain all the infrastructure necessary to train these models at scale; without such large teams and investments, training those models was simply inaccessible to many other institutions.

However, recent developments in Spark 2.x (Project Hydrogen specifically) introduce new execution modes, like barrier execution, that unify large-scale data processing with large-compute workloads. Combined with proper GPU scheduling, this enables data scientists and machine learning engineers to run comparable workflows with a fraction of the complexity. In addition, new features in sparklyr expose these capabilities from R and simplify integration with deep learning frameworks.
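As a rough sketch of the workflow described above (a sketch only: it assumes a running Spark 2.4+ cluster and that the installed sparklyr version supports the `barrier` argument to `spark_apply()`, which was added to surface Project Hydrogen's barrier execution mode):

```r
library(sparklyr)

# Connect to Spark; "local" is used here only as a stand-in for a
# real cluster master URL.
sc <- spark_connect(master = "local", version = "2.4")

# A tiny dataset split into 4 partitions; in a real job each partition
# would hold one worker's shard of the training data.
sdf <- sdf_len(sc, 4, repartition = 4)

# barrier = TRUE requests barrier execution, so all tasks are scheduled
# together -- the gang-scheduling model that distributed deep learning
# frameworks such as TensorFlow rely on.
result <- spark_apply(
  sdf,
  function(df) {
    # In a real job, each barrier task would initialize a TensorFlow
    # worker here and train on its shard of the data.
    data.frame(rows_seen = nrow(df))
  },
  barrier = TRUE
)

sdf_collect(result)
```

The point of the barrier mode is that either all tasks start or none do, so deep learning workers that must rendezvous with each other never deadlock waiting for a peer that Spark has not yet scheduled.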

Therefore, this talk will present R the friendliest framework to train large-scale deep learning models in TensorFlow and Spark through the sparklyr package. In addition, we will discuss infrastructure building blocks like Docker and Kubernetes which can help you set up clusters with a fraction of the effort that was required just a couple years ago. We will also present previous and recent developments in distributed computing and deep learning relevant to R users.