Online Program

Thursday, May 17
Data Science
Big Data Analytics Using R and Spark
Thu, May 17, 1:30 PM - 3:00 PM
Grand Ballroom G
 

Data Science at Scale With R and Sparklyr: Architecture, Ecosystem, and Current Developments (304516)

*Kevin Kuo, RStudio

Keywords: big-data, spark, distributed-computing, tidyverse, machine-learning, rstats

The sparklyr package, which provides an R interface to Apache Spark, has democratized big data analytics for R users. In this session, we first give an overview of the technical architecture that makes the package extensible and allows new features to be implemented. We then showcase a diverse range of use cases, including graph analysis and natural language processing. Finally, we report on active development by the community and preview upcoming enhancements.