Abstract:
|
Recent developments in computing technologies are enabling a wide range of applications, from large-scale genome analysis to real-time predictive maintenance based on streaming sensor data. However, in many organizations it has been difficult for statisticians and data scientists who use R to take advantage of these computing resources because of a disconnect in software engineering skill sets. RStudio, along with the open source R community, has been bridging this gap by providing intuitive interfaces to the Apache Spark ecosystem. In this session, we provide an overview of the sparklyr ecosystem and show how it enables cluster computing applications.
|