Keywords: Apache Spark, Data Science, Platform, Notebooks
Apache Spark was designed as a unified engine that supports diverse workloads, such as SQL, graph processing, iterative machine learning, streaming, and batch data processing. Although building one general-purpose engine rather than several specialized ones may seem counterintuitive, it offers some unique benefits. Most importantly, applications can combine workloads in ways that are not possible with specialized engines. However, any data practitioner will tell you that a powerful engine does not make a car. Data science is a team sport involving diverse personalities: engineers, statisticians, analysts, and managers. To function effectively, these teams require data and model management, version control, access control, resource management, security and user management, collaboration, and many more features. A unified analytics platform brings all of these together.