The Pros and Cons of Spark in a Modern Enterprise Analytics Stack

Whitepaper

Spark is a distributed computing framework that has skyrocketed in popularity over the last several years for data engineering and analytics use cases. This paper provides a brief overview of Spark’s strengths and weaknesses in the context of data science and machine learning workflows.

While Spark is extremely effective for certain types of workloads on very large datasets, it has drawbacks: performance overhead for some workloads, onerous setup and management, and competition from more modern distributed computing frameworks. Enterprises that understand these trade-offs can build an analytics technology strategy that uses Spark for the projects that benefit from it and supports alternatives when its complexity is unnecessary or even detrimental to the business.

