The Pros and Cons of Spark in a Modern Enterprise Analytics Stack

Whitepaper

Spark is a distributed computing framework whose popularity for data engineering and analytics use cases has grown rapidly over the last several years. This paper gives a brief overview of Spark’s strengths and weaknesses in the context of data science and machine learning workflows.

While Spark is extremely effective for certain types of workloads on very large datasets, it has drawbacks, including performance overhead for some workloads, onerous setup and management, and competition from more modern distributed computing frameworks. Enterprises should understand the pros and cons of Spark so they can implement an analytics technology strategy that incorporates Spark for projects that can benefit from it and supports alternative options when its complexity is unnecessary or even detrimental to the business.
