Managing Data Science in the Age of GDPR
As AI adoption has grown, so too have concerns about data protection and infrastructure security across the MLOps lifecycle. At GTS Data Processing, a rapidly growing German IT company, security is top of mind as they deliver Infrastructure-as-a-Service and Software-as-a-Service platforms to companies across Europe. GTS' DSready Cloud offering, powered by Domino® and hosted in Germany, brings together the tools, technologies, compute, and collaboration capabilities its clients need to deliver and manage data science capabilities at scale—all within a GDPR-compliant environment that supports Germany’s stringent security standards.
By embedding Domino within its ecosystem, GTS can offer its customers a full suite of capabilities to accelerate data science research, boost collaboration, and speed deployment so they can increase model velocity and achieve impact and scale faster. For example, using Domino, GTS enables its clients to rapidly provision compute resources as they develop models, more easily reproduce, reuse, and share work, and deploy models in a fraction of the time, in some cases within minutes.
It’s common for organizations to feel tension between the desire to empower fast-paced innovation and the need for governance and security. This tension—and, in particular, the demand for governance and security—often slows organizations’ path to the cloud. Lutz Kirchner, GTS founder, recalls one instance where a German IT executive talked about transporting his company’s data via armored truck from one corporate site to another to ensure there were no data leaks. There was no mode of electronic data transfer deemed safe enough—a concern that is especially pervasive in Germany.
Such stories inspired GTS' DSready Cloud solution—a fast, efficient, and security-rich ecosystem for data science. The ecosystem combines the tools, libraries, workspaces, notebooks, data, and hardware resources for more rapidly and securely developing data science capabilities in the cloud.
As Kirchner explained, “Germany has some of the toughest data protection regulations in the world, and most companies here are in their early days with data science and AI. They need a platform to help them build data science capabilities at scale while removing some of the risks of cloud adoption. Additionally, since the COVID-19 pandemic began, many corporate budgets have been cut, making access to secure cloud solutions that offer consumption-based pricing increasingly important for companies of all sizes—not just in Germany. By meeting the strict standards here at home, we can fulfill an unmet need and build trust across Europe and beyond.”
An essential element of GTS' DSready Cloud offering is a platform built for data science teams. The platform makes it easy for data scientists to access the compute resources, tools, and data they need, while also improving collaboration with stakeholders across the development lifecycle and automating reproducibility of work. This reproducibility streamlines governance processes and increases model velocity to fuel innovation.
Kirchner and Diego Volkenandt, the company’s head of Product Management, evaluated platforms from Domino, Kubeflow, H2O.ai, Anaconda, and Dataiku before choosing Domino.
“We fell in love when we first tested Domino. None of the other tools matched Domino’s comprehensive Enterprise MLOps capabilities and security features, nor were the other tools as easy to use and intuitive. With Domino, data scientists, IT experts, and business users don’t have to go through a long and time-consuming learning process to use and collaborate on the platform.”
—Diego Volkenandt, Head of Product Management at GTS
Four capabilities, in particular, stood out during Volkenandt’s and Kirchner’s testing.
- Openness. Domino supports a broad ecosystem of tools, which enables data scientists to use their preferred IDEs, languages, and packages—including both open-source development tools and commercial tools like SAS, along with legacy homegrown tools.
- Self-service infrastructure. Domino offers simplified, self-service access to compute resources, orchestrating all the services necessary for accessing and managing CPU and GPU clusters when training models.
“Data scientists don’t have to configure anything. They can simply select the number of GPUs, RAM, and other parameters, and the platform provisions it in minutes.”
—Lutz Kirchner, Founder of GTS
- Robust collaboration and governance. Domino automatically tracks all work and progress over time so that data science teams can easily reproduce, reuse, and share work. “Reproducibility is a significant issue for many companies, especially in industries like healthcare, pharmaceutical, and financial services,” said Volkenandt. “During audits, these companies need to show how a model makes the recommendations and predictions it does, and how all parts fit together. Domino does an excellent job at this, enabling companies to compare the differences between models and share work among different data science teams. Additionally, having this enables data science teams to continually build on their knowledge and develop new solutions faster.”
- Enterprise-grade security. Domino includes strong capabilities for authentication, enterprise application monitoring, and security reporting that support Germany’s stringent legal and regulatory requirements.
“Using the DSready Cloud with Domino enables companies to retain full control of their data and gain the highest levels of access protections and encryption.”
—Lutz Kirchner, Founder of GTS
(To learn more about DSready Cloud security features, read the blog “It’s Cloudy in Germany, with an Opportunity for Fast, Efficient and Safe Data Science.”)
The Domino Effect
GTS developed its DSready Cloud with a simple goal: help companies reduce the time and cost of scaling data science while providing a safe and secure environment hosted in Germany to facilitate cloud adoption.
To help clients save on costs, Kirchner is fanatical about ensuring companies aren’t paying for compute resources that they don’t need. “Many times, organizations are left to figure out cloud management themselves,” he explained. “We are very hands-on, both during set-up to make sure companies have the right infrastructure based on their expected needs, and on an ongoing basis, monitoring hardware usage to see if they can switch to a less expensive hardware tier.”
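The kind of right-sizing check Kirchner describes can be illustrated with a minimal sketch. It assumes a hypothetical list of hardware tiers and made-up peak-usage figures; the tier names, prices, and the `cheapest_sufficient_tier` helper are illustrative assumptions, not part of Domino or the DSready Cloud.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HardwareTier:
    # Hypothetical tier description; names and prices are illustrative only.
    name: str
    cpus: int
    ram_gb: int
    hourly_cost: float


def cheapest_sufficient_tier(tiers, peak_cpus, peak_ram_gb, headroom=1.2):
    """Return the lowest-cost tier covering observed peak usage plus headroom.

    Returns None when no tier is large enough.
    """
    needed_cpus = peak_cpus * headroom
    needed_ram = peak_ram_gb * headroom
    candidates = [
        t for t in tiers
        if t.cpus >= needed_cpus and t.ram_gb >= needed_ram
    ]
    return min(candidates, key=lambda t: t.hourly_cost, default=None)


# Assumed example tiers, not real pricing.
TIERS = [
    HardwareTier("small", cpus=4, ram_gb=16, hourly_cost=0.40),
    HardwareTier("medium", cpus=8, ram_gb=32, hourly_cost=0.80),
    HardwareTier("large", cpus=16, ram_gb=64, hourly_cost=1.60),
]

current = TIERS[2]  # the team currently runs on the "large" tier
recommended = cheapest_sufficient_tier(TIERS, peak_cpus=3.0, peak_ram_gb=12.0)
if recommended is not None and recommended.hourly_cost < current.hourly_cost:
    print(f"Consider moving from {current.name} to {recommended.name}")
```

The `headroom` factor is a design choice: it keeps a margin above the observed peak so that a downgrade does not leave workloads starved during a busier week.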
Additionally, building its DSready Cloud on the Domino Enterprise MLOps platform is foundational in helping companies save time and accelerate the data science lifecycle so they can bring new data science solutions to market faster.
This includes faster development of models by streamlining R&D processes and enabling data science, business, and IT teams to seamlessly collaborate throughout each project’s lifecycle.
“Domino helps remove the wall that often exists between the business, data science, and IT to bring them closer together, which enables them to solve business problems faster and more effectively.”
—Diego Volkenandt, Head of Product Management at GTS
“For example, data scientists can in minutes create web applications and forms so a business user can see the progress they’re making, test the model, and provide input early,” said Volkenandt.
It also includes faster deployment of models. Domino’s automatic reproducibility capabilities streamline governance processes, and the platform’s publication capabilities rapidly turn models into APIs or export them to Docker images and CI/CD pipelines.
“Companies can move from a testing and development environment to production very quickly, in some cases in minutes, with Domino.”
—Lutz Kirchner, Founder of GTS
Ultimately, for Kirchner and Volkenandt, it all adds up to greater innovation. With Domino, they know they have a robust platform that allows data science teams to spend their time on solving business challenges rather than on overcoming technical challenges.