By Stig Pedersen, Head of Machine Learning, Topdanmark
Editor’s note: This is part of a series of articles sharing best practices from companies developing an enterprise data science strategy. Some articles will include information about their use of Domino.
As Denmark’s second-largest insurer, we aim to raise the standard for how insurance works for consumers. We want to give consumers a better, faster, and more satisfying insurance experience—whether we’re making policy decisions or processing claims—and we use machine learning and AI to do this.
When we launched our Machine Learning Center of Excellence in 2017, we conducted proofs of concept to understand how we could apply these technologies. We proved we could automate nearly half of one selected process using natural language processing on documents. Since most of our processes involve documents, the potential became clear to us. We also proved that we could apply image analysis (i.e., computer vision) so our underwriting and claims experts could apply their expertise where it's most needed.
But to integrate models into our operational workflows and put our first models into production running in real-time by 2018, we had to overcome both technical and non-technical challenges. Below are some of the areas we focused our discussion on during the NVIDIA GTC session. (You can listen to the entire discussion here.)
Ensuring version control and reproducibility
When we first launched our efforts, we immediately realized that without a data science platform, we would have models everywhere on individual machines, making it difficult to scale our work. One of our early steps was working with Deloitte consultants to identify a platform that could help us control versions, reproduce results, access resources, and separate test and production environments. We identified Domino’s Enterprise MLOps platform as ideal for code-first data scientists to accelerate research and scale. For example, our teams can now easily spin up NVIDIA GPUs in the cloud, using Domino to choose the size and power they need so they can quickly test out new ideas. It enables our data scientists to focus on what they’re good at—mathematical modeling and data analysis—using whatever tool or method on the data they choose, in a secure environment where we can document the work and ensure we’re delivering impact.
There’s often a discussion within data science departments on whether to build or buy a data science platform. We did not see building a platform as a good option. Building from scratch can take a long time, and you need an enormous number of resources to maintain and keep adding new capabilities. With Domino, we benefit from their global experience, getting access to new features that we might not have thought to ask for but that can benefit us tremendously.
Investing in institutional knowledge
In building out our Center of Excellence for Machine Learning, we also quickly realized that we needed to build a community of experts and make it easy for them to collaborate. We worked with a leading university to create a masterclass for our team on advanced methods in language and image analysis. We've invested in several skill profiles—practitioners with domain expertise skilled at interpreting data sources, experts with a deep understanding (Ph.D. level) of advanced techniques, and software specialists. Our masterclass ensured that everyone had a strong understanding of natural language processing and image analysis—both of which are critical to our success. Today, we continue to broaden our expertise, exchanging knowledge with communities across Europe that have the same skill level and face the same types of challenges with natural language processing and image analysis.
In addition, we made collaboration a core requirement when we chose the Domino platform. By making it easy for our data scientists to share and build off each other’s ideas, we’ve created a community where team members continually inspire each other to push our capabilities forward.
Managing model drift
We focus on exposing algorithms in real time, which is technically much more demanding than running batch models. As a result, we need to be confident we can maintain the accuracy and quality of the algorithms and prove the value of those automated decisions. This is a challenge we must solve with both skilled people and advanced model monitoring technology. For example, we have an algorithm that saves us millions each year. Before we used Domino to monitor model drift, we had two data scientists digging through data sources for over three months to identify that one data set feeding the algorithm wasn't formatted correctly and was causing model drift. Today, we can see immediately if we have model drift and where to look, saving us significant time.
Some people think you can leave this work to IT as part of traditional ITOps processes. But, typically, IT doesn’t have the knowledge of how changes in data sources might impact an algorithm. I believe real-time model monitoring capabilities within an Enterprise MLOps platform are crucial if you’re going to show you’re delivering the value you expected and resolve any issues quickly.
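To make the idea concrete, here is a minimal sketch of one common way such monitoring flags a data-source change like the mis-formatted data set described above: comparing the live distribution of a feature against its training-time reference using the Population Stability Index (PSI). This is an illustrative example only, not Topdanmark's or Domino's actual implementation; the thresholds shown are a widely used rule of thumb.

```python
import numpy as np

def psi(reference, live, bins=10):
    """Population Stability Index between a reference and a live sample of one feature."""
    # Bin edges come from the reference (training-time) distribution.
    edges = np.quantile(reference, np.linspace(0, 1, bins + 1))
    # Clip both samples into the reference range so every value lands in a bin.
    ref_counts, _ = np.histogram(np.clip(reference, edges[0], edges[-1]), edges)
    live_counts, _ = np.histogram(np.clip(live, edges[0], edges[-1]), edges)
    ref_pct = ref_counts / len(reference)
    live_pct = live_counts / len(live)
    # Floor the proportions to avoid log(0).
    ref_pct = np.clip(ref_pct, 1e-6, None)
    live_pct = np.clip(live_pct, 1e-6, None)
    return float(np.sum((live_pct - ref_pct) * np.log(live_pct / ref_pct)))

rng = np.random.default_rng(0)
reference = rng.normal(0, 1, 10_000)  # feature values seen at training time
stable = rng.normal(0, 1, 10_000)     # live data, same distribution
shifted = rng.normal(0.5, 1, 10_000)  # live data after a source change

# Rule of thumb: PSI < 0.1 stable, 0.1-0.25 worth investigating, > 0.25 drift.
print(f"stable:  {psi(reference, stable):.3f}")
print(f"shifted: {psi(reference, shifted):.3f}")
```

Run per feature on a schedule, a check like this points directly at which input changed, which is the "where to look" signal described above.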
Balancing risk and innovation
As we scale our work, we must ensure we consider potential risks. Of course, we put guardrails in place to ensure that our use of data meets all regulatory and compliance guidelines. But beyond this, we also worked with our legal team to formulate policies around what our data scientists can do with data and machine learning. Our overriding question is always: Could we stand before our customers and investors and defend our work as reasonable?
We’ve come a long way in the past four years. By investing in our data science community and removing the process and technology hurdles in their way, we have made tremendous gains in innovating our business. For example, we can now immediately approve 25% of our motor claims. In less than two seconds, we can also approve up to 65% of our underwriting for one of our most risk-volatile insurance products for personal customers. And we can keep experimenting and iterating to push the performance of our algorithms while working with our business leaders to deliver more solutions that can transform our customers’ experience.
Read the Topdanmark case study, Giving Homeowners Answers about Insurance Coverage in Seconds with Model-Driven Policy Approvals.