By David Bloch, Data Science Evangelist, Domino on July 07, 2020 in Perspective
Building models requires a lot of time and effort. Data scientists can spend weeks just trying to find, capture and transform data into decent features for models, not to mention many cycles of training, tuning, and tweaking models so they’re performant.
Yet despite all this hard work, few models ever make it into production: VentureBeat AI concluded that just 13% of data science projects do. In terms of delivering value to the business, Gartner predicts that only 20% of analytics projects will deliver business outcomes that improve performance.
What’s going on?
There is no single reason that data science projects fail at such a high rate. Some attribute the problem to a shortage of data science professionals or to the usual management bottlenecks, such as streamlining access to technology or infrastructure, or getting models into production.
Certainly, organizations need to hire expert data science professionals. These professionals need to be backed by data science platforms and technologies that empower them to do what they do best, which is to explore, experiment and solve business challenges. For example, implementing a data science platform to act as the system of record for development efforts helps create clarity on the technical process of developing models and deploying them into a production state.
Beyond these, however, our experience working with large enterprise customers across industries shows that many data science projects fail to deliver value because, at the highest level, Data Science and the business simply aren’t connected. This disconnect shows up as a variety of challenges:
- Data science projects start without business support or a key stakeholder to act as the domain expert during model development.
- Organizations struggle to translate business challenges into solvable data science problems. Often, data science teams emphasize finding novel insights rather than pinpointing how a business process can be improved. This means many projects get stuck in endless research and experimentation.
- The question being asked cannot be sufficiently answered with the available data, or the cost of extracting data from systems is too high due to a lack of connectivity across applications.
- There is a lack of clarity between Data Science and IT teams on how to deploy models into production. Without a feasible, repeatable method for deploying model code, models can languish on the shelf while business opportunity passes.
- Businesses often struggle to put effective change management in place to adopt and act on the findings from models.
Boosting your odds of success
To avoid these missteps and ensure projects have a higher chance of success, data scientists should start each new project by answering three critical questions:
- Do we have a business problem with a clear path to value?
- Is the problem feasible for us to solve?
- Can the business make the necessary changes resulting from data science insights?
It’s important to note that answering these questions often requires an exploration phase during which data scientists work with business stakeholders to assess the problem, profile the data available, and build a rough idea of their approach to solve the problem.
While this initial phase is crucial, we find many projects get stuck here. A good rule of thumb is:
If you’re unable to answer these questions with business stakeholders within two to four weeks, or at least to identify what is required to answer them, the odds of a successful outcome diminish.
Let’s dig into each question more deeply.
Do we have a business problem with a clear path to value?
Data Science creates value by providing an evidence-based approach to decision making. Decisions made based on model results should ultimately reduce cost or increase revenue.
At the outset, data science teams should seek to clearly define the business problem as well as the path to generating value with business stakeholders. A concise goal (“I want to save costs by optimizing my store staffing roster”) is much more likely to lead to a successful initiative than a broad statement (“I want to increase revenue in my retail stores”).
Problem statements should ultimately:
- Include a problem definition.
- Identify a driver of cost or revenue within the business.
- Identify a source of variability in that driver.
- Identify metrics that measure this variability.
- Identify a clear path to creating quantifiable value.
Let’s walk through the retail example of optimizing staff headcount in stores.
Problem Definition:
- I want to save costs by optimizing staffing rosters in my store without sacrificing customer experience.
Driver of Cost or Revenue:
- Reduce staffing costs.
Source of Variability:
- The total number of customers.
- The total number of staff.
Metrics:
- Foot traffic (footfall) monitoring of total customer count in store based on time series
- Historical staffing rosters
- Seasonality / marketing campaign impacts
Clear Path to Value:
- By predicting the number of customers in the store, we can organize our staffing roster to sustain our ability to service those customers during peak periods while also reducing staffing costs during off-peak periods.
Through this process, we now have a concise business problem with a clear path to value that we can solve.
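To make the path to value concrete, here is a minimal Python sketch of the staffing example. It is an illustration only, not a method described in this article: the footfall figures, the service capacity per staff member, the minimum staffing floor, and the hourly wage are all hypothetical assumptions, and the "forecast" is simply the historical mean for each hour rather than a real time series model.

```python
import math
from collections import defaultdict

# Hypothetical footfall history: (hour_of_day, customers_in_store).
history = [
    (9, 12), (9, 18), (12, 55), (12, 65),
    (15, 30), (15, 26), (18, 44), (18, 52),
]

CUSTOMERS_PER_STAFF = 10  # assumed customers one staff member can serve per hour
MIN_STAFF = 2             # assumed floor so the store is never understaffed
HOURLY_WAGE = 20.0        # assumed cost of one staff hour


def forecast_by_hour(rows):
    """Naive forecast: the historical mean customer count for each hour."""
    counts = defaultdict(list)
    for hour, customers in rows:
        counts[hour].append(customers)
    return {hour: sum(v) / len(v) for hour, v in sorted(counts.items())}


def roster_from_forecast(forecast):
    """Staff each hour to cover forecast demand, never below MIN_STAFF."""
    return {
        hour: max(MIN_STAFF, math.ceil(customers / CUSTOMERS_PER_STAFF))
        for hour, customers in forecast.items()
    }


forecast = forecast_by_hour(history)
roster = roster_from_forecast(forecast)

# Baseline for comparison: a flat roster staffed for the busiest hour all day.
flat_staff = max(roster.values())
flat_cost = flat_staff * len(roster) * HOURLY_WAGE
optimized_cost = sum(roster.values()) * HOURLY_WAGE

for hour, staff in roster.items():
    print(f"{hour:02d}:00  forecast={forecast[hour]:.0f}  staff={staff}")
print(f"flat roster cost: {flat_cost:.2f}, optimized: {optimized_cost:.2f}")
```

With these made-up numbers, the demand-based roster costs less than the flat one while still covering the peak hour, which is the quantifiable value the problem statement points to. A real project would replace the per-hour mean with a forecast that accounts for seasonality and marketing campaigns.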
Is the problem feasible for us to solve?
Many factors determine whether the problem identified in the problem statement can actually be solved.
Some of those factors include:
- Do we have enough data available, and is it accurate enough? If the data isn’t high enough in quality or you can’t get consistent access to it, chances are the project is dead on arrival. Even if you do have the data, if you don’t understand what the data means or a subject matter expert isn’t available to assist you, any insight being generated won’t be understood well enough to turn into a repeatable solution to the problem.
- Do we have the technical capabilities to produce the model?
- Do we have business support that can be involved in our approach to solving the problem? If the business isn’t willing to provide resources to the data science team as they work on the problem, the likelihood that your model will be adopted and implemented diminishes. Gaining business support through the development of the model significantly increases the chances of uptake once you deploy it.
- Are there any potential issues that may be critical failure points in deploying a model? For example, running afoul of regulatory standards or company values is of course a deal breaker, as can be less obvious issues such as using customer data in ways that are permissible but may lead to negative public perception and reputation damage.
Can the business make the necessary change?
Model adoption is a critical challenge in implementation.
You may have identified scenarios where the business can save money or increase revenue. Still, if the company doesn’t implement any of those scenarios through change management programs, the model is meaningless.
In many failed projects, communication between subject matter experts and data scientists stops once the problem statement is approved.
Fostering two-way transparent communication between data scientists and the decision-maker responsible for the value driver is crucial. Sharing insights early and often helps the business understand the actions they need to take to leverage the model output.
Assessing whether a business unit is able to make a change often comes down to working with business analysts and solution architects to understand their business processes and the technology implications of changing them.
Data Science transforms the way business operates by creating evidence-based insights that improve actions and decisions.
Increasing the success rate of data science projects requires a partnership between data science teams and decision-makers to ensure that models are appropriate and can be adopted.
By standardizing these questions before development starts, data science teams can build a repeatable approach that identifies value drivers for potential business problems, gain business support during the development of their models, and work with the business to increase the chances that their model results are adopted.