by Josh Poduska, Chief Data Scientist, Domino on December 15, 2021 in Perspective
Algorithms may be the toast of today’s high-performance technology races, but sometimes proponents forget that, like cars, models also need a regular tune-up. A highly visible and catastrophic AI model failure recently shamed Zillow, the online real estate company that was forced to shutter its home-buying business.
As reported in The Wall Street Journal and other sources, Zillow’s application of its real-estate algorithm to house flipping flopped ingloriously and came to an early end. The company’s shares plunged 25% as it announced a quarterly loss of $328 million and a reduction of its workforce by 25% (about 2,000 people) due to closing its “Zillow Offers” service. CEO Rich Barton told investors, “We've been unable to accurately forecast future home prices at different times in both directions by much more than we modeled as possible.”
Planning for Model Risk
While there is no public consensus yet on why Zillow’s model did not work as planned, this blog post is not about Zillow, per se. Our topic is the lesson that all of us who rely on data science should take to heart: never assume a production model is “done”; something can always go wrong!
Even the best-performing model will eventually degrade for a variety of reasons: changes to products or policies can affect how customers behave; adversarial actors can adapt their behavior; data pipelines can break; and sometimes the world simply evolves. Any of these factors can lead to data drift or concept drift, which can result in a drop in predictive accuracy.
To meet such challenges, a model-driven business must adopt a model monitoring policy designed to continuously improve model accuracy. Here are four ways that model monitoring can help you fix bad algorithms.
1: Retrain the Model
If a model has drifted, you can improve its accuracy by retraining it with fresher data that is more representative of the prediction data, along with the associated ground truth labels. In cases where ground truth data is not yet available, the training data set can instead be curated to mimic the distribution of prediction data, thereby reducing drift.
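One way to curate a training set so that it mimics the distribution of prediction data is to resample it with importance weights. The sketch below is only an illustration under simple assumptions: it looks at a single numeric feature, the caller supplies the bin boundaries, and the function name `resample_to_match` and the row schema are hypothetical rather than part of any particular platform.

```python
import random
from bisect import bisect_right
from collections import Counter


def resample_to_match(train_rows, prod_values, feature, edges, size, seed=0):
    """Resample training rows so one feature's binned distribution tracks production.

    train_rows:  list of dicts that already carry labels (hypothetical schema)
    prod_values: recent production values of `feature` (no labels needed)
    edges:       sorted bin boundaries chosen by the caller
    """
    def bin_of(value):
        return bisect_right(edges, value)

    train_counts = Counter(bin_of(row[feature]) for row in train_rows)
    prod_counts = Counter(bin_of(value) for value in prod_values)

    # Importance weight per row: production share of its bin / training share.
    weights = []
    for row in train_rows:
        b = bin_of(row[feature])
        train_share = train_counts[b] / len(train_rows)
        prod_share = prod_counts[b] / len(prod_values)
        weights.append(prod_share / train_share)

    # random.choices samples with replacement; it raises ValueError if every
    # weight is zero, i.e. training and production share no bins at all.
    rng = random.Random(seed)
    return rng.choices(train_rows, weights=weights, k=size)
```

Because rows are drawn with replacement, bins that are rare in the training data but common in production get repeated many times. That reduces the mismatch in the inputs, but it cannot add information the training data never contained, so fresh labeled data is still preferable once it arrives.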
Watch for two types of drift: data drift and concept drift. With data drift, the patterns in the production data that a deployed model uses for predictions gradually diverge from the patterns in the model’s original training data, which lowers the predictive power of the model. Concept drift occurs when expectations of what constitutes a correct prediction change over time, even though the input data distribution has not changed.
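Data drift of the kind described above can be quantified by comparing a feature's distribution in training data with its distribution in production. The following is a minimal sketch using the Population Stability Index (PSI), one common drift statistic among several; the function name, the choice of ten quantile bins, and the `eps` floor are assumptions made for illustration.

```python
import math
from bisect import bisect_right


def population_stability_index(train_values, prod_values, n_bins=10, eps=1e-6):
    """PSI between a training sample and a production sample of one numeric feature."""
    ordered = sorted(train_values)
    # Interior bin edges placed at quantiles of the training sample.
    edges = [ordered[(len(ordered) * i) // n_bins] for i in range(1, n_bins)]

    def bin_shares(values):
        counts = [0] * n_bins
        for value in values:
            counts[bisect_right(edges, value)] += 1
        # Floor each share at eps so empty bins do not cause log(0) or division by zero.
        return [max(count / len(values), eps) for count in counts]

    train_shares = bin_shares(train_values)
    prod_shares = bin_shares(prod_values)
    return sum((p - t) * math.log(p / t) for t, p in zip(train_shares, prod_shares))
```

Two identical samples give a PSI of zero, and the value grows as the production distribution moves away from the training one. A frequently quoted rule of thumb treats values above roughly 0.25 as a significant shift, but the right alert threshold depends on the feature and the business. Note that this check only sees input data, so it detects data drift; spotting concept drift requires comparing predictions with ground truth labels once they become available.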
2: Roll Back the Model
Sometimes rolling back to a previous version of the model can fix performance issues. To enable this form of continuous improvement, you need an archive of each version of the model. You can then evaluate the performance of each prior model version against the current production version by simulating how it would have performed with the same inference data. If you find a prior version that performs better than the current model version, you can then deploy it as the champion model in production.
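The comparison of archived versions against the current production version can be sketched as a simple replay over the same inference data. This is an illustrative outline, not a specific product's API: it assumes each archived model is a plain callable that maps a feature row to a class label, that ground truth labels for the replayed rows are available, and that accuracy is the metric of interest. The names `pick_champion` and `min_gain` are hypothetical.

```python
def accuracy(model, rows, labels):
    """Share of rows for which the model's prediction equals the ground truth label."""
    correct = sum(1 for row, label in zip(rows, labels) if model(row) == label)
    return correct / len(labels)


def pick_champion(archive, current_version, rows, labels, min_gain=0.0):
    """Replay the same inference data through every archived version.

    archive: dict mapping a version name to a callable model (assumed layout).
    Returns the version to deploy plus the score of every version.
    """
    scores = {version: accuracy(model, rows, labels)
              for version, model in archive.items()}
    best = max(scores, key=scores.get)
    # Only roll back when a prior version beats the current one by more than min_gain.
    if best != current_version and scores[best] - scores[current_version] > min_gain:
        return best, scores
    return current_version, scores
```

Setting `min_gain` above zero avoids swapping models over differences that are small enough to be noise in the replayed sample.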
3: Fix the Model Pipeline
While drift may occur because the ground truth data has changed, sometimes it happens when unforeseen changes occur in the upstream data pipeline that feeds prediction data into a model. Retraining with fresher data sourced from the changed pipeline may fix the model, but fixing the data pipeline itself may be easier.
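Upstream pipeline changes often show up as missing values or values outside the range seen during training, for example after a silent unit change. A rough sketch of a batch check is shown below; the expectation format (a per-column allowed range), the `max_null_rate` threshold, and the function name `check_batch` are all assumptions chosen for illustration.

```python
def check_batch(rows, expectations, max_null_rate=0.05):
    """Return a list of human-readable problems found in a batch of prediction inputs.

    rows:         list of dicts, one per prediction request (hypothetical schema)
    expectations: dict mapping a column name to its (low, high) range from training
    """
    if not rows:
        return ["batch is empty"]

    problems = []
    for column, (low, high) in expectations.items():
        values = [row.get(column) for row in rows]
        present = [value for value in values if value is not None]

        # A jump in missing values often points at a broken join or renamed field.
        null_rate = 1 - len(present) / len(values)
        if null_rate > max_null_rate:
            problems.append(f"{column}: {null_rate:.0%} of values missing")

        # Values outside the training range may indicate a unit or encoding change.
        out_of_range = [value for value in present if not low <= value <= high]
        if out_of_range:
            problems.append(
                f"{column}: {len(out_of_range)} values outside [{low}, {high}]"
            )
    return problems
```

Running a check like this before predictions are served helps separate pipeline faults, which are fixed upstream, from genuine drift, which calls for retraining.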
4: Repair the Model
To ensure you are continuously improving model accuracy, you may sometimes need to repair a model in a development environment. To diagnose the cause of the model degradation, it helps to use a platform that supports reproducibility, where you can effectively simulate the production environment in a development setting. Once a suspected cause is identified, you can choose the best method for repairing the model, whether that means modifying hyperparameters or something more invasive.
Model monitoring is a critical, ongoing process that is essential for a model-driven business. An unmonitored model can lead to disastrous business results. If you’d like to learn more about how to create a rigorous model-monitoring process for your data science program, read our new white paper, Don’t Let Models Derail You: Strategies to Control Risk with Model Monitoring.