Those who ran AWS did so inefficiently; without a standardized, simple way to configure and run AWS. “We want our researchers focusing on the task at hand and not necessarily the infrastructure behind the scenes,” said Subutai Ahmad, Numenta’s vice president of research.
In addition, the team hadn’t standardized practices around documentation of experiments. This resulted in duplicative efforts and lack of reproducibility. “You easily lose track of processes and have to rerun things,” Ahmad added.
Numenta implemented the Domino data science platform to satisfy the following needs:
Easily run experiments in the cloud. Accommodate ever-changing needs for compute power to support data science workflows, while minimizing manual overhead.
Facilitate rapid experimentation. Make it easy for researchers to tweak parameters, understand what they’ve already looked at and optimize algorithms.
Improve tracking and reproducibility. Enforce a standardized way of tracking work history and replicating it in the future.
“With Domino, you’re able to set up an environment once and then share it and reuse it,” explained Austin Marshall, a senior staff engineer at Numenta.
Domino makes it easy to test algorithms at different size and performance levels, switching seamlessly between compute environments without worrying about infrastructure. The platform’s integration with data science tools such as Jupyter Notebooks is also convenient, supporting existing workflows. And by automatically tracking every step of the data science process in Domino, Numenta preserves past work and makes it easy to publish results.
This is important so that, for example, external audiences leveraging its algorithms under the Affero General Public License (AGPL) can reproduce the same results. Ahmad added, “When you need to execute your experiments, you’re operating within Domino’s execution environment where you’re paying by the minute as opposed to the sum cost you’d have with AWS.”
Beyond the platform benefits, Numenta values working with the Domino team. “The support team at Domino has been fantastic,” said Ahmad. “They’ve been really responsive to requests and over time have built a bunch of new features that make our lives a lot easier.”
The Domino Effect
By relieving infrastructure headaches and allowing researchers to experiment in parallel, Numenta researchers achieve more. They deliver results in hours that would have previously taken weeks, and the results have greater accuracy because they’ve been more thoroughly tested.
One key deliverable that Numenta brought to market after implementing Domino was its Numenta Anomaly Benchmark (NAB). NAB is the first benchmark for evaluating anomaly detection models for streaming data. “When developing NAB, we easily saved a couple weeks of time, if not more, thanks to productivity gains afforded by Domino,” said Ahmad.
And Numenta was able to test more than twice as many algorithms against it as they would have had the capacity to do pre-Domino. Next, Numenta is tackling even bigger challenges. “The next generation of algorithms that we’re working on are 5 to 10 times larger than the ones we use in NAB. When we want to run large-scale experiments, testing parameter combinations or seeing how the algorithms scale in different situations, you often have to run thousands of individual experiments,” summarized Ahmad.
“Domino is helping us tackle problems we wouldn’t be able to otherwise, because they’re just too large for laptops.”