Even the most sophisticated data science organizations struggle to keep track of their data science projects. Data science leaders want to know, at any given moment, not just how many data science projects are in flight but also what the latest updates and roadblocks are in model development and which projects need their immediate attention.
But while there is a legion of tools for individual data scientists, the needs of data science leaders have not been well served. For example, a VP of Analytics at a wealth management company recently told us he had to walk around the office, pen and notepad in hand, going from person to person to get an actual count of projects in flight, because the company's traditional task tracking tools didn’t align with the workflow used by data science teams. The final count turned out to be way off from the initial estimate provided to the CEO.
Data science leaders face a common set of challenges around visibility and governance:
- They need help tracking projects
- They need help tracking models in production
- They need help building a culture of following best practices
Given the potential repercussions of inaccurate information (from mis-set expectations and funding mismatches to project delays), it didn’t surprise us that data science leaders packed the room at the Rev 2 Data Science Leaders Summit in New York for a live demo of our new “Control Center” functionality designed specifically for them.
P.S. If you missed Rev this year, session presentations and recordings can be found here.
Last fall, we delivered the Domino Control Center, which gives IT stakeholders visibility into compute usage and spend. Today we are announcing a significant expansion of the Control Center with new features for data science leaders in Domino 3.5.
Domino 3.5 allows data science leaders to define their own data science project life cycle. A new addition to the Control Center, the Projects Portfolio Dashboard, allows data science leaders to easily track and manage projects with a holistic understanding of the latest developments. It also surfaces, in real time, the blocked projects that need immediate attention.
Project Portfolio Dashboard
A data science leader can start their day in the Project Portfolio Dashboard, which shows a summary of in-flight projects broken down by configurable life cycle stage, with an up-to-date status for every project.
Project Stage Configuration
Every organization has its own data science life cycle that meets its business needs. In Domino 3.5, we enable data science leaders and managers to define their own project life cycle and implement it within their teams.
Data scientists can update their project stage as they progress through the life cycle, and each update notifies their collaborators via email.
Project owners and contributors can use the project stage menu to flag a project as blocked, with a description of the blocker. Once the blocker is resolved, the project can be unblocked. On the flip side, when data scientists mark a project as complete with a description of the project conclusion, Domino also captures this metadata for project tracking and future reference. The captured metadata can support organizational learning, help organize projects, and help avoid similar issues in the future.
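To make the stage, blocker, and completion metadata concrete, here is a minimal sketch of how such records could be modeled and rolled up into a portfolio summary. It is an illustration only, not Domino's data model or API: the `Project` class, the `summarize` helper, and every stage name other than R&D are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical stage names; each organization would configure its own.
STAGES = ["Ideation", "Data Acquisition", "R&D", "Validation", "Delivery"]


@dataclass
class Project:
    name: str
    stage: str
    blocker: Optional[str] = None     # description of the blocker, if any
    conclusion: Optional[str] = None  # set when the project is completed

    def move_to(self, stage: str) -> None:
        """Advance (or move back) to another configured stage."""
        if stage not in STAGES:
            raise ValueError(f"unknown stage: {stage}")
        self.stage = stage

    def block(self, reason: str) -> None:
        """Flag the project as blocked, recording why."""
        self.blocker = reason

    def unblock(self) -> None:
        """Clear the blocker once it is resolved."""
        self.blocker = None

    def complete(self, conclusion: str) -> None:
        """Mark the project complete, recording its conclusion."""
        self.conclusion = conclusion


def summarize(projects: List[Project]) -> dict:
    """Count in-flight projects per stage and list the blocked ones."""
    in_flight = [p for p in projects if p.conclusion is None]
    return {
        "by_stage": dict(Counter(p.stage for p in in_flight)),
        "blocked": [p.name for p in in_flight if p.blocker],
    }
```

Calling `summarize` on a list of such records returns the per-stage counts and the names of blocked projects, which is roughly the information a portfolio view needs to display.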
All of this information powers Domino’s new Projects Portfolio Dashboard. Data science leads can click through to gain more context on any of the in-flight projects and discover blocked projects that need attention.
In the hypothetical project below, our Chief Data Scientist Josh sees that one of the blocked projects is Avinash and Niole’s Customer Churn project. Although he doesn’t recall the details of this project, he can see that it is in the R&D phase and has a hard stop in a few weeks. Diving into the project, he can see that the remaining goal is to get a classification model with AUC above 0.8.
Josh can turn to the Activity Feed to get details on the blocker and its causes, and to suggest a course of action. In this example, he will ask the Customer Churn team to try a deep neural net. He can tag Sushmitha, a deep learning expert working on another team, and ask her to mentor this effort.
Managing projects, tracking production assets, and monitoring organizational health require new tools. These features were custom-built for data science leaders. At Domino, we are excited to see the benefits they bring as you use them with your teams.
That is only some of what’s new in Domino. The 3.5 release also includes a few enhancements to existing features. For example, the Activity Feed has been enhanced to show a preview of the files that are being commented on. It also shows project stage updates and any blockers raised by collaborators. Users can also filter by activity type. Combined with email notifications, this helps maintain situational awareness of projects at all times.
Domino 3.5 lets users create large Dataset Snapshots directly from data sitting on their computers. The upload limits on the CLI have been increased to 50 GB and up to 50,000 files. Users can also upload files directly through the browser, with the same limits. The CLI and browser uploads offer a seamless way to move data from your laptop into a single place for data science work. Teams can then leverage shared, curated data, eliminate potentially redundant data wrangling work, and help ensure reproducible experiments.
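Before starting a large upload, it can be handy to check a local folder against the limits quoted above. The following is a minimal sketch that uses only the Python standard library; it does not call the Domino CLI or any Domino API. The function names are hypothetical, and treating 1 GB as 10**9 bytes is an assumption, since the post does not say which definition applies.

```python
import os

# Limits quoted in this post for Domino 3.5 uploads.
# Assumption: 1 GB = 10**9 bytes (the post does not specify).
MAX_BYTES = 50 * 10**9
MAX_FILES = 50_000


def scan(path):
    """Return (file_count, total_bytes) for all files under path."""
    count = 0
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if os.path.isfile(full):  # skips broken symlinks
                count += 1
                total += os.path.getsize(full)
    return count, total


def within_limits(count, total, max_files=MAX_FILES, max_bytes=MAX_BYTES):
    """True if both the file count and the total size fit the limits."""
    return count <= max_files and total <= max_bytes
```

For example, `within_limits(*scan("my_data"))` reports whether a folder named `my_data` (a placeholder) fits under both limits.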
License Usage Reporting
To complement the new Control Center features for data science leaders, we are also launching user activity analysis enhancements that facilitate license reporting and compliance. They offer a detailed view of the level of Domino activity for each team member so that data science and IT leaders can manage their allocation of Domino licenses and have visibility into, and predictability for, their costs. Domino administrators can quickly identify active and inactive users and decide whether they should be allocated a license. The ability to track user activity and growth during budget planning and contract renewal makes it much easier to plan for future spending.
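As a rough illustration of the active versus inactive split, the sketch below classifies users by their most recent activity date. It is not how Domino computes activity: the `split_by_activity` function, the input shape, and the 30-day default window are all assumptions made for the example.

```python
from datetime import date, timedelta


def split_by_activity(last_active, today, window_days=30):
    """Split users into (active, inactive) lists, each sorted by name.

    last_active maps a user name to the date of that user's most
    recent activity. A user counts as active if that date falls
    within the last window_days days (an assumed threshold).
    """
    cutoff = today - timedelta(days=window_days)
    active = sorted(u for u, d in last_active.items() if d >= cutoff)
    inactive = sorted(u for u, d in last_active.items() if d < cutoff)
    return active, inactive
```

An administrator could run this over an exported activity log and review the inactive list before deciding how to allocate licenses.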
In addition to the exciting new features for data science leaders, we are also launching a new [Trial Environment](https://www.dominodatalab.com/trial/) to make Domino more accessible. It’s perfect for anyone who wants to try Domino and evaluate whether it would be useful for their work. The new features in this latest release will be in our trial environment too! This is a quick and easy way to get access to Domino and start experiencing the secret sauce that companies like Dealer Tire and Red Hat leverage in their data science organizations.
Domino 3.5 is now generally available. Be sure to try Domino to see the latest platform capabilities.