Data Science at a Fortune 500 Global Financial Services Leader
Financial services organizations worldwide are using data science to slash underwriting times, personalize offerings, and redefine the customer experience. But to achieve the full value of what data science can do, organizations are finding they need to take data science from a capability to a first-class function.
A Fortune 500 global financial services leader uses Domino to develop and deploy new models with greater efficiency and complete reproducibility. Domino provides a centralized platform that brings the company’s distributed data science team together in a collaborative way so they can operationalize data science at scale across the organization. Newly enabled use cases are enhancing call center training, ensuring all customer communications carry the right tone, and improving personnel recruitment processes to build a great workforce.
This global financial services organization is transforming every part of its business using data science—from underwriting to customer service to human resources (HR). For example, to build a great workforce, HR staff work hard to find the right people for each job. However, with dozens of job openings at any given time and a limited number of recruiters, hard-to-fill openings can easily get lost in the shuffle among newer job listings that are top of mind. To enable recruiters to better prioritize which job openings to focus their efforts on in a given week, the company is using machine learning to analyze recruiting data (for example, the number of applicants for a particular opening and recruiter schedules).
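The prioritization idea above can be illustrated with a minimal sketch. It is not the company's model: the source only says machine learning is applied to recruiting data such as applicant counts and recruiter schedules. Here a hand-written score stands in for a learned model, and the `JobOpening` fields, the scoring formula, the greedy `plan_week` helper, and the sample data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class JobOpening:
    title: str
    days_open: int           # hypothetical feature: days since the job was listed
    applicants: int          # number of applicants so far (mentioned in the text)
    est_hours_needed: float  # hypothetical: recruiter effort estimated for the week


def priority_score(opening: JobOpening) -> float:
    """Hand-written stand-in for a learned model (an assumption):
    older openings with few applicants score higher."""
    return opening.days_open / (1 + opening.applicants)


def plan_week(openings, recruiter_hours):
    """Greedily pick the highest-scoring openings that fit the available hours."""
    ranked = sorted(openings, key=priority_score, reverse=True)
    plan, remaining = [], recruiter_hours
    for opening in ranked:
        if opening.est_hours_needed <= remaining:
            plan.append(opening.title)
            remaining -= opening.est_hours_needed
    return plan


# Made-up sample data, for illustration only.
openings = [
    JobOpening("Actuary", days_open=90, applicants=2, est_hours_needed=6),
    JobOpening("Claims analyst", days_open=10, applicants=40, est_hours_needed=3),
    JobOpening("Data engineer", days_open=45, applicants=4, est_hours_needed=5),
]
weekly_plan = plan_week(openings, recruiter_hours=10)
print(weekly_plan)
```

In this sketch the long-open, low-applicant role is scheduled first, and the remaining hours go to whichever opening still fits. A production system would replace `priority_score` with a trained model and use real schedule data.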
Given their potential impact, getting such capabilities into the hands of business users quickly is critical, and the company’s Analytics Center of Excellence (COE) has made it a priority to reduce the overall time needed to develop and deploy models. The COE needed a platform that would streamline workflows and improve collaboration among its data scientists, who are spread across different geographies. Its requirements included:
- Ensuring code consistency and project shareability. For each project, data scientists had to determine at the outset which environment (SAS, R, or Python) would be best suited to run the analysis. Each coding environment was located on a separate server, making it difficult for data scientists to share code or switch tools. If somebody was out of the office, whether planned or unexpectedly, the project would have to stop until that person returned.
- Streamlining access to data. As a federated global organization, the company has data stored across many platforms worldwide, including Oracle, IBM, SQL Server, Postgres, MySQL, Hadoop, and log servers. The team wanted a common location for pooling data and making the datasets easily consumable by data scientists, without compromising IT security requirements.
- Simplifying deployment of new models. The company’s disparate development environments were isolated from operational systems, complicating efforts to embed models into business processes. Rather than spending cycles to retrofit those environments to become API serving locations, the company wanted a platform that would enable it to easily serve analytics results into a production process.
The Analytics COE brought together a team of code-first data scientists with a do-it-yourself mindset and IT staff to evaluate solutions from several vendors, including Domino, Dataiku, H2O, and DataKitchen. As part of their evaluation, the team launched a proof of concept (POC) demonstration, using the Domino platform to build machine learning models that:
- Evaluate customer communications to help senior managers and marketing staff ensure all messages carry the right tone before sending.
- Analyze call center transcripts for both customer sentiment and topics to enhance staff training. Managers use the resulting insights to identify situations where agents are handling particularly difficult calls well so they can distill best practices and develop specialized training.
Based on the request for proposal (RFP) and the POC, both the data scientists and IT staff felt Domino was best geared to meet their needs. It gave the data scientists full control of their code and access to the resources they needed. At the same time, the server team gained a centralized platform for better governance and security that could run on both the company’s current on-premises environment and its future Amazon Web Services environment.
The company has implemented Domino as its data science platform to support research and development of new models. The benefits provided by this platform include:
- Quick and seamless collaboration. Before, data scientists frequently worked in isolation, and it was easy for information sent via email or text message to become disconnected from projects. With Domino, multiple team members can work on the same project using whatever tool(s) each person prefers. They can easily find, view, and share projects with colleagues, annotating their work directly in the platform, so comments aren’t lost over time. And they can access common patterns and pre-written code snippets to quickly connect to key data sources. Together, these capabilities provide a tremendous productivity boost.
- Complete reproducibility. Being able to recreate scenarios becomes increasingly important as the company operationalizes data science to tailor services based on customer analytics. The Domino platform supports the company’s model governance framework and automatically tracks code, tools, data, and packages so that data scientists can easily recreate results and build on past work.
- Faster model deployment. The company uses Domino to create model application programming interfaces (APIs) that give business staff on-demand access to analytics results. Once the company moves to a cloud environment (in process), it will have the elastic compute power to deploy all its models on Domino.
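To make the idea of a model API concrete, here is a minimal, platform-neutral sketch of the request-handling step: JSON in, model call, JSON out. It does not use or reproduce Domino's API. The `{"data": ...}` request shape, the `{"result": ...}` response shape, the `x1`/`x2` feature names, and the placeholder model are all assumptions chosen for illustration.

```python
import json


def predict(features: dict) -> float:
    # Placeholder model (an assumption): a real deployment would load a
    # trained model here instead of using fixed weights.
    return 0.5 * features["x1"] + 2.0 * features["x2"]


def handle_request(body: str) -> str:
    """Parse a JSON request body, run the model, and return a JSON response."""
    try:
        payload = json.loads(body)
        score = predict(payload["data"])
    except (json.JSONDecodeError, KeyError, TypeError) as exc:
        # Malformed JSON or missing fields produce an error payload
        # rather than an unhandled exception.
        return json.dumps({"error": f"bad request: {exc!r}"})
    return json.dumps({"result": score})


response = handle_request('{"data": {"x1": 2, "x2": 1}}')
print(response)
```

Keeping the handler a plain function of a string makes it easy to test on its own and to mount behind whatever web server or hosting platform serves the model.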
The organization has taken a phased approach to its Domino implementation, initially rolling the platform out to 15 data scientists within the Analytics COE. Once the company transitions to a cloud-based environment, it will expand access to an additional 60-plus data scientists embedded in the company’s business units. To this end, corporate data science training classes now use Domino to instill best practices.
The Domino Effect
- Saving employees hours of work each week. Augmented decision-making tools are helping the company’s business staff determine the next best step in a given process more quickly and accurately. On the HR team, for example, a new optimization engine built on Domino means recruiters no longer spend the better part of their morning going through a backlog of job openings to figure out what requires their attention.
- Inspiring greater innovation. The ability of COE data scientists to share and build on past work has helped spark greater creativity to drive new models that will directly enhance the customer experience. Their successes are now inspiring embedded analytics teams.
- Increasing employee satisfaction. Data scientists now feel more like part of a team while still being able to work independently.
- Decreasing model run time from days to less than an hour. Domino has enabled data scientists to scale up to more powerful compute resources, dramatically reducing model run times. For example, the run time for one real estate portfolio optimization model dropped from days to less than an hour once it was on Domino.