
What It Takes to Productize Next-Gen AI on a Global Scale
Summary
Transcript
What does it take to turn the latest advances in AI into products that deliver business impact at Walmart levels of global scale?
Srujana Kaddevarmuth is the Senior Director of Data & Machine Learning Programs at Walmart Global Tech. Her team drives data strategy and grapples with data science productization every day. With millions of employees, hundreds of millions of customers, and petabytes of data at any given moment, Walmart offers some unique lessons in the complexities of building teams, processes, and products to effectively leverage AI at scale.
In this episode, Srujana shares a few of those lessons, along with her perspective on nonlinear career paths, organizational collaboration and alignment, and her ongoing fascination with what’s next. Plus, she dives into her passion for fostering diversity in data science and tech, sharing strategies leaders can implement to help bring more women into the field.
We discuss:
- What to prioritize when experimenting with next-gen tech
- How to use “communities of practice” to align your organization
- Solving governance, reproducibility, and knowledge sharing challenges at scale
- Bringing more women into data science
In this season finale episode, host Dave Cole also shares his three biggest takeaways from his many in-depth conversations with leaders in data science.
Stay tuned for a whole new season of Data Science Leaders coming soon! We're just getting started.
DAVE COLE
Hello, welcome to another episode of the Data Science Leaders podcast. I’m your host, Dave Cole. Today's guest is Srujana Kaddevarmuth. Srujana, welcome to the Data Science Leaders podcast!
SRUJANA KADDEVARMUTH
Thank you for having me here.
DAVE COLE
Srujana is the Senior Director of AI at Walmart Global Tech. Today, Srujana and I will be talking about next-gen AI. We'll be talking about data science at scale, since, as you might imagine, Walmart has a ton of data. And we'll be talking about women in data science. That's a topic that we've lightly touched on in the past but Srujana has done a lot of interesting work that I'd love to dive into, to help increase the number of women in data science.
Let's start at the top. In the Data Science Leaders podcast, we talk from time to time about the balance between doing cutting edge research and experimentation, along with the bread and butter type work: project-based work with our business customers.
As the director of AI, how do you go about delving into these next-gen technologies? What would you tell a peer of yours, in terms of how to go about experimenting?
SRUJANA KADDEVARMUTH
Thanks, Dave, for this very interesting question. Leading the AI charter for a company like Walmart would mean building commercial-grade next-generation artificial intelligence technology capabilities for our existing omni retail business, as well as for a new and emerging business in the consumer AdTech space. That would be something like Walmart Connect.
We also look at building some new capabilities in a new and emerging technology space, like tech monetization and data monetization through our new ventures, such as Data Ventures and Luminate. As a part of this process and journey, we build a lot of these capabilities. Specifically, the team is doing a lot of cutting-edge work in search personalization, building numerous recommendation engines and conversational platforms, and deploying them at scale.
Work is also being done in relation to deploying numerous chatbots so that we can engage customers in a much more inclusive manner to create extensive, exclusive, as well as personalized, customer journeys across geographies and different business domains, to unlock financial value for the organization.
As a part of this journey, we build capabilities such as visual search and multilingual language translation. We look at semantic sentiment extraction from product reviews. It requires a lot of image data and different techniques and capabilities around computer vision, natural language processing, deep learning and reinforcement learning. We are infusing these capabilities with technologies such as the metaverse, immersive experiences, the Internet of Things and AR/VR, and exploring OCR capabilities for entity and image extraction. All of these different capabilities, in combination with the scale and complexity of data that we handle as well as the unique tech stack that we are using to deploy these products across different markets, make these products ‘next-gen’. They're helping us generate intellectual property for the organization, build different products, acquire various patents and also drive this kind of digital disruption across the industry.
DAVE COLE
Wow. I was trying to take some notes as you were going there. Cutting edge AI, computer vision, deep learning, chat bots, visual search—you're pretty much hitting on all of the cutting edge data science areas.
Tell me a little bit more about your day-to-day routine. There's a lot that you can do and maybe you're doing all of these things but, at the same time, something tells me that you have to prioritize. How do you determine what area you want to experiment in? Help me understand the roles on your team. How are you able to delve into these cutting-edge technologies?
SRUJANA KADDEVARMUTH
There is a need for cross-pollination across the organization, especially when you're talking about these technologies. With Walmart we are talking about a huge scale. We have multiple decentralized teams doing a lot of work, building different AI capabilities. As an organization we are focused toward building the ecosystem for leveraging, reusing and executing the overall go-to-market strategy, as well as our time. There's a focus on engineering excellence and using tools for improved transparency of what has been built as baseline models, so that we can pick and customize that rather than every team doing all of these things from scratch. It's not necessarily a very prudent use of energy or resources. There's a huge focus in terms of leveraging and reusing.
There's also a focus around cross-pollinating these ideas across the organization and getting teams to collaborate in a cross-functional manner, to achieve positive outcomes. We have various avenues to do that. We are building certain domain-specific communities within our organization. We call them communities of practice. We have communities of practice around data science and governance as well as engineering. These communities provide an avenue, a platform, for us to share some of the interesting work from within our teams and cross-pollinate these ideas. We also talk about some of the roadblocks and challenges, technological impediments that we may have, and how to remove these roadblocks. How do we brainstorm interesting ideas? There's also a committee called the Data Science Council wherein we have people from different walks of the industry.
We have people from product, engineering and the CX space. They are the ones who actually have a first look at these products that we are building and help us drive adoption of these products across the business. That is extremely important. Prioritizing what’s important for the business and cross-pollinating these ideas are extremely important. We also have other avenues such as internal summits, conferences, book review clubs, paper review clubs etc., all to cross-pollinate ideas and leverage efficiency and economies of scale.
DAVE COLE
Okay, wow. So you have internal communities that bring various decentralized data scientists together to talk. Is that where the cross-pollination happens?
SRUJANA KADDEVARMUTH
That's true.
DAVE COLE
You mentioned that a lot of the models are base models and then those get pushed out to the various teams. Then they can add to and use those models. How do you go about doing that? Is it just through meeting regularly? Are people made aware of these models or are there tools or systems that help you with that?
SRUJANA KADDEVARMUTH
There are definitely some tools. We use a model registry wherein we document all the models that are being built. That gives a transparent view to all the data scientists in the organization, helping them understand where those baseline models are. It also helps us look at whether the data sets that these models are consuming are governed at their sources. That gives them a first baseline model that they can further customize. This helps with experimentation and with executing the overall roadmap. Also, there's one nuance that I think is important for us to understand here, which is the scale.
There's complexity that's associated with scale. If you look at Walmart as an organization, it has millions of customers across the globe. We have around 2.2 million employees across the globe working towards creating these unique shopping experiences for our customers. The data that gets generated is just humongous. It runs into petabytes at any given point of time and it is heterogeneous, messy and non-intuitive in nature.
To be able to tame this data, an AI leader is expected to not only have a very sound understanding of the AI domain, but also understand the technology stack and bring a product mindset in. This would be to build products based on functionality and usage, not necessarily something that is purely powered by academic research or research by a human. There's a lot of emphasis on democratizing data and AI capabilities, and on productizing data science and AI capabilities.
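To make the registry idea above concrete, here is a minimal Python sketch of what such a lookup might involve. It is an illustration only, not a description of Walmart's tooling: the class names, the fields, and the idea of keeping a set of "governed" data set names are all assumptions made for this example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class ModelRecord:
    """One registry entry. Every field name here is hypothetical."""
    name: str
    owner_team: str
    task: str
    datasets: tuple[str, ...]


@dataclass
class ModelRegistry:
    """In-memory stand-in for a shared model registry (illustrative only)."""
    governed_datasets: set[str] = field(default_factory=set)
    _records: dict[str, ModelRecord] = field(default_factory=dict)

    def register(self, record: ModelRecord) -> None:
        """Document a model so other teams can discover it."""
        if record.name in self._records:
            raise ValueError(f"model {record.name!r} is already registered")
        self._records[record.name] = record

    def find_baselines(self, task: str) -> list[ModelRecord]:
        """Return registered models for a task, sorted by name."""
        return sorted(
            (r for r in self._records.values() if r.task == task),
            key=lambda r: r.name,
        )

    def ungoverned_inputs(self, name: str) -> list[str]:
        """List data sets a model consumes that are not marked as governed."""
        record = self._records[name]
        return [d for d in record.datasets if d not in self.governed_datasets]


# Hypothetical usage: register a baseline, then look it up and check its inputs.
registry = ModelRegistry(governed_datasets={"orders_v2"})
registry.register(
    ModelRecord("search-ranker-base", "search", "ranking", ("orders_v2", "clickstream_raw"))
)
print([r.name for r in registry.find_baselines("ranking")])
print(registry.ungoverned_inputs("search-ranker-base"))
```

A real registry would also persist entries and track versions and metrics. The sketch only shows the two questions raised in the conversation: where the baseline models are, and whether their source data is governed.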
DAVE COLE
Got it. One way to productize that, I imagine, is starting with a model registry, actually having something that allows people to reuse models that have been previously created, then actually embedding them into the various customer-facing functions or products.
You mentioned a Data Science Council. That sounds like it extends beyond just your standard embedded data scientists themselves, but into the engineering and product teams. You mentioned customer experience as well. How is that done? I imagine that's a mixture of people like our data scientists and then those who are interested in how data science can help their various departments. Do I have that right?
SRUJANA KADDEVARMUTH
Yes, that's right. Data science as a domain itself is highly interdisciplinary in nature. It's extremely important for us to drive adoption of these technologies across the business. We can build the most novel solutions, but if they're not seeing great adoption across the business, they are not necessarily successful. If you're looking at the scale of an organization that operates across different geographies, with a humongous amount of data, it is not prudent for us to build one-off solutions to solve one-off business use cases. There's a focus on productizing data science. That is primarily a journey of translating the insights generated from exploratory analyses into scalable models that can power products. It involves nuances of building and deploying the models in the production environment.
Productizing data science helps the organization move up the data and AI value chain. It helps the organization unlock a lot of financial impact. It also helps us as an organization to effectively utilize our human capital. You’re quite right that it's extremely important for us to keep the data scientists engaged with the kind of work that they love the most, not to work on something that's mundane and repetitive. There's a heavy focus on automating because, when we automate, machines take care of these mundane repetitive tasks. That gives the AI scientists the bandwidth to experiment, and to do this experimentation at scale, so they come up with a lot of innovative ideas and solutions and generate intellectual property for the organization. Apart from this, there is an added advantage of productizing AI and data science for an organization that operates at such a humongous scale: it helps standardize the tech capabilities across the organization, across businesses and across geographies. It also helps us effectively enhance our models as well as the overall data and governance ecosystem.
DAVE COLE
You've touched on this a few times. Let’s talk about our next topic, which is a nice segue from this topic.
Let’s talk about data science at scale. I think 2.2 million employees, alone, is just a fascinating statistic. We need to expand focus to the number of customers that Walmart has—magnitudes of that, of course—then the amount of data that you're then generating, throughout the organization and various customer interactions and touch points. Even non-customer touch points that you're just dealing with, as you mentioned, are petabytes of data. For other data science leaders who are maybe dealing with data quantities at not quite that size, or maybe they do have data quantities at that size, what advice do you have for them? What are some of the challenges that you see with doing data science with such large amounts of data?
SRUJANA KADDEVARMUTH
One of the very interesting aspects here is to also understand how we train our models and build them on such large data sets. These data sets have millions of variables and being able to look at various considerations is going to be important. That's the suggestion that I would give my peers across different industries who are dealing with such a humongous amount of data.
There are a few considerations that we need to bake in while building these products. The first one happens to be the data science and engineering consideration. What do I mean by that? We need to look at the overall architectural design and engineering tech stack. Even at the stage of creating the proof of concept, we have seen that there is a common mistake that people make across the industry. They complete the proof of concept and then start thinking of the underlying tech stack, which would require you to do a major architectural redesign. It would delay the productization of data science as well as getting the products out into the market.
Having this kind of data science and engineering consideration is going to be important. The second aspect is the data governance consideration. We have multiple variables that are going to be used to train these models, and at scale. It's important to understand if we are governing these data sources. Do these data sources contain any kind of PII? If so, have we encrypted that? Is there geolocation information? Have we encrypted that? As the data regulatory framework is evolving at lightning speed and new regulations are coming in, ensuring compliance is important. This is partly from the legal perspective, and also to build trustworthy solutions that are going to create goodwill with customers across the industry. So having that consideration is also important. There's another aspect that people should be focusing on, which is the AI consideration, which primarily focuses on phenomena like concept drift.
It’s the phenomenon that machine learning algorithms tend to get smarter over a period of time, with a constant feed of new data. If they're not connected to the constant feed of data, they degrade in quality and that happens pretty fast. One of the common mistakes most people across the industry make is to think of these AI algorithms as similar to software products. In reality they need to be treated differently. This difference has to be baked in early on in the product development and project life cycles to mitigate any kind of risk at a later point of time. The other aspect is around unintended consequences. Machine learning algorithms are a statistical representation of the world that we live in. They come up with solutions based on the data that's provided to them but they're not necessarily perfect all the time.
They have this inherent tendency of perpetuating and amplifying the biases that are present in the data and in society at large. Being able to identify this is going to be important, especially if you're building some technologies like recommendation systems. For example, we focus on building recommendation systems a lot. There is always a focus on making sure that there is no hyper-personalization that's happening because that leads to a filter bubble. It keeps reinforcing the same interest and belief systems within the customers. It’s not necessarily good because people think that they're getting the representative view of the world. In reality they are not. So keeping all of these considerations in mind is important to build the models, train them on these large data sets and scale them effectively.
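One way to watch for the kind of degradation described above is to compare the data a model was trained on with the data it currently sees. The Python sketch below uses the Population Stability Index (PSI), a common drift heuristic that was not named in the conversation and is swapped in here purely as an illustration. The bin count, the smoothing constant `eps`, and the sample data are all assumptions.

```python
import math
from collections.abc import Sequence


def population_stability_index(
    baseline: Sequence[float],
    recent: Sequence[float],
    bins: int = 10,
    eps: float = 1e-6,
) -> float:
    """PSI between two samples, using equal-width bins over the baseline range."""
    if not baseline or not recent:
        raise ValueError("both samples must be non-empty")
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if the baseline is constant

    def bin_shares(values: Sequence[float]) -> list[float]:
        counts = [0] * bins
        for v in values:
            # Clamp so values outside the baseline range land in the edge bins.
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # Floor each share at eps so the logarithm below is always defined.
        return [max(c / len(values), eps) for c in counts]

    expected = bin_shares(baseline)
    actual = bin_shares(recent)
    return sum((a - e) * math.log(a / e) for a, e in zip(actual, expected))


# Hypothetical feature values: the recent sample is shifted toward higher values.
baseline_sample = [i / 100 for i in range(100)]
recent_sample = [0.5 + i / 200 for i in range(100)]
print(f"PSI = {population_stability_index(baseline_sample, recent_sample):.3f}")
```

Identical distributions give a PSI of zero, and the value grows as the recent data moves away from the baseline. The threshold at which to retrain or alert is a judgment call that would need tuning per model.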
DAVE COLE
Hyper-personalization: I haven't heard that. Help me understand a little bit more about what you mean by that. I can guess but what does that mean?
SRUJANA KADDEVARMUTH
I can give you an example. If you are always opting for a particular brand, or watching particular content on the site, my algorithms are going to show you similar content or the same kinds of brands. However, that is not necessarily giving you a representative view of the world. There may be something out there that should also be baked in to give you a representative view of the world.
Personalization is good, but if you go ahead with hyper-personalization, you’re reinforcing the same interests and belief systems. Eli Pariser coined the term ‘filter bubble’ for that. As AI experts building these recommendation engines, people need to be aware of these unintended consequences and bake them in, while making sure to develop scalable products that touch multiple sections of society.
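As a rough illustration of guarding against hyper-personalization, the Python sketch below re-ranks scored recommendations so that no single category can fill the whole list. This is one simple diversification heuristic chosen for the example, not a technique attributed to the speakers. The `Candidate` type, the category labels, and the cap value are hypothetical.

```python
from collections import Counter
from collections.abc import Iterable
from typing import NamedTuple


class Candidate(NamedTuple):
    """A scored recommendation candidate (hypothetical structure)."""
    item_id: str
    category: str
    score: float


def rerank_with_category_cap(
    candidates: Iterable[Candidate], k: int, max_per_category: int
) -> list[Candidate]:
    """Pick up to k items by descending score, with a per-category cap."""
    picked: list[Candidate] = []
    per_category: Counter[str] = Counter()
    for cand in sorted(candidates, key=lambda c: c.score, reverse=True):
        if len(picked) >= k:
            break
        if per_category[cand.category] >= max_per_category:
            continue  # this category already has its share of the list
        picked.append(cand)
        per_category[cand.category] += 1
    return picked


# Hypothetical feed: pure score order would return four sports items.
feed = [
    Candidate("a", "sports", 0.99),
    Candidate("b", "sports", 0.97),
    Candidate("c", "sports", 0.95),
    Candidate("d", "sports", 0.93),
    Candidate("e", "cooking", 0.60),
    Candidate("f", "science", 0.55),
]
for item in rerank_with_category_cap(feed, k=4, max_per_category=2):
    print(item.item_id, item.category)
```

The cap trades a little predicted relevance for breadth, which is the balance between personalization and a representative view that the conversation describes.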
DAVE COLE
That makes a lot of sense. I think we've all encountered that. If you look at my newsfeed, it can be focused too much on sports. Yes, I love sports, but I love to learn about other things too. Sprinkling those in, allowing me to be a more diverse reader, would be helpful to me. Going back to something you said about doing data science at scale, there was one really important takeaway that I heard from you: when you're dealing with such large amounts of data, you really have to be very thoughtful.
It can be kind of dangerous to do a POC on small amounts of data. Imagine the business is anxious while saying, "Okay, let's go ahead and put this model into production." If you haven't thought about whether or not it actually can scale, and if you did it in a quick and dirty way, you might be months and months away from actually being able to put that into production. Even worse: you think that what you built actually is going to scale and you're incorrect. That could really delay the model from having an impact. Then, data governance as well: that feeds the theme of avoiding rework.
If you are using a data set and it hasn't gone through the proper governance channels or been encrypted to obfuscate PII, again, you're going to have to rework. Is there any recommendation you have to ensure that these things are in place? Do you have reviews from early on in the experimentation phase of the data science life cycle? Is there anything that you would recommend, to avoid these pitfalls?
SRUJANA KADDEVARMUTH
There needs to be a proper process established and in place. It’s also important to ensure that effective model governance, as well as operational excellence, is baked into this process. We have a lot of these reviews that are done at every phase of the product development and project life cycle, wherein we are transparently discussing a lot of these problems. As data scientists, we are always challenged because there is always an inundation of new technology tools and platforms. It's important for us to be very transparent and look at all of these as learning opportunities, rather than as any kind of failure or risk.
As leaders it's important for us to exercise a little bit of a risk appetite, overcome these challenges in a sustainable manner and ensure that we keep up the team morale and build a heck of a good team.
DAVE COLE
When you're talking about next-gen AI, you have to have a certain amount of a risk appetite. You have to have an experimentation mindset and, like you were mentioning, a great way to keep your team energized. I have a belief that if you're not learning, you're leaving. Data scientists should be naturally curious people and they might be looking for their next problem or challenge. Taking some of those risks allows them to be on the cutting edge and allows them, hopefully, to continue to learn. Before we move on to our next big topic, there was something you said in passing. You mentioned data scientists, obviously, but you also mentioned an AI scientist. Is that actually a separate role at Walmart?
SRUJANA KADDEVARMUTH
Yeah. Data science is a progression of analytics and AI is more focused on ML engineering.
DAVE COLE
I got you.
SRUJANA KADDEVARMUTH
It's all about scaling. There's a huge focus on building something that can be scaled. Looking at unconventional methodologies and scaling: that's the particular category and job family that we call AI scientists. The job families across the industry are not necessarily very streamlined. The talent, in terms of what is needed to do a certain role, is going to be important as we think of building these teams and acquiring them, as well as growing them, in a long-term manner.
DAVE COLE
I would imagine that the ML engineering team is, again given the scale, a very critical team to Walmart's success. Is that true? Is this a team that is seasoned? I have to imagine that some of the best and brightest are on that team.
SRUJANA KADDEVARMUTH
Yes, absolutely. There is a need and this is a very important and critical team. We also have certain hubs of these teams because many times there are a lot of these deployments happening across the organization. We have limited resources. It's extremely important for us to make sure that we prioritize and invest the right amount of human capital for the right kind of projects, which are of strategic importance to the organization. There's a huge focus in terms of being very meaningful: how do we deploy and utilize human capital while being intentional in growing this team, so that they get a lot of opportunities across the organization? We want them to have opportunities to move and navigate and explore growth opportunities internally within the organization as well.
DAVE COLE
Great. Let's move on to our last and final topic, which is women in data science. Maybe you can tell me a little bit about your story. I'd love to hear how you got into data science and then I'd love to hear you be candid about some of the challenges that you faced. Looking at the sheer number of women in data science as a percentage, it's way lower than what we'd like to see. The number that I've quoted in the past is around 19%. I think that's going up a little bit, thanks to people like you, but what are some of the challenges that you have faced?
SRUJANA KADDEVARMUTH
Yes, absolutely. I can talk a little bit about that, but I would also want to bring the attention of the audience to some of the statistics that you mentioned. First, talking about why we need more women in data science: women-in-tech statistics indicate that data science, as well as the emerging technology domains, are very highly male-dominated. Around 80% of the jobs in the technology domain are held by men, with only 20% being held by women. Only 37% of tech startups have at least one female member on the board of directors, and stats reveal that only 26% of the computing-related jobs are held by women. Out of that, only 3% are held by African American women, 6% by Asian women and 2% by Hispanic women. The situation was getting better.
However, with the onset of the pandemic, the situation has gone for a toss primarily because the pandemic itself exhibited a kind of gender inequality and digital divide across society. CNN Business Insights recently quoted that around 617,000 women were displaced from the workforce in the month of September 2020, as compared to 71,000 men. These women who were displaced from the workforce were in the prime working age of between 32-45 years. It's extremely important to make sure that we bridge this kind of gender digital divide, as well as the gender gap here, primarily because one of the issues that we are facing is that the pandemic has continued to exhibit inequalities in corporate America. Even where women are not being displaced from the tech workforce, they are at least being emotionally penalized because they are mostly the caregivers of the family.
It has taken a toll on their emotional and psychological wellbeing. Emerging technology domains are influencing almost all facets of our life, transforming almost all industries that we could possibly think of. When we have these technologies being used by almost all sections of society, doesn't it make sense for us to have an equal representation amongst the people who are actually building them? That's one of the very important reasons that we need to have gender equality and parity in terms of building these technologies. They need to be sustainable and used across different domains.
In terms of my experience, I wanted to be a doctor. Fortunately, or unfortunately, I happened to choose engineering for my graduation and pursued this particular domain. Data science was not a very popular field at that point of time. We had operational research, analytics and a lot of things. I was open to these nonlinear opportunities, and that actually helped me land a better position.
Being open to these new technologies is very important. That's what I would suggest to the newcomers. With all the information that we have at hand, we will still not be able to predict the next greatest technology that will be available in the next few years. Being open to these nonlinear career paths is going to be important as we explore and experiment more. In terms of what we are doing to bring more women into the data science domain, there are multiple initiatives that are consciously being taken by multiple people across the globe. One of the initiatives that I would like to bring everybody's attention to is WiDS: Women in Data Science. It was initiated by Stanford University with multiple partners across the globe. It has now become a global movement with around 100,000 participants across more than 100 countries.
It happens to be a datathon initiative wherein we call for a datathon challenge based on social impact. This helps data science enthusiasts across the globe to hone the required data science skill set. One of the requirements is that at least 50% of each team should be women. It encourages women’s participation in learning and honing new skill sets. There will be mentors assigned through these journeys as well as a lot of resources provided. At the same time, it’s contributing to social change. It's a very fulfilling experience. The second part of this initiative is the conference. As a part of this global conference, we celebrate a lot of luminaries in the data science field who are women, and create role models for the younger generation. The third aspect of this is the podcast series wherein we talk about some of the interesting paths that luminaries and female data scientists across the globe have taken, and some of the challenges that they face.
How did they overcome those challenges? The aim is to inspire more young women to pursue STEM career opportunities and excel in the domain. The last component is the educational outreach program wherein we are trying to capture the imagination of the girls in the higher grades of their education, so that they can explore the potential of the STEM courses and enter this domain. One of the recent studies by McKinsey and Company indicated that corporate America is at a crossroads. Decisions that the corporate leadership would make today will have consequences in gender equality for years to come, which is true. While there are a lot of opportunities, the future of women in data science or women in tech would primarily depend on the tech industry's ability to attract more girls to study STEM courses, enter the STEM domain and have a fruitful career, by creating the right ecosystem. Doing this requires not only encouraging them to join the corporate sector, but also the development sector so that we can create a better socioeconomic impact.
DAVE COLE
For those of you who are still asking why this is important, there's plenty of studies that have shown that a more diverse data science team leads to better outcomes. It's that simple. I think the main reason for that is that you get different opinions and so much of data science is experimentation. It is trying new ideas and looking at things from different angles. You also mentioned something that was interesting. Yes, there's the Women in Data Science conference, there's WiDS, which would be a great thing for other data science leaders out there to participate in. You used the phrase ‘nonlinear’ career paths. Tell me what you mean by that because I think that's another important way.
SRUJANA KADDEVARMUTH
So many times when you talk about career paths, you find it's engineering, being a doctor or a lawyer. But there are a lot of those offshoots where you can experiment and explore, which are intersections between different domains. Right? Those are nonlinear career paths, according to me, because they're not necessarily fitting into a particular well-defined conventional boundary of one domain versus the other. There are all of these different intersections, especially with robotics, the Internet of Things, the metaverse, immersive experiences and all of these technologies. There's a high potential for these interdisciplinary technologies. Being open to these opportunities and pursuing a career in that space is also going to be an interesting and exciting journey for most of us.
DAVE COLE
Yeah. It's not just being open to individuals getting into those, moving laterally, maybe starting in a pure engineering role then becoming a data scientist—or even maybe starting in the world of bio and then moving into being a data scientist. It's important for data science leaders to be open to hiring folks like this who are maybe trying to change careers and move into the data science realm. They’re taking night courses or classes to try to build up that expertise, coming from a non-traditional, pure data and analytics background. That's one obvious and easy way to increase the diversity of your team: opening your mind. You don't need to have a team full of PhDs in stats that have gone down this path for years and years and years.
SRUJANA KADDEVARMUTH
Absolutely.
DAVE COLE
Well, Srujana, I really enjoyed our conversation today. I think that we touched on a lot of important topics. It sounds like you have an exciting job solving some incredibly challenging problems. If people want to reach out to you, could they link up with you on LinkedIn or on social media?
SRUJANA KADDEVARMUTH
Yeah, absolutely. You can reach out to me at Srujana Kaddevarmuth on LinkedIn. I'd be more than happy to have a conversation. I would love to engage because I enjoy fostering communities of data science professionals and I would be happy to learn mutually as well.
DAVE COLE
Fantastic. Well, thanks for joining us on the Data Science Leaders podcast. I really enjoyed this, Srujana. Have a great day.
SRUJANA KADDEVARMUTH
Thank you for having me, Dave. Take care.
—
DAVE COLE
Hi, everyone. Dave here with a quick programming note before you go. This is our final episode of season one of the Data Science Leaders podcast! We'll be taking a bit of a break before launching a whole new season with even more conversations with fascinating data science leaders, like Srujana.
Our next season will sound a little different too because this also marks my last episode as your host. Let me just say that I've had a blast connecting with so many data science leaders, and most importantly, bringing those conversations to you, our listeners. I hope you've learned a ton, and I know that I have too.
When I started as host of the Data Science Leaders podcast, the goal of the podcast was always to bring you fascinating stories and to help you learn from other data science leaders. I wanted the audience—who might be aspiring data science leaders or existing data science leaders, or even folks who just were trying to learn a little bit about how data science works—to learn from all of our great guests. And I hope that that has happened. No matter why you listened, I hope you've learned as much as I have from all of our conversations.
After nearly 50 episodes, I wanted to highlight just a few of my big takeaways that I've learned from our guests.
The first takeaway is—and this might come as no surprise if you listen to the Data Science Leaders podcast—it's all about the people. I saw this in my own career when I was a Chief Analytics Officer, and it’s been roughly 20 years in the data science and analytics world. But finding the right talent is extremely difficult, and hiring that talent can be difficult as well. And we learned from many of our guests about some creative ways in which to do that.
One of the things that I recall as a theme that's come up time and again is talking about diversity in terms of the type of backgrounds and the type of skills that you bring to your team. It was rare when I talked to a data science leader that they had a team of data scientists that were strictly PhDs or statisticians. Sure, having a PhD in statistics or even data science these days is fantastic, and can be extremely helpful to your team. But almost all the teams that I talked to and almost all the data science leaders I talked to had a widely diverse background of folks. Some folks who came from the business side, some folks who came from more of an engineering developer background, but were interested in statistics, and then, yes, again, some folks who definitely focused on data science.
A couple of episodes in particular that stand out in my mind around building and managing teams are our episode with Vikram Bandugula at Anthem, and another with Jan Neumann at Comcast. Like so many other data science leaders I spoke to, they both had some really valuable insights on building high-performing and diverse teams.
One other key takeaway for me was there's always an emphasis on strong collaboration between the data science teams and the business. There are a variety of models there as well. In some cases, the data science teams were embedded within the various business units. And in other cases, they were more centralized, sometimes in a center of excellence type model. And in other cases, you saw a bit of a hybrid. Data scientists would be part of a center of excellence and then would go on rotations on the business side.
The other aspect of partnering with the business that I found very interesting is doing it early and often. When you look at the data science life cycle, it is very different from software, as we know. There's a lot of experimentation. That means taking your business teams along the way in that experimentation and educating them that it's very difficult just to say that a particular project is going to be done in X number of days or Y number of weeks.
I think one interesting takeaway I had from Indrasis Mondal at DocuSign, was he thought it was very important for a data science leader to not just help build models and help the business make various predictions along the way that might help out users. But also, to help out strategically. He saw the data science leader as really having a seat at the table when it comes to business strategy and looking ahead.
And I think that was very aspirational, but I also think it's very real. I think decisions that are made with data and evidence are usually the best decisions, especially when you're in the world of data science and you're looking at actual predictions as well.
I think the third and final takeaway is that when it comes to the potential of data science, I think we're really just scratching the surface. I saw data science leaders really focusing on wanting to change the world, and I think we're just getting started.
One of the most fascinating parts about talking to data science leaders across so many industries is learning the variety of ways in which machine learning can solve real world challenges. Every guest this season had a unique story to tell.
Susan Hoang at McKesson gave us a look at her team's work in oncology analytics, using data science to identify promising cancer treatments and provide critical feedback to medical teams. Mark Teflian at Charter brought us back to the earliest days of COVID-19 lockdowns. He shared with me how his team quickly mobilized and built models so that millions of Americans could stay connected online during the pandemic.
And then, Takuya Kitagawa from Rakuten Group shared his passion for using AI to better understand human intention, so we can design products and experiences that support our collective wellbeing. And last but not least, Robin Foreman from CVS Health took us into the world of data science applications in clinical trials, where models could literally help give patients a new lease on life.
There are so many stories like that, and I feel like we're really just scratching the surface and just getting going here. As data science leaders, it is our job to build the teams, the frameworks, and the processes that unlock real game-changing results. That's what it's all about.
So on that note, I want to thank all of our incredible guests from the season who shared their expertise and perspective. And most importantly, I want to thank you for listening. There's so much more to come on the next season of the Data Science Leaders podcast, and I hope you'll tune in. This is Dave Cole, signing off!
About the show
Data Science Leaders is a podcast for data science teams that are pushing the limits of what machine learning models can do at the world’s most impactful companies.
In each episode, host Dave Cole interviews a leader in data science. We’ll discuss how to build and enable data science teams, create scalable processes, collaborate cross-functionally, communicate with business stakeholders, and more.
Our conversations will be full of real stories, breakthrough strategies, and critical insights—all data points to build your own model for enterprise data science success.