
Modernizing Healthcare Through Data Science and Digital Transformation
Summary
In healthcare, only 14% of scientific discoveries actually make it into clinical practice. But data science, in lockstep with the digital transformation, is helping to change that.
As healthcare data and clinical studies transition to digital form, the opportunity to use data science and AI to generate insights and recommend treatment pathways is greater than ever. And the ability to make healthcare delivery more equitable is within reach.
In this episode, Kaushik Raha, VP Data Science & Health Content Operations at Elsevier, explains how data science is transforming the healthcare industry. Plus, he shares his thoughts on bias and some best practices for operationalizing data science.
We discuss:
- How data science is helping to modernize healthcare
- Working with clinical analytics to root out bias
- Advice for operationalizing data science
Transcript
DAVE COLE
Hello, welcome to another episode of the Data Science Leaders podcast. I am your host, Dave Cole, and today's guest is Kaushik Raha. Kaushik is the VP of Data Science at Elsevier. Kaushik, first of all, welcome to the Data Science Leaders podcast!
KAUSHIK RAHA
Thank you, it's great to be here with you Dave. Thanks for inviting me.
DAVE COLE
You're the VP of Data Science at Elsevier. You've been there for three years or so. You have 15+ years of data science experience in the pharmaceutical and healthcare area. You also have a PhD in computational chemistry, but let's start with Elsevier. What is Elsevier? I'm not familiar with it, personally.
KAUSHIK RAHA
Sure. Elsevier is a global publishing company that is transforming into an information analytics company. We have our roots in publishing, where we have published very well known books and journals. I'll give you an example: The Lancet is an Elsevier journal, and so is Cell. Dr. Fauci held up a paper from The Lancet, if you remember.
Now that we are transforming into an information analytics company, we want to take our content, extract the data and generate insights from that data. We serve many different sectors: there's the research sector, and there's also the life sciences and pharmaceutical sector. What I'm in charge of right now is data science within the healthcare sector for Elsevier.
DAVE COLE
Perfect. Well, that brings us right to our agenda, because the first topic we're going to dive into is data science and healthcare. I have spoken with other data science leaders within the healthcare industry. Because you service the broader healthcare industry it would be nice to get your lens on the state of healthcare and how data science plays a role there.
Also, we're going to be diving into another topic that is near and dear to your heart, which is bias in data science. That comes up from time to time, but we're going to really dive deep into it in this episode. Last but not least, another recurring theme: operationalizing data science. We'll also be getting your thoughts on that too.
Let's start off with data science and healthcare. How do you see data science? First of all, Elsevier, you mentioned, is transforming into an information analytics company instead of being a pure-play publishing company. Can you talk a little bit about that transformation effort?
KAUSHIK RAHA
Like I said, we have our roots in publishing, but now what we want to do is take the content that we own as a publisher and essentially drive outcomes and decisions. These could be decisions in research or healthcare. What we want to do is to introduce decision support: leveraging our content, extracting the data and insights from the data then driving outcomes with those insights.
In all of this, the underpinning is data science and artificial intelligence, now and in the future. The reason is that we can only be successful in this if we're able to leverage data science and AI technology. We have a lot of content. We want to extract the data from the content and not just use that data, but also marry it up with other datasets.
For example, in healthcare you can think of patient data or electronic health records and medical records: how can we combine these different silos of datasets and essentially support decision-making in the clinic or in scientific research? The only way you can do this is by leveraging data science at this point today.
DAVE COLE
Right. Elsevier, I imagine, has lots of content. It has traditionally been a publisher. A lot of the analytics that you're doing are probably NLP-type use cases. Is that right? You said marrying that up with patient data, electronic health records. That is probably not data that you have, that’s data that your customers have and bring to the table and you help marry the two?
KAUSHIK RAHA
Yeah, exactly. That's the data that our customers have. We have products where we interface with the customer and they bring that data to us. We bring in our expertise in content and data products that are coming out of it. I'll give you an example, one of our products is ClinicalPath, which is essentially a decision support tool for oncologists who are caring for cancer patients.
It gives the oncologist the treatment pathway for cancer patients. What we do is record whether they have been on pathway or are going off-pathway. We basically give them the pathway for treatment, but also the rationale for it, supported by the evidence that we have extracted from our content.
What we also get in return is the patient journey viewpoint from the provider, from the cancer centers or whatnot. That's where we want to do the analytics to make the product better: how can we make this pathway even more effective or robust and lead to the best outcome? We want to capture the patient journey, the end of the journey or the outcome, and connect it back to the data that we're getting from the consultant.
DAVE COLE
Right, so you're getting some of the real world data from your customers. You mentioned the pathway to treatment data comes from much of the published work that Elsevier has collected over the years in terms of the typical pathway to treatment for a specific type of cancer, for example.
KAUSHIK RAHA
Yeah, it's a little more than that. In addition to that, there is a group of experts who synthesize these pathways. We enable that as well by providing the evidence for it. It's not just a data product that is automatic. There's a lot of oversight because we are talking about life and death decisions here, so we want to get it right.
Enabling creation of these pathways is where we do the data science, where we're extracting or curating the content, getting the evidence from there and marrying it up with patient data or other datasets that we get from the providers and social data. You also want to tie it up with social determinants of health. Elsevier’s parent company, RELX, has a lot of that data as well. That's where me and my team operate, in the intersection of these different areas.
DAVE COLE
Right. What you're actually doing is providing a service, right? You've created a product, I assume, essentially for the various customers that you have. Most of your customers are in the healthcare industry so they're healthcare companies as well? Okay, you're nodding your head.
KAUSHIK RAHA
Healthcare providers, just to make a distinction. We don't do much with payers, the insurance companies, although we do have the leverage to get in there. There are also pharmaceutical and life science companies who are developing drugs and medicines and want to understand the clinical trial space: where can they plug into clinical trials based on our data products or services? It's that spectrum of customers: providers, and also pharma and life science companies looking to develop medicines.
DAVE COLE
Got it, got it. So let's go back to the original question. Just from your catbird seat, in working with various customers, what do you see as some of the biggest challenges facing the healthcare industry today? And what role do you see data science playing?
KAUSHIK RAHA
I think I'll frame this question in two parts. One would be the pre-COVID, and one is the post-COVID world, right? Even before COVID, the healthcare industry was facing challenges. COVID brought about its own set of challenges. At least from my vantage point, I think there are four or five main challenges. One of them is the burden of medical errors.
Today there are a lot of diagnostic and therapeutic errors which lead to adverse events. There's a striking statistic from a recent study: every year, 43 million patients are injured due to these adverse events. That's avoidable, and that's huge.
DAVE COLE
What is an adverse event?
KAUSHIK RAHA
An adverse event is something that has happened as a result of a mistake in a diagnosis or therapy. For example, there was a misdiagnosis, the wrong medication was prescribed, or the wrong procedure was done. That's an adverse event. It leads to patient injury and is absolutely avoidable. So these statistics are pretty mind-numbing, and I think that's a big challenge for healthcare. The other one, I would say, is the data explosion we're seeing right now in healthcare. Healthcare data is increasing faster than in any other industry, and the projected growth is almost a 36% compound annual growth rate through 2025. This is definitely big data. I think the problem with healthcare in general is that it's fragmented and siloed.
You have many health systems, payer and provider systems with their own data silos, and there is no way to connect them, at least not easily. That makes wholesome insight generation or decision support quite challenging. The third one, I would think, is the information explosion. From my point of view, it's the explosion of the scientific literature and how it's not being converted to clinical practice soon enough.
Again, according to some statistics, just 14% of scientific discoveries actually make it into daily practice. This presents a unique challenge because we are always talking about evidence-based medicine or evidence-based care, but we're falling so far behind in getting the real evidence out in clinical practice. So that's a big challenge.
Finally, I think the model of healthcare delivery has changed, right? It has gone to value-based care from a fee-for-service model. That presents its own challenges. Healthcare providers are being asked to do more with less. They're being asked to spend the least amount of money but improve patient outcomes.
Given all of the above, I think that becomes even more of a challenge. I will say post-COVID, we are faced with a whole other set of challenges, right? There's digital transformation, telehealth, how you manage supply chains for manufacturing drugs or vaccines and whatnot.
There are definitely more considerations around privacy, personal information and cyber security. I think all of it makes it quite a mix right now in the post-COVID world. Coming to your second question about data science, I think it plays an important role and will continue to do so. I might be biased but I can't think of any other tools out there which we can use for managing, organizing and generating insights, except for data science and AI.
Because of the explosion in the volume of data that we have right now, I think data science has a role to play. It will be playing a role in digital transformation all across the globe, which will hopefully make healthcare delivery more equitable. Right now we have a big problem in that it's not equitable: some places or people are getting the best, whereas others are left behind. I do believe that data science plays a role in all of this.
DAVE COLE
Wow, we just got bombarded with a lot of great information from you. I have a few questions. I've taken some notes along the way here, Kaushik. You mentioned the data explosion: 36% more data, year over year. Do you know what is causing that explosion? Why is there more data today than there was even a year ago?
KAUSHIK RAHA
It's a combination of many factors. Digitization is causing it. Everything is becoming digital, right? Patient records are now digital; x-rays; everything that you can think of is basically going into a database somewhere. That's one aspect. Then we have biosensors, the Internet of Things and healthcare within your home: there's this continuous deluge of devices that are measuring things and putting them somewhere in a database.
It's happening at the individual level, right? Imagine what that does; the exponential effect is tremendous. I'll come back to scientific research: even that is growing in leaps and bounds. When you combine all of this, you can see how we get to that number. It's probably going to increase in the near future.
DAVE COLE
I'll give you a personal example, Kaushik. I coincidentally had my checkup yesterday. Doctors, for whatever reason in the medical profession, tend to be fairly old school. They're paper-based. Getting them off of paper and capturing things electronically is something they tend to push back on, and I don't entirely understand why. I think it's mainly because of speed, right?
Typing stuff in is more painful than just quickly writing it down. I walked in, I looked behind the front desk and there was a wall of, I believe, medical records in files—just sitting in walls and walls of paper. During my checkup, my doctor’s asking me a lot of these questions that I'd answered years ago, right? “Do you have any history of heart disease in your family? History of cancer?” All this stuff.
I was thinking to myself that I've already answered these questions but I saw him typing it in. He was typing all the things into one of these systems. I was thinking to myself that he was now transforming; getting off of the paper base. Of course I would have liked for him to have gone through the paper and typed it in ahead of time and just confirmed a lot of these things. He was just typing it in for the first time.
I think you're right that there's a lot of that going on. That's what you mentioned, is digitization. The providers are moving into the digital era. Many of our listeners who are not in the healthcare industry might be scratching their head, right?
KAUSHIK RAHA
Exactly, and to be honest, I think the regulatory aspects around healthcare data, privacy and data security have slowed things down. And for good reason, right? It's not like Target or Walmart, where you're going to buy your diapers or whatnot. This is your healthcare history we're talking about. The approach, then, is better safe than sorry: "I don't want to be sued down the road for a data breach" or something like that, right?
You do understand it, but it's also becoming something that is insurmountable. It's becoming a huge issue right now, because you still see doctors doing this or even nurses taking all these notes. At the same time, they're asked to enter these into healthcare records and do that work themselves. That's actually a physician or nurse burden, which needs to be relieved.
That's where, again, data science and AI come in, in a big way. It would be really great if it was seamless, right? When the doctor is talking to you, the digital assistant does that work of recording that, making sure that its quality assurance is there, the privacy aspects are maintained and all that stuff. Then it gets entered into a digital database with a few checks and balances there, for sure.
How cool would it be if, even though the doctor is asking these questions, they're not the ones who have to go and enter the answers, which is where we are right now? That's where we can do a lot.
DAVE COLE
Yeah, I smell a product there, Kaushik. Something like a speech-to-text one, where the answers to these questions make their way into some structured dataset.
KAUSHIK RAHA
It’s like dictation software that's out there. Actually, Elsevier is also partnering with some companies where we do go a step further and can also provide suggestions for diagnosis or tie it up with patient history. It’s a clinical advice type of platform that we are building. There's a lot going on there, and it's an exciting area.
DAVE COLE
I mean, this is a transformation that is a long time coming, and I'm glad to see that happen. I'm glad to see that you're hoping to make that happen. One last thing, statistically, you threw out that 14% of scientific discoveries actually make their way into general practice. I mean, first of all, that’s terrifying, right? To think of me as a patient, that if there's some discovery that's already happened and my doctor is not aware of it, and it's not part of their standard practice, that's very scary. What do you think the cause of that is?
KAUSHIK RAHA
It goes back to the burden of keeping yourself informed and abreast, right? How much do you expect the doctor to do? Not to throw statistics out here, but apparently it would take a doctor, on average, 21 hours a day to stay abreast of the literature.
Think about that number, right? How are you going to do that? That's one thing, and the other aspect here is that, as you mentioned, doctors are set in their ways. They have done their clinical practice. They believe that experience is all that matters: they've seen other patients, and they don't need to know anything else. I'm putting it in rough terms, but to be honest, that's an issue.
There is inertia there, right? I mean, if I know this works then why should I go and read a paper or try to find out? We try to bridge that gap by, again, extracting the evidence and presenting it in a very contextual manner to the physician or clinician when they're actually seeing patients. How can you get that latest scientific research to clinical practice as quickly as possible?
DAVE COLE
It is data science absolutely going out there and ideally, in a concise way as well, helping to curate and make their lives a lot easier in a contextual way. That makes total sense. Let's switch gears a little bit, and I want to move on to our next agenda topic, which is talking about bias in data science. We mentioned that in the post-COVID era, the security and privacy concerns are all important.
Bias has always been something that data scientists need to think through, certainly in healthcare-related use cases. Talk to us a little bit about how you see bias playing a pivotal role, how you've addressed it personally within your own team.
KAUSHIK RAHA
First of all, let's just define what bias is, right? Recently I saw a really great essay by Daniel Kahneman, the Nobel laureate, who defines bias as 'predictable errors that cloud your judgment and take it in a different direction'. Basically, bias is the average of errors: if there's directionality, there's bias; if there's no neutrality, there's bias, right?
Bias is important to address, especially in healthcare. In my field it has a network effect. What we put out there is used by clinicians, students who are learning to be doctors or nurses. If bias exists in those products it is really insidious and can end up doing a lot of harm. We have to be very careful that our products don't have bias.
How do we address bias? It's actually a hard problem. One thing that we want to do is make sure that bias is not there in our content in the first place, right? Our data products are essentially derived from our content, so if there's bias in our content then that's something that we need to take out. Bias comes in different forms: explicit and implicit. Implicit bias is harder to uncover and correct for. Explicit bias is somewhat easier.
DAVE COLE
Define that for us, if you don't mind. What is explicit versus implicit bias?
KAUSHIK RAHA
Explicit bias would be something where you're using explicit categories or terms in your description of a thing or a decision that you're making. In the context of the criminal justice system, I can give you many examples, like targeting ethnicity or race. Your decision has that very explicitly in there, right? That's explicit bias.
Implicit bias is where that's not defined clearly, but there's still bias. It's in the way that something's being said without actually mentioning categories or ethnicities and so on and so forth. It's the way a study has been designed if you want to do some research. Let’s say that you've designed a study that is biased in nature, right? The outcomes are going to be biased and perhaps you are biased. That's why you are doing that study, right? Those are implicit bias examples that you have to be aware of.
We have to be very sure that bias is not in our content, both in explicit and implicit form. This could be question banks that we are selling to educators for training physicians, or nurses, or case studies that are used for clinical student support. We comb through our content and try to find bias in a retrospective manner, as well as build these tools that can do this in a prospective manner.
DAVE COLE
Are you saying that your team of data scientists are actually looking through the content, trying to identify biased questions and flag that in your models in some way?
KAUSHIK RAHA
In existing content, yeah. They make sure that the content being used for modeling purposes doesn't have that bias. I'll give you an example. I mentioned explicit bias. We can do a keyword-search type of strategy, which is much simpler, to identify where explicit bias might be. Then we have a phase where SMEs (subject matter experts) look into that and judge whether there's real bias or not, or whether there's implicit bias.
Then we create these gold or validation sets, and we can build the ML algorithms on those datasets. That's how it works. It's actually a very challenging task and needs a lot of SME support. Not only is there bias here, there's another dimension of the subject matter itself, which is very complicated. This is medicine, surgery or nursing. These are not easy; these are highly specialized topics, right?
You're looking in two dimensions here for bias in highly specialized fields. The other aspect would be how to address representation of different groups. The balancing of the datasets used for model-building is very important. We have to do this activity, and that's how we address bias on my team.
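The two-stage screen Kaushik describes (a simple keyword pass over content, followed by SME review) can be sketched roughly in Python. This is only an illustration of the idea; the term list, document IDs and text below are invented, not Elsevier's actual vocabulary or pipeline:

```python
import re

# Illustrative list only; a real screen would use a much larger,
# clinically reviewed vocabulary of explicit category terms.
EXPLICIT_BIAS_TERMS = ["race", "ethnicity", "gender", "nationality"]

PATTERN = re.compile(r"\b(" + "|".join(EXPLICIT_BIAS_TERMS) + r")\b",
                     re.IGNORECASE)

def flag_candidates(documents):
    """Return (doc_id, matched_terms) pairs for SME review.

    This is only the first-pass keyword screen: every hit still needs
    a subject matter expert to judge whether real bias is present."""
    candidates = []
    for doc_id, text in documents.items():
        matches = sorted({m.group(1).lower() for m in PATTERN.finditer(text)})
        if matches:
            candidates.append((doc_id, matches))
    return candidates

docs = {
    "q101": "A 45-year-old patient presents with chest pain.",
    "q102": "Given the patient's race and gender, which diagnosis fits?",
}
print(flag_candidates(docs))  # [('q102', ['gender', 'race'])]
```

In a pipeline like the one described, the flagged candidates would feed the SME review phase, whose judgments then become the gold and validation sets for the ML models.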
DAVE COLE
Trying to summarize a bit here: you have explicit bias where that would be using gender, age or ethnicity explicitly in your models. You’re trying to root that out and prevent that from being part of your model. Then there's an implicit bias, where it could be a result of the way in which the data was collected. Maybe it wasn't intentional but it just happened to have a skew in some of those demographic categories in some way or another.
In my experience, there's also a proxy. You have to be aware of proxies for certain things. It might not be explicit, right? For example, a particular zip code might have a certain age, gender or ethnic makeup. How do you root out some of those types of bias by proxy? Is it a form of explicit bias or do you consider that implicit? Where does that fall?
KAUSHIK RAHA
I would put that into the implicit bias category, where there's a proxy. Those are, again, hard to root out. That's where we look for knowledge of the domain and of what the proxy actually means. In the context of medicine, or health, that is even more complex, right? It's not just a demographic database; there's also a healthcare component there. We have social data, SDOH (social determinants of health) data, which helps with those kinds of investigations.
We can do an SDOH analysis on its own and see what biases there are in the social data, to begin with. When you're using that for model-building, combining it with healthcare data, you can get an idea of what those proxies might be and correct for those, or at least account for those in your models.
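One rough way to screen for the kind of proxy discussed here is to measure how strongly a feature like zip code statistically determines a protected attribute; mutual information is one possible association measure. The data below is invented purely for illustration:

```python
from collections import Counter
from math import log2

def mutual_information(xs, ys):
    """Mutual information (in bits) between two categorical columns.
    A high value means xs nearly determines ys, i.e. xs may be acting
    as a proxy for ys even though ys never appears explicitly."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), count in pxy.items():
        p = count / n
        mi += p * log2(p / ((px[x] / n) * (py[y] / n)))
    return mi

# Invented example: zip code perfectly predicts group membership,
# so zip code is a strong proxy (one full bit of shared information).
zips = ["02101", "02101", "60601", "60601"]
groups = ["A", "A", "B", "B"]
print(round(mutual_information(zips, groups), 3))  # 1.0
```

A value near zero would suggest the feature carries little information about the protected attribute; in practice this kind of screen is only a starting point for the domain-knowledge investigation Kaushik describes.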
DAVE COLE
The second thing I heard you say is that, especially in the world of healthcare, you're dealing with very technical papers written by PhDs and doctors. As a data scientist, a member of your team can't be expected to understand all the ins and outs of all the various papers, so bringing in an SME can help with that.
Have these SMEs been trained in the basics of what to look for when it comes to bias, or do you just bring them in for their specific expertise?
KAUSHIK RAHA
Absolutely. They have a specific expertise in medicine. Most of the time my team has not just data scientists, but also subject matter experts. At one point, we had more subject matter experts than data scientists.
DAVE COLE
Really? Wow.
KAUSHIK RAHA
Yeah. Now I guess it's like 50/50.
DAVE COLE
I'm curious, what is their title?
KAUSHIK RAHA
They're mostly like MDs or MD PhDs.
DAVE COLE
Right, but what is their role?
KAUSHIK RAHA
The title is clinical analytics specialist or knowledge representation specialists.
DAVE COLE
And they get paired up with data scientists for various projects?
KAUSHIK RAHA
Absolutely. I actually always want SMEs to be on the project, even if there is no obvious SME component, because it keeps the data scientists grounded. Keeping data scientists close to the subject matter is really important in model-building. My teams always comprise SMEs and data scientists.
The reason is that we get into the creation of gold sets, where the SME knowledge is important, and then we pair them up with the data scientists' work for that purpose. It's the whole idea of active learning, where your machine learning model is giving you an output and then you're using that to improve your validation, test or gold set. That's driven by the combination of these skill sets. I forgot the question, though. What was it that you asked?
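The active-learning loop described here, where the model's least-confident outputs are routed to an SME and the labels are folded back into the gold set, might be sketched as follows. The toy model and scores are invented for illustration:

```python
class ToyModel:
    """Stand-in for a real classifier; here an item is just a score in
    [0, 1] and confidence is its distance from the 0.5 decision boundary."""
    def __init__(self):
        self.trained_on = 0

    def confidence(self, item):
        return abs(item - 0.5)

    def retrain(self, gold_set):
        # A real model would refit here; we just record the gold-set size.
        self.trained_on = len(gold_set)

def active_learning_round(model, unlabeled, gold_set, sme_label, k=2):
    """Send the k least-confident items to the SME, add their labels
    to the gold set, and retrain the model on the enlarged set."""
    least_confident = sorted(unlabeled, key=model.confidence)[:k]
    for item in least_confident:
        gold_set[item] = sme_label(item)   # SME supplies the label
        unlabeled.remove(item)
    model.retrain(gold_set)
    return gold_set

model = ToyModel()
unlabeled = [0.9, 0.48, 0.1, 0.55]
gold = active_learning_round(model, unlabeled, {}, sme_label=lambda s: s > 0.5)
print(sorted(gold))  # the two scores nearest 0.5: [0.48, 0.55]
```

The key design choice is spending scarce SME time where the model is least sure, which is why the pairing of SMEs and data scientists matters.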
DAVE COLE
I was asking about whether or not these specialists have expertise in bias, or they just have expertise in medicine alone.
KAUSHIK RAHA
The short answer is yeah, they have experience in medicine. They do understand bias. The thing about doctors and people who see patients is that they have been in the real world. They have people-facing jobs, right? It's not like they're siloed and sitting in a lab somewhere, doing their own experiments. They do understand the dynamics of bias and how it works in the real world.
DAVE COLE
They almost play your customer role, in a sense. They're looking over the results and asking ‘is the doctor actually going to use this’? That's a very valuable role because, given that Elsevier is creating products for the various providers out there, you may not have easy access, right? It's not like you're a healthcare company, yourself, and have access to them.
I should have started with this question, Kaushik. I just want to wrap up with probably the most important question here around bias and in data scientists. Why does it matter? Why is it so important for your team to root out bias in data science? I know that's a kind of weird question, but I think it's fundamental.
KAUSHIK RAHA
It's a matter of life and death, very simply, right? If you say that you're doing clinical decision support, where you're giving a physician a direction to go, in terms of treatment, diagnosis or procedure, bias in those models means the outcome can be really bad. Even if it leads to a minor injury, is that acceptable?
DAVE COLE
To be blunt, the outcome could be wrong, right? The dataset that you're using could be inherently biased, and so the model built on it might be providing them with the wrong recommendation.
KAUSHIK RAHA
Yeah, you might be providing the wrong recommendation. That's one of the biggest reasons. Also, there are categorizations of the products that we have, and we are not going purely into decision support, because from an FDA standpoint that becomes a different class of product, right?
We are more in the business of recommendations and clinical advice, if you will, but the point still remains that if you are giving wrong advice, then it can be a matter of life and death. One more point I'll make is that as we are building these products, we see this inflection where doctors or clinicians go from not accepting an AI or ML-based product to becoming completely dependent on it.
This is to the point that they trust it with everything that they have. That's even more dangerous because at this point you have that trust but then you have the bias, which is going to be really bad.
DAVE COLE
Most of the time we talk about, on the Data Science Leaders podcast, how to get some of the business users in the medical profession to buy in and adopt. You're pointing out what happens at the extreme, where they've already bought in, hook, line and sinker, and they almost trust it over, potentially, their own instincts.
Then suddenly, you can't be wrong. How should they actually use those recommendations, making sure that they are always balanced with some sort of real-world expertise and experience, right?
KAUSHIK RAHA
I do want to highlight this point because we talked about bias. We always think of how you get the buy-in and make sure that your customers have buy-in. How does bias play in there? I think your listeners and everybody else who's practicing data science should be very cognizant of the fact that there's the other extreme, where people are depending on what the algorithm is providing as a decision. That's even more dangerous so we definitely make sure that we get it right.
DAVE COLE
Perfect. This is a great segue into our last agenda topic, which is to talk about operationalizing data science. Are there any tips and tricks that you have for the audience in terms of best practices? Because you've been creating these analytic products for a few years, now, what advice do you have?
KAUSHIK RAHA
One of the key insights that I want to share is to really understand what your customer needs are. Figure out what exactly it is that you want, that you have to operationalize. Is it DataOps, MLOps or AIOps? By that, what I mean is, all these things are different, right? There's a different burden and there's a different expectation that operationalization will carry when it comes to any of this.
For example, I would say that DataOps is probably much less challenging, and things can happen asynchronously: your modeling or data science team can be building the models while the product is fed the right data, which is less of a burden to operationalize. On the other extreme are ML models that need to be live, that consistently have to perform with really low latency, because real-time prediction is what the product is, or what is being used.
There the burden of operationalization is a lot higher. For that we have best practices like continuous integration, continuous deployment (CI/CD), which we use pretty regularly and heavily in Elsevier. What we do is make sure that the data science and engineering teams are working really closely. There is this continuous dialogue and everything that's being developed is deployed and tested as separate processes.
For this, there needs to be a very good understanding of what the CI/CD pipeline looks like, not just by the data scientists, but also by the engineers and architects. Everybody has to be on the same page with a very clear definition of roles and responsibilities. One of the things I've noticed is, due to the commoditization of data science, a lot of times engineers will start doing data science and the whole CI/CD pipeline breaks down at that point.
As a leader, I think you have to draw the lines and say, "This is the pipeline, do we agree on this or not?"
Then stick to that. The other thing I would say is to really think hard about what success is. What does success look like from an operational standpoint? Make sure that you have some metrics, which are very general, like what would be the best outcome for this operationalization or production initiative?
Record that over a period of time, so you can compare across different operationalization initiatives. That way you know you're moving the needle in a very data-driven manner, and you're improving your best practices for operationalizing your models. I think those are some of the key insights I have for people who are doing this and deploying ML models.
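The metric-recording idea above can be sketched in a few lines. This is a minimal, hypothetical example of logging one record per initiative so runs can be compared over time; the initiative names, metric names, and helper functions are all invented for illustration:

```python
import time

# Hypothetical operational-metrics log: one record per deployment
# initiative, so different operationalization efforts can be compared.
def record_run(log: list, initiative: str, metrics: dict) -> None:
    """Append a timestamped record of one initiative's outcome metrics."""
    log.append({"initiative": initiative, "ts": time.time(), **metrics})

def best_so_far(log: list, metric: str) -> dict:
    """Return the logged run with the highest value for the given metric."""
    return max(log, key=lambda run: run[metric])

runs: list = []
record_run(runs, "search-ranking-v1", {"precision": 0.72})
record_run(runs, "search-ranking-v2", {"precision": 0.78})

print(best_so_far(runs, "precision")["initiative"])  # search-ranking-v2
```

In practice this record would live in an experiment-tracking or monitoring system rather than an in-memory list, but the principle is the same: the comparison across initiatives is only possible if the metrics are captured consistently over time.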
DAVE COLE
Got it. You mentioned making sure that clear roles are defined and that the engineers are not actually doing data science. Do you actually have engineers who are like, "I'm going to retrain this model, I know what I'm doing. I don't need to bother the data scientists"? Does that actually happen for you?
KAUSHIK RAHA
It can happen. Sometimes it does happen.
DAVE COLE
Well, that's obviously a bad thing. You don't want to have that happen. But who knows? Maybe that's a future data scientist on your hands or definitely a future ML engineer.
KAUSHIK RAHA
Yeah, I call that a career track.
DAVE COLE
Right, right.
KAUSHIK RAHA
But on the other hand, it's not the worst thing ever. I'm just saying that there has to be communication when this is happening. If an engineer is retraining the models, maybe that's perfectly fine, as long as the data scientists and engineers have a shared understanding that, as part of the CI/CD process, the models can be retrained when new data is supplied.
DAVE COLE
User data.
KAUSHIK RAHA
Just so that everybody is in consensus about that view.
DAVE COLE
Yeah, I mean, you started off saying that one of the most important aspects of productionalizing your models is that tight partnership with the engineering team. Some of the data science leaders I talk to actually have ML engineers on their team who are responsible for this. It sounds like in your situation, they're probably part of another team, maybe part of the IT org or what have you.
Obviously, both models exist out there. Do you have a preference? If you could wave a magic wand, would you prefer to have ML engineers on your team?
KAUSHIK RAHA
Yeah, actually we are moving in that direction. We have some positions that we are hiring for more ML engineering type of roles. They would really be the bridge between your technology and data science teams. That's a good model in my mind. In some ways, we can talk about career tracks and various frameworks that also work out in that way.
A lot of data scientists are very interested in engineering aspects. They want to know how to scale their models. They want to operate over terabytes of data or whatnot. Then there are the engineers who want to move into this other direction and really understand modeling. It works both ways.
DAVE COLE
That's great. This has been great. I've learned a lot on this podcast about how data science is playing an active role in modernizing the healthcare industry, and how you and your team at Elsevier are helping make that happen.
We've talked a bit about bias in data science: why it's important and some things to watch out for. And then on getting models into production, we talked about the tight partnership between the engineering and data science teams, roles and responsibilities, and a number of other things.
I really appreciate you coming on the podcast here, Kaushik. If folks are interested in following up and having conversations with you, can they reach out to you on LinkedIn?
KAUSHIK RAHA
Absolutely, I’d love to have any follow-up conversations or receive any feedback. I love the feedback, and thanks for having me. It was a great conversation and I really enjoyed it, myself.
DAVE COLE
You bet. I really appreciate you coming on and have a great rest of your week, Kaushik.
KAUSHIK RAHA
Absolutely, thanks Dave. You too.