We recently caught up with Robert Woolfson - Ph.D. in Computer Science from Manchester University, now heading up the data science team at Wix.com.
Robert, firstly thank you for the interview. Let's start with your background.
Data science has been a very small part of my life so far. Data has not. During my career, we used artificial intelligence, mostly neural networks and genetic algorithms, to create automated trading systems that were trained, optimized, and tested on historical data before being put into action in live trading. My Ph.D. focused on building a framework for optimizing, testing, and proving the reliability of trading systems. In my previous job I also started learning about big data and Hadoop. I couldn’t really figure out a practical way to use it for what I was doing, but I grasped the ideas coming through about parallel, distributed systems, and parallel processing was something I started to think about.
Finance data was always my wheelhouse, but I did get tired of it after a number of years doing a Ph.D. and a number of years working with it. I was frustrated at always dealing with the same problems and I felt like I needed a new challenge. That’s when I started looking more actively at the field of data science. I ended up interviewing with Wix, and I had no idea they were dealing with big data at the time. The interview was what really got me excited. We did a case study, an analytical example that was more theoretical than specifically about big data, and it was very interesting to me. I did well enough to get past the interview, and here I am.
What was the first data set you remember working with? What did you do with it?
The first data set that I remember working on was during undergrad, when I was playing with historical financial data: historical prices of the Dow Jones and the historical average from the last 100 years. That’s the data I ran through the neural network to find patterns in the movements of the daily changes. It was an undergrad project, and it focused more on creating the neural network than on accurately predicting stock market movements. At the time I was extremely interested in trading and the stock market; it was all about numbers and data. But it was an undergraduate project and I’ve come a long way since then. The technology has changed and the data has got a lot bigger.
What do you think of the title Data Scientist?
Some people really like “data scientist” because it sounds better than “analyst”. It’s just like anything: you can call yourself an engineer, programmer or developer, depending on who you’re talking to. I don’t really care. Sometimes when people don’t know what a data scientist is I will tell them I’m an analyst, and sometimes I tell people that I’m a programmer.
Now, let's talk a little bit about Wix.
We have a range of different types of analysts at Wix. We try not to make too big a deal about titles; we can have any kind of title we want as long as it’s not Chief something. My internal title is “data magician”, and one of my colleagues is “professor of data science”. We just mess around with titles because I never want to separate myself from the team. We all work together and everyone does a job that’s partially analytical, partially statistical, and partially data science.
Tell me a little bit about Wix. How would you describe it to someone who is not familiar with it?
How do you use data science at Wix?
I am pushing an analytical process with my team at Wix. Coming from a Ph.D. background, I start with a hypothesis: coming up with a hypothesis, then testing it. I’m not specifically talking about p-value testing. What I’m trying to implement is a thought process. For a start-to-finish piece of research, I want to look at the data and know how to ask the right questions. It’s not about saying “can we look at this data because it might be interesting”; it’s about looking at the data that we can ask the right questions of. Before I look at the data I try to ask questions whose answers are either A or B. Those are my two options, and then I can take it one step further. Before starting my analysis I roughly know what to expect, and therefore I know what I am going to do if I get answer A or answer B. My path is set in front of me and it’s clear. I just have to do the technical bit to find out which way to go next.
The final step in our methodology is that if you have an insight into a change that needs to be made, it is the job of the data scientist to be the product manager for that change. So we take it through the conceptualization of the problem, which is worked on with a product manager or a senior staff member who comes to us with an issue. We then ask the right questions and create a hypothesis that will lead to an actionable insight, and we manage that change once we get the answer. That’s the data science process that we are trying to implement across all analytical tasks within Wix.
What have you been working on this year, and why/how is it interesting to you?
At Wix we know that offering a better product means more users will convert to premium. We use data to provide constant product feedback. We have the products themselves, which the developers are building. Features in the products send data as events, and that data is stored in a Hadoop filesystem thanks to the BI team, who are data architects and engineers. That data gets aggregated and stored in an SQL database. The entire analytical team has access to both Hadoop and the database to research that data and to give feedback. This creates a constant feedback loop to the management about how we’re doing, and also feedback to the product teams about how their products are performing and whether there are any issues or problems. As a result we can constantly be on top of creating great products. A lot of that involves A/B testing. The analysts who sit with the product teams manage the A/B tests, so any time we want to make a change we can see if it’s hurting us or helping us.
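Editor's illustration: the interview does not say how Wix evaluates its A/B tests, so the following is only a minimal sketch of one common way to check whether a change is "hurting us or helping us", assuming the metric is a conversion rate. It uses a pooled two-proportion z-test; the function name and the visitor and conversion counts are hypothetical.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare conversion rates of variant A (control) and variant B.

    Returns (lift, z, p_value), where lift is rate_b - rate_a and
    p_value is two-sided under a normal approximation.
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        raise ValueError("no variation in outcomes; the test is undefined")
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_b - rate_a, z, p_value


# Hypothetical counts: 10,000 users per variant.
lift, z, p_value = two_proportion_z_test(200, 10_000, 260, 10_000)
print(f"lift={lift:.4f}, z={z:.2f}, p={p_value:.4f}")
```

A positive lift with a small p-value would suggest the change is helping; a negative one, that it is hurting. In practice the sample size and the decision threshold should be fixed before the test starts.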
If we find issues with performance, we focus on fixing them, for example by decreasing the lag time. The knowledge that comes from the data is our niche. We’re currently the link between the knowledge and the data for everyone across our organization. At the moment the analyst is the bridge between the data on one side and the developers and product managers on the other. But as time goes on we’re looking at building tools that will help with that, so the product managers themselves can get access to the data sets. That will essentially free up the analysts so they can focus on deeper research.
We’ve gone through a sort of transition. Wix did a Super Bowl ad, which was all about the concept “start a company, build a website.” We’re not just about building a website; there’s a business element there as well. We released hotels last year, which allows people to manage a small hotel. You can add rooms to your hotel on the front end and also manage the bookings on the back end, including the payments, and it’s all in one system. What we’re looking at next in terms of data is how we can continue to use that data to help our users be more successful with their own businesses. The whole company is involved in helping people build their businesses.
We’re also very inspired by the OKCupid and Pinterest data blogs, and we’re thinking about implementing something like that at Wix. The only problem with this initiative is that a lot of the data is proprietary. So we’re thinking about creative ways to release the data insights and talk about them.
What Data Science methods have you found most helpful?
The process that we are implementing at Wix is itself a tool. Part of the methodology is being able to present at any time. I want people to start their research with a blank PowerPoint presentation and build it as they go along. The great thing about that is you get constant feedback from your team, and that way you can save yourself a lot of time. What I’m always thinking about is how I can manage these processes better and how I can make my data scientists more efficient.
Any words of wisdom for Data Science students or practitioners starting out?
The key is to learn how to ask the right questions. That means asking the right question but also thinking about what the outcomes could be. It’s something I learned in high school from a physics teacher, who said, “just estimate what the right answer’s going to be before you even touch a calculator or computer.” You can probably estimate it closely enough. Think about it: when you get an answer back from your computer, if it’s not the right order of magnitude, then you know you have made a mistake somewhere and it needs a second look. Thinking ahead is really the key to doing good research.
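Editor's illustration: the physics teacher's advice can be written down as a small sanity check. This is a sketch only, assuming positive quantities and treating "the right order of magnitude" as "within a factor of ten"; the function name, the tolerance parameter, and the numbers are hypothetical.

```python
import math


def same_order_of_magnitude(estimate, computed, tolerance=1.0):
    """Return True if computed is within `tolerance` powers of ten of estimate.

    Both values must be positive, because the comparison is made on a
    log10 scale.
    """
    if estimate <= 0 or computed <= 0:
        raise ValueError("both values must be positive")
    return abs(math.log10(computed / estimate)) <= tolerance


# Hypothetical numbers: a back-of-the-envelope guess versus two computed results.
guess = 2_000
print(same_order_of_magnitude(guess, 2_600))    # plausible result
print(same_order_of_magnitude(guess, 260_000))  # off by two orders: take a second look
```

The point of the check is not precision. It only flags results that are far enough from the prior estimate to suggest a mistake somewhere in the analysis.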
Thank you so much for your time! Really enjoyed learning more about your research and what you are working on at Wix.