Robert Stratton - When Worlds Collide: Neustar Reflections on Privacy, Data Science, and Analytics
"The more we can understand why things are happening, the more levers there are that marketers can use to affect outcomes."- Robert Stratton
In the seventh episode of No Hype, Robert Stratton, SVP of Data Science at Neustar, shares his expertise on privacy, data science, and analytics. Stratton and our podcast hosts discuss everything from what's exciting about data science right now, to how we can start applying tried-and-tested methods in new and interesting ways, to why Stratton believes privacy is more of a condition than a technique.
Listen out for Stratton’s take on over-hyped subjects such as machine learning, the role of clean rooms, and why he thinks we’re moving into a new era of information.
Allyson Dietz: Welcome to No Hype, the podcast about truth, science and the future of Marketing, brought to you by hosts Allyson Dietz...
Devon DeBlasio: ...and Devon DeBlasio.
DD: So today's guest is Robert Stratton, SVP of data science at Neustar. Robert has a PhD in computational modeling, as well as a background in computer science and philosophy. He has over 15 years of experience across a range of roles in organizations, including managing director at WPP and analyst at PHD. Robert, welcome to the show. Happy to have you on.
Robert Stratton: Thanks guys. Great to be here.
DD: And so you are actually our first Neustar guest.
DD: First and foremost, with a doctorate spanning both computer science and philosophy, that's interesting. I'm curious, how do you see those two worlds colliding in the work that you do and the work you've previously done? What is actually the connection there? Why was that something you chose to do?
RS: Yeah. So there's a small field called computational philosophy, which is what I always thought I would end up getting into. And what they're doing is really simulating how the world would turn out if people lived according to a particular philosophy. So if people interacted using some particular stated philosophy, say Nietzsche's philosophy or some philosophy of trust, then how would it actually work out in real life? What would be the things that developed over time?
And so you use these computational models to essentially work out whether a philosophy has some kind of inherent flaw or whether it's going to create other side effects which people didn't anticipate. It's a niche field at the moment. I'm constantly looking for new contributions to it, but it doesn't seem to be very active, unfortunately.
AD: So Robert, tell us a little bit more about your role at Neustar and what are your core responsibilities?
RS: So my team and I work as the R&D team. We are mostly focused on data- and algorithm-oriented development. So really what we're doing is looking at different techniques that are out there in data science and information science, and we're looking at how we can apply them to problems that we have, to either improve the products that we already have or in some cases develop new products.
So we're not a pure research team, we're not academics who are necessarily innovating new methodology. We are really trying to bring existing methods from other fields and see how we can apply them successfully at Neustar. So a lot of the people we have come from academic backgrounds, but mostly day to day where we're looking at how to apply methods in new and interesting ways.
AD: Yeah. It's interesting that there's so much diversity of thought in the team itself, because much of your experience lives within the marketing and analytics space, with a focus on data science. So can you tell us a little bit more about what excites you about the world of data science, especially in light of your interest in philosophy as well?
RS: There are some types of application where people want to get a prediction and they really don't worry too much about how that prediction comes about. So you want to predict, for example, is that a zebra in a photo, or is that a particular word on a page? In those cases, you may not worry a great deal about how the algorithm is actually learning what is in the photo or what is on the page.
For a lot of the stuff that we're doing, particularly in marketing solutions, we're looking for methods which will help us to understand the process that created the outcome. So for example, in MTA (multi-touch attribution), we're looking at what the conditions were that created a sale, or in MMM (marketing mix modeling), we're looking at what the events were over time that led to something happening.
And so I think what interests most of us is really trying to learn about how the world works using these methods and how one thing leads to another, I guess, is at the core of it. And we do some predictions. So some of the things that we're doing, we are simply interested in whether a particular outcome will happen given a set of inputs. But we also want to understand how the data process is working. And I think that's really the main motivator for most of the team.
DD: It may be an understatement to say this, but data science, at least from my perspective as a novice in that world, is having a moment right now, and has been for quite a while. And that's due to the rise of data and the size and quantity of data sets. So from what you've seen, why is investment in data science technology, resources, and talent so important to the marketing world, to the everyday marketer at a brand, for example?
RS: I think it goes back to the same underlying concept. The more we can understand why things are happening, the more levers there are that marketers can use to affect outcomes. And so it's really capturing what the circumstances are that surround some particular outcome, and which of them can be leveraged in some way, so we can do more of one thing to lead, for example, to more sales or more brand response.
I think the second part of it is that it's proven to be successful in the field of marketing. In a lot of cases, you can change an algorithm and get a half a percent improvement in some kind of prediction, and there is a big return on that in many situations. So marketing is a field where there's a direct monetary response to what you're doing. And that's maybe not so true in some other fields, where it may be more difficult to, say, monetize your image recognition or some other type of algorithm. So in marketing there's an end-to-end loop: you learn something, you apply it, and the ROI is typically there on the research process.
DD: There's like a closed-loop process that can be measured in some way, shape, or form, if you have the data, though. If you have the data at your fingertips, to a certain degree, I guess.
RS: Yeah. If you've got the right data and you've got the right way of analyzing it, then you can succeed by just iterating through and learning different types of effectiveness, for example.
AD: I was just going to say, it's not just about, as you said, the right way of analyzing it. It really comes down to support too, both external support and internal support.
AD: One of the things that we're seeing a lot of with brands and advertisers today is that they're building their own large data science teams. And I think there are those who are considering outsourcing those resources versus those who are building their own internal teams. So it'd be interesting to hear what you think in terms of recommendations. Should brands outsource, or should they do it themselves?
RS: I think one of the advantages of having a product and the set of pipes and systems to do the processes is that we are fairly well scaled and we've got a recruitment model that works for us. We've got a scalable system. It seems that if you're part of a small team in-house, you're a little bit more vulnerable to, say, somebody leaving or somebody getting promoted into a different part of the company. And keeping that system running, with all of the elements that involves, is just maybe more difficult in-house than it is for a larger, more scaled operation like ours.
DD: Do you have any recommendations for brands, or anyone listening who is a marketer, in terms of their investment? Is there a place to start? Is there a size that you recommend? I mean, we're talking about Fortune 500 brands, right? Let's think big; that's your background here, your experience in terms of the clients that you've met with. Do you have recommendations, from what you've seen, for who's been successful or unsuccessful in trying to set up those infrastructures?
RS: I think it probably takes a lot of commitment, and a commitment to more than just the marketing science part of it: to actually get the data, to be able to source the right data, and to get the overall framework of how to interpret the results and how to apply them. It probably goes beyond just a marketing science function. You need a good deal of engineering support, and you're going to have to source software. And so the broader support network for doing marketing attribution or marketing measurement in-house goes beyond hiring a couple of data scientists; it's a much broader framework that you need to build.
AD: So this podcast is all about hype, and we'd be remiss if we didn't hit on the top buzzwords of the day: machine learning and AI, or artificial intelligence. How do you think machine learning in particular plays a role in marketing today?
RS: I think probably two main applications. One is what we were talking about earlier: the capability to learn why something is happening, and then use the different levers that you've uncovered to change outcomes. So to change the number of sales you're making, or to change people's perception of your brand. And the other is pure prediction. So if you know that something is going to happen, so you can forecast a particular event, for example, then you can do things around that to optimize the particular things that you're interested in. So I think one is learning, one is predicting, and generally those are the two main areas that we're looking at, certainly at Neustar, in terms of how we apply machine learning.
AD: Yeah, I agree. And I oftentimes find that a lot of people in the industry who play the role of data scientists or work in analytics, try to avoid machine learning because I think it can often be seen as a word that's over-hyped or overused. Everyone's marketing materials say that they use machine learning. Do you think that the term is over-hyped?
RS: I think the basics of machine learning are actually extremely simple. It could be any type of machine learning, even the most basic thing. It could be your thermostat learning that by turning on the heating, it can increase the temperature. So the most simple machine learning is actually something that's extremely trivial. And the vocabulary of statistics, modeling, econometrics, data science, and AI is in constant evolution, I think partly because there's an urge to consistently reinvigorate it in different ways.
But in reality, a lot of the methods that are getting a lot of attention today were also around 20 years ago in a similar form. And so I think particularly in marketing, there's a tendency to continuously reinvent the wheel and put a new coat of paint on it. And machine learning in a lot of ways is... There are new things that have come into the field, a lot of important contributions. But the basic framework of what we're doing is still built around the same core principles that were true a decade ago or a couple of decades ago.
RS: And I think machine learning just sounds more interesting because it's closer to robotics or some notion of AI. But what falls under the umbrella of machine learning covers a very wide spectrum of different things.
DD: So another hot topic in the world of marketing, another word that everyone keeps talking about, is privacy. Let's break down three of the most commonly used privacy-preserving methodologies or technologies. The first one is the concept of differential privacy. And we've talked about this separately; it's a broader term, like machine learning. But can you talk about what the concept or overarching methodology of differential privacy is, and where the concept came from?
RS: So I think of differential privacy as being a condition more than a technique. It's kind of a guarantee that says no one individual's contribution to a particular data process should meaningfully change the outcome of that process. And so it's a concept that you can apply to all kinds of different data algorithms, including machine learning. But it's not a methodology so much; it's a kind of statement about the inputs and outputs of a given system.
And you could achieve differential privacy in different ways, under different circumstances. A lot of people think of, say, how the return of a query would change if you removed one person from the underlying data set. But you can apply the same concept to how the results of a model would change if you took out one person from the underlying data that you're building the model on. And so it's a guarantee about the inputs and the outputs, more than a methodology for achieving that guarantee.
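The classic way to achieve that guarantee for a counting query is the Laplace mechanism: add noise calibrated to how much any one person can change the true answer. Here is a minimal sketch in Python; the record fields and function names are illustrative, not any specific Neustar implementation:

```python
import math
import random

def laplace_noise(scale):
    # Sample from a Laplace(0, scale) distribution by inverting its CDF.
    u = random.random() - 0.5  # uniform in [-0.5, 0.5)
    sign = 1.0 if u >= 0 else -1.0
    return -scale * sign * math.log(1.0 - 2.0 * abs(u))

def private_count(records, predicate, epsilon=1.0):
    # A counting query has sensitivity 1: adding or removing any one
    # person changes the true count by at most 1. Releasing the count
    # plus Laplace noise of scale 1/epsilon therefore satisfies
    # epsilon-differential privacy for this single query.
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon)

# Hypothetical usage: count converters without exposing any individual.
records = [{"converted": i % 2 == 0} for i in range(200)]
print(private_count(records, lambda r: r["converted"], epsilon=1.0))
```

The point of the sketch is the condition, not the mechanism: whether one record is present or absent, the distribution of the released number barely shifts, which is exactly the guarantee described above.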
DD: Interesting. For our audience, I think they understand the acronym FLoC, federated learning of cohorts. That's an attempt, again at the browser level, to add some level of aggregation. Just curious about your thoughts on how that plays into the privacy piece. Is it more of a methodology? Is it something that you think is going to have room in a future state? Not to make you make a prediction, I'm just curious about your thoughts on what they're doing there.
RS: Yeah. So I think FLoC comes from more of the data obfuscation approach to privacy. It's the school of thought that if you can anonymize a group based on some particular attribute, then you have a guarantee that at least k people will share the attribute upon which you've k-anonymized them.
So FLoC, from Google's point of view, seems to come from that point of view: it's fine if you know that somebody has a characteristic, as long as a certain number of people, k minus one other people, share that same characteristic. I think the difficulty with k-anonymizing is that you have to consider all of the different data points on which you might identify somebody, and k-anonymize all of them at the same time.
And the trouble that Google seems to be running into with FLoC is that if you can cross-reference that data point, the FLoC cohort, with other information about a browser or a user, then you've actually got more information about that user rather than less. Because along with what you already had, now you've got a cohort, and that gives you another identifying data point that might allow you to move from, say, a group of 20 people down to a group of two people that have that cohort. And now Google has given out more information. That seems to be one of the criticisms that FLoC is facing, rightly or wrongly, right now.
AD: You mentioned k-anonymity. Can you define that for us? Because I know a lot of folks out there have heard the term, but may not be familiar with what exactly it means.
RS: So k is the number of people that must be in the group with the same characteristic. And they're anonymous, they're k-anonymous, within that group in the sense that there are at least k minus one other people that share that characteristic.
You could take a couple of features that on their own are not particularly identifying. You could take a date of birth, which might be shared by, say, 100,000 people. You could take a zip code in which 200,000 people might live. But at the point you combine them, you might only have one or two people in that zip code with that date of birth. And so that goes back to what we were talking about with FLoC, which is that you have to take account of all of the attributes at the same time when you k-anonymize. Something that's k-anonymous in one dimension, when combined with additional information, is no longer k-anonymous.
AD: But the intention is to protect those in that group and to maintain that level of anonymity. I can never say that word, anonymity. See, I sound like I'm trying to say "an enemy" every time I say it. But the goal is to maintain a level of privacy. For example, we do something similar at Neustar with cohorts of 100, based on similarities in ad exposure. So I think it's really about the goal for which the process or the technique is being used to protect that privacy.
RS: Exactly. Yeah. And it requires that all of the k-anonymity algorithms that we're building at the moment are multidimensional. And so if you do have more than one dimension that you need to protect, then we consider all of them at the same time. I think to protect on one dimension and then separately to try and protect on another dimension actually ends up defeating the purpose. For the most part, you need to consider all of the attributes that might be relevant to a person and apply that anonymizing process at one time, rather than separately.
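The zip code and date-of-birth example above can be checked directly: measure the size of the smallest group sharing the same values, first per attribute and then over the combination. This is a toy sketch with hypothetical records, not a production anonymization algorithm:

```python
from collections import Counter

def k_of(records, attrs):
    # The k of a table under a set of attributes is the size of the
    # smallest group of records sharing identical values for all of them.
    groups = Counter(tuple(r[a] for a in attrs) for r in records)
    return min(groups.values())

# Hypothetical toy records: each attribute alone is shared by several people.
records = [
    {"zip": "10001", "dob": "1980-01-01"},
    {"zip": "10001", "dob": "1980-01-01"},
    {"zip": "10001", "dob": "1975-06-15"},
    {"zip": "10002", "dob": "1980-01-01"},
    {"zip": "10002", "dob": "1975-06-15"},
    {"zip": "10002", "dob": "1975-06-15"},
]

print(k_of(records, ["zip"]))         # 3: every zip code is shared by 3 people
print(k_of(records, ["dob"]))         # 3: every birth date is shared by 3 people
print(k_of(records, ["zip", "dob"]))  # 1: the combination singles people out
```

Each dimension is 3-anonymous on its own, yet the joint table is only 1-anonymous, which is why the dimensions have to be protected together rather than one at a time.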
DD: You did mention research that Neustar created and put out there. ICOM obviously picked it up and published it. Can you give us a brief synopsis of what the research is and what you're trying to prove in the research that you put out there, Robert?
RS: Yeah. So what we saw was that there's a lot of research in academia, and a lot of it takes one method, applies it, and says, okay, this privacy method, say differentially privatizing something, worked over here; if we apply it to machine learning, for example, it's going to make that private too.
A lot of the time, if you look into the practical realities of where those methods might be applied, they don't work in the same way, and certainly they don't have the same interpretation. If you take something that's designed for, say, a query, and you apply it in some part of, say, a machine learning model calibration process, it doesn't necessarily create the same real-world privacy as it did in the original application.
And so what we did was take a real marketing measurement pipeline, from the point that you're collecting and processing the data, to a data owner transferring it to, say, an analyst, who might be at a different company or within that company. That analyst builds a model, tries to draw inferences about the data, and then puts out some kind of results from that model.
And so what we did was look at different ways of applying privacy along that pipeline. One of them was k-anonymity, which we were just talking about; that was one of the things that we applied to the raw data. That's saying, let's look at the data and apply the privacy there, so that everything downstream from that will be private. And then the second thing we looked at was to take the data in its raw form, which in some marketing applications would still be a viable way of going about things, and instead of protecting the data, protect the model learning process, so that the model learning process is not sensitive to any one particular individual in the original data. Then can we still create viable marketing solutions, products that respect privacy?
And the overall outcome that we found was that in both conditions, protecting the original data and also protecting the model, with a good degree of privacy applied we can still derive useful, practical marketing recommendations. And there's not an either/or situation where you either have privacy or you have marketing attribution. We found that in the middle there is quite a big, happy medium where both can be achieved. And the paper points out a couple of sweet spots where, with a good level of privacy, you still have a good level of accuracy in your marketing analytics.
AD: So effectively both win. Is there an approach you would recommend to others in the world of marketing analytics? If someone was looking to apply one approach or the other, which would you recommend?
RS: I think, like everything in privacy preservation, it really depends on the conditions that you're working in. The analyst would always prefer to see the raw data and then apply privacy in the learning process. But that's not always possible, and sometimes a particular party might be unwilling to share their data without having k-anonymized it.
RS: And so in that case, you are working with the k-anonymized data and you want to make the best application of it that you can. So I think it really depends on what the situation is, what the conditions are that you've got access to, and then how you can apply those different methods given the privacy objectives that you have as well. It would be great to say there's one size fits all, but privacy is really an area where you have to customize to your conditions.
AD: One of the things you mentioned earlier was this pipeline of the flow in which data comes across from a certain source to an analyst to ultimately use for modeling purposes. So one of the key terms that we hadn't brought up earlier was clean rooms. So what is all the hype about as it relates to clean rooms? And do you have any thoughts in terms of the role clean rooms might play in helping to protect privacy as part of that process?
RS: I think if we look around the market and the literature on clean rooms at the moment, the definition of privacy, again, varies quite substantially between different approaches. For some people, a clean room just means you trust the party that hosts the clean room: you put your data in there, and because you trust the party that has it, it's clean. For others it's a more restrictive approach, where they might want a guarantee that the third party is not trusted and cannot see the data that you've submitted into their system.
And then at a deeper level, it might be that you want to share data with yet another party within that third party's clean room, and you want guarantees that the third party will not be able to learn anything about the data that you've put in there. So, again, how you apply privacy in clean rooms really depends on the restrictions that the different data owners are going to apply, and the different technologies you enable to allow different things to happen will be driven by that.
DD: So it just seems like we've been talking in pretty broad terms that include a large number of different procedures, practices, and methodologies that all bubble up. And again, the main goal here is, like you said, to have your privacy cake and eat it too, to a certain degree. But where is all of this actually going? We're at the cutting edge, at the beginning of this process, because people actually care about privacy and marketing as a unified approach, and in the past they didn't, because they weren't forced to. Where do you see this going over the next 10 years or so? Is it one and the same? Is it something that we're still working to evolve over time? What are your predictions there?
RS: It seems that broadly we're moving from an era in which information has been somewhat freely shared between different parties. And in fact, there was more information than was needed for a particular use case. So you're looking to do some particular type of attribution and you got additional information that you didn't need as well.
I think now we're moving to an era where you only get the information that you need. And beyond that, there will be additional obfuscation on the data that is being provided, such that we're not getting all of the information in that data; we're just getting enough to fulfill the use case that we are trying to apply it to.
And so I guess the end state of this is that businesses will still need to use information to achieve their objectives, but that information will be consistently degraded until there's only just enough to do what any particular company needs to do with it, and no more than that. And that's where things like k-anonymity come in, because you are reducing the dimensionality, and arguably the entropy, of that data down to the point where people can do the bare minimum thing that they need to do with it, but they can't do anything else. And they can't learn accidentally about any individuals who might be part of the data that's being shared.
DD: That's interesting. So a random question I have, though: has data kind of been a distraction, per that statement? In the sense that the quantity of data has moved faster than our ability to actually extract what we need from it to do our jobs as marketers, as advertisers, as businesses. Is that kind of where we are? Are we just trying to wade through the sheer mountain of data, assuming that it's great to have all of it, when it's actually been more detrimental to doing our jobs?
RS: I think in a lot of use cases, the return on processing, say, logs from some particular process may not be there. It may be that the data is available, but by the time you've collected it, stored it, and analyzed it, the difference that you can make based on what you've learned from it may not be cost effective.
DD: The storage is going to cost you more than what you're going to gain from the ROI of looking at the data. That's interesting.
RS: Yeah. And in the data sources that we have, we've learned what to keep and what to throw away by exploring it. But for sure there are new data sources out there which may not be worth the time and cost of acquiring.
AD: But there's even cost in what you just described, isn't there? In terms of being able to take data and determine whether or not it's useful and then setting it aside. I mean, that in and of itself has cost implications. And particularly as we talked about earlier in the podcast around these marketers and marketing organizations who are trying to build data science teams.
AD: It's more than just having one or two data scientists on staff. It requires a fair amount of work and structure in order to set up that process. So I think that there is a lot of value in simplifying. What you're describing is almost like a simplification in terms of big data, being able to take only the parts you need, and to really clearly define those use cases.
RS: Yeah. And that's arguably another advantage of a scale provider like us is that we can evaluate that data on behalf of all of our clients and make a decision on whether it's useful or not. If one particular client has to evaluate all of those different things in turn, then they don't get the benefit of that scale of learning that we offer.
AD: Yeah. That's the benefit of having a best in class data science team on staff here at Neustar. So Robert, thank you so much for joining us. I think it was very educational. I appreciate you getting into the weeds with us around some of these key concepts, these hyped up terms, machine learning, privacy. So thank you for joining us today.
RS: Well, thanks for having me.