"Unstealable" Device-Based Identity Resolution
Fraud Happens Every Day and is Showing No Signs of Slowing Down
As cybercriminals continue to become more sophisticated, businesses require authoritative identity signals to spot fraud quickly, while letting legitimate consumer interactions through faster. During this webinar, learn how the use of dynamic digital data and "unstealable" device-based attributes can both prevent sophisticated fraud and reduce consumer friction.
- Confirm the person behind the device is truly who they say they are
- Provide low-friction experience to prevent abandonment
- Prevent loss from fraud
All right, everyone. We're very happy to have you here. We're going to be starting the webinar in probably just about five to six minutes, to let everybody file in and let those attendee counts level off. But I've taken a look at this presentation, and spoken with the expert.
And I can tell you there's some really, really interesting and incredibly helpful thought leadership that's going to be on display for everyone. So very excited to present this to you.
This will be fighting fraud and minimizing consumer friction with unstealable, device-based identity resolution, presented by Neustar. So as I said, we'll be starting officially in probably just about five minutes.
We're going to be starting in just about one or two minutes. But it looks like a great turnout from a very diverse set of professionals, from banks to insurance companies. We have investigators, auditors, so excellent, excellent group here to learn about some very important issues that can really help you better detect and prevent fraud. So we certainly are going to start in probably about one to two minutes, once the attendee counts level off.
I'll say this in the formal presentation, but if anyone has any issues, any sound issues, the slides are not getting updated fast enough, just hit the Refresh button on your browser. This is all browser-based. It should get you caught up very quickly. If you have any issues with sound, just make sure that you're plugged in to a hard line, or as close as possible to your wireless access point. That should fix many of those issues.
Fighting Fraud and Minimizing Consumer Friction
All right, well it looks like attendee counts are leveling off, and we can finally start. Good afternoon, everyone. And welcome to today's webinar, fighting fraud and minimizing consumer friction with unstealable device-based identity resolution, presented by Neustar. When it comes to opening credit card accounts, making purchases, or accessing accounts, consumers increasingly prefer digital channels to get the job done.
Although widespread web adoption drives down acquisition costs, the largely anonymous interaction also creates a critical vulnerability for fraud. Fraudsters can engage in social engineering, or exploit the personally identifiable information (PII) available through high profile data breaches, to circumvent fraud measures and open fraudulent accounts or take over existing accounts. As cyber criminals continue to become more sophisticated, businesses require authoritative identity signals to spot fraud quickly, while letting legitimate consumer actions through faster.
Join us on this journey to learn how the use of dynamic digital data and unstealable device-based attributes can both prevent sophisticated fraud and reduce consumer friction. Attendee learning goals include finding out the latest trends in fraud detection and prevention, understanding what tools brands can leverage to reliably unlock the identities of these digital customers, and getting to know how reliable real-time digital identity verification works and what types of data are needed.
My name is Bryan Monroe. I'm VP of content here at ACFCS. I really want to introduce our dynamic speaker, but first, just a few quick notes on logistics and for the presentation. This session is being recorded. And it will be posted to the site and made available for ACFCS members within three business days. You'll also find the slides on the site.
And you'll receive a follow up email notifying you when the slides are posted. At the end of this webinar, you'll be asked to take a brief survey. Please do take a minute to fill that out and provide your candid feedback. It really helps guide us on topics and content you'd like to see and improvements we can make in the future.
Also, we want to make this presentation as interactive and useful to you as possible. So please do ask questions. You can type your questions in the questions box. We'll be fielding them throughout the presentation, as well at the end of the session, if time allows.
Lastly, if you have any audio issues or the slides are not updating, here are some ways to fix them. First, just refresh the browser. Second, try getting closer to your wireless access point. Or, best of all, plug your computer straight into a hard line.
Also, you can open the browser through your phone and listen that way. Lastly, let me explain a little bit about the association. ACFCS is a leading provider of practical tools, knowledge, and deep relevant content to help professionals improve results in financial crime detection and intervention. Through membership, live and online training, and the CFCS credential, our goal is to deliver resources enabling better performance in financial crime compliance across all disciplines, including AML, fraud, cyber, and more.
The CFCS designation is the only credential to validate knowledge and skill across the financial crime spectrum. On that note, we've recently released an updated certification with refreshed questions, and a new online prep course. In short, ACFCS aspires to be the most relevant, tactical, and practical association across all areas of financial crime compliance through the best membership rewards, most forward looking training, and offering a certification that is the broadest and deepest in the industry.
Now, about our presenter. Sam Jackson, Director of Product Management for Risk Solutions at Neustar. Sam is a technologist and entrepreneur with a background in identity services and ad tech. As one of the product leaders for solutions at Neustar, he leverages industry leading identity data, real time signals, and machine learning to bring world class anti-fraud solutions to market. Sam, I'm excited to hear what you have to say. So I'll turn it over to you.
Thank you, Brian. So with that great introduction, I will kick off here. Just a little bit more about me, obviously, I manage products for Neustar with a focus on fraud prevention and other types of solutions related to risk. But before I got into this particular role, I was heavily involved in digital advertising.
And I sometimes joke that I'm basically an expert at tracking people on the internet. So that's very pertinent to this conversation today. I'm going to be talking about some trends in fraud prevention, and the industry as a whole. And then we go pretty deep into how we actually reconcile devices with identities.
So by the end of this, I hope that you guys will know a lot more about how digital tracking works, both for advertising and fraud prevention. Sorry, having some issues with the slides. To get going, let's talk a bit about fraud.
Massive Increase in Fraud Online
The last decade has seen a massive increase in fraud on online channels. In particular, the adoption of EMV chips in credit cards means that fraudsters are increasingly pushing their activity to digital channels, where you don't need much physical presence at all in order to go ahead and commit a crime. So, previously, it was fairly simple to get your hands on a credit card number, and write that to a magnetic strip for a counterfeit card.
Today, that's very challenging. So, instead, fraudsters are using data breaches and other types of compromised information to go about the exploitation of institutions, and merchants, and so on, primarily in online channels. And this doesn't really show any signs of slowing down. The rate of data breaches has only gone up in the last five years.
At this point, pretty much everybody's information — if you're a credit-active adult in North America, it's floating around on the dark web somewhere. And, likewise, other activities, like account takeover, and other systematic breaches of systems, are just increasing. In the last year alone, we saw about a 125% increase in these types of breaches.
So this means that a lot of the traditional mechanisms that brands have used to fight fraud no longer work so well. They have to kind of turn to other measures. So to give you an example of this, it used to be that brands could reliably turn to credit header data to verify identities at account opening.
Credit header data at this point is compromised with synthetic identities. Fraudsters have figured out that they can take social security numbers from kids, and go ahead and apply for loans, and actually invent entirely new individuals. And, of course, that says nothing about all the identities that they're stealing. Heaven forbid, that happens to you. It could be a nightmare to actually get out of the kinds of debts that they've racked up on your behalf.
Investing in Fraud Prevention
And so to mitigate all of this, businesses are spending more money than ever on fraud prevention. Last year, or I think 2017, brands and other organizations spent close to $10 billion trying to fight fraud. And that figure doesn't include the actual losses that they're incurring, which makes the total even greater.
If you're a merchant, for example, oftentimes you're paying a fee for underwriting. You're paying a fee for fraud platforms. And then you're also paying a fee for lost merchandise, when people place orders on behalf of individuals who were not actually intending to buy anything. So fraud is becoming a big business. Mitigating fraud is also a big business. And, today, I'm going to be talking about some of the things that various companies and organizations can do to slow down fraud, and in some cases really stop it in its tracks.
So when we're talking about all this, and we're talking about fighting fraud, generally speaking, we're trying to answer a simple question, which is, does the person behind the device truly have the identity that they're claiming? Or are they the person that they claim to be? And, in some cases, this is a question that's necessary to answer for purposes of regulatory compliance.
But, more generally, it's something that we care a lot about, because it has an impact on the customer experience. And it has an impact on our bottom line. So if you put a customer through too many hoops in order to prove that they are who they say they are, you're going to create a lot of customer friction.
And that may cause them to ultimately abandon their order, or block a transaction, or walk away from your service, or just think badly of you and do less business with you in the future. So customer friction is a big concern. But then, of course, there are other concerns as well, like the losses that you take when fraud occurs. And the fact that fewer people may be adopting the services that you want to offer.
Rise of Device Reputation Tracking
So, anyway, all of this drives various players in the industry, be it fraud practitioners, or platforms, or data service providers, to try to come up with new and novel ways to answer this question, is the person behind the device truly who they say they are? One approach that's gained quite a lot of traction in the last decade or so is something called device reputation.
So device reputation vendors, most of them work more or less the same way. They deploy technology to gather what are known as device fingerprints. So the device fingerprint is typically a series of characteristics that can be gleaned from a device that provide a persistent view of an internet device over the long term.
So many of the things that you might associate with a device, like a security token, or a browser cookie, or something like that, may expire. Or it may be that individuals can tamper with them so they go away. Fingerprints are designed to persist through that; that's kind of the benefit of device fingerprinting.
And then they rely on a cooperative arrangement with their customers, like a consortium, to provide feedback on which devices and which identities were involved in safe transactions, and which ones were involved in fraudulent transactions. Over time, this consortium data collectively allows the brands that are in the business of device reputation to offer risk scoring to their customers. So somebody lands on my website.
They read the device fingerprint. They say, OK, this person has previously been involved in safe transactions. And so they're probably pretty low risk. You can go ahead and let them through. Or, alternatively, this particular device was involved in a chargeback scheme a couple of weeks ago. You see it again, you should be wary of it.
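The consortium feedback loop Sam describes could be sketched, in a highly simplified form, like this. The class name, fingerprint strings, and the neutral-score convention are invented for illustration and are not any vendor's actual implementation:

```python
# Minimal sketch of consortium-based device reputation scoring.
# Members report transaction outcomes keyed by device fingerprint;
# the vendor turns the accumulated history into a risk score.
from collections import defaultdict


class DeviceReputation:
    def __init__(self):
        # fingerprint -> {"safe": n, "fraud": n}, fed by consortium members
        self.history = defaultdict(lambda: {"safe": 0, "fraud": 0})

    def report(self, fingerprint: str, fraudulent: bool) -> None:
        """A consortium member reports the outcome of one transaction."""
        key = "fraud" if fraudulent else "safe"
        self.history[fingerprint][key] += 1

    def risk_score(self, fingerprint: str) -> float:
        """0.0 = only safe history, 1.0 = only fraud, 0.5 = never seen."""
        seen = self.history.get(fingerprint)
        if not seen:
            return 0.5  # unknown device: neutral risk
        return seen["fraud"] / (seen["safe"] + seen["fraud"])


rep = DeviceReputation()
rep.report("fp-abc123", fraudulent=False)
rep.report("fp-abc123", fraudulent=False)
rep.report("fp-evil99", fraudulent=True)
print(rep.risk_score("fp-abc123"))  # 0.0: previously safe, low risk
print(rep.risk_score("fp-evil99"))  # 1.0: involved in fraud, be wary
```

Real systems weight recency, volume, and the trustworthiness of the reporting member rather than taking a simple ratio, but the shape of the loop is the same: observe fingerprint, look up history, score.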
This has become kind of the predominant approach that brands use when trying to mitigate fraud in online channels. And it's driving a lot of big acquisitions. So, for example, in 2018 LexisNexis acquired ThreatMetrix.
ThreatMetrix is one of the leaders in device reputation. And LexisNexis is a major provider of identity data, what we think of as an identity bureau. And my recollection is this acquisition was for north of $800 million. So it was quite big. If you think about this from a strategic perspective, it appears that LexisNexis, who are kind of an authority on person-centric offline information, were looking to acquire something like ThreatMetrix because ThreatMetrix has information about digital identity.
So, in effect, they're trying to bridge the gap between physical identity and digital identity. And, similarly, TransUnion acquired iovation. And iovation is another device reputation company. And TransUnion is another identity bureau.
So, again, you see that the identity bureaus are very interested in trying to bridge the gap between person-centric, offline information and device-centric digital identity. But from what we understand talking to our customers, they haven't really merged their solutions yet. These companies are still selling device reputation as a service. And LexisNexis and TransUnion are still selling their physical identity information as a service.
Connecting Device Identity and Physical Identity
But they don't really have a solution that can conclusively answer the question, who is the person behind that device? And that's really where we come in, my company Neustar. We are specifically very focused on answering the question of, who is the individual behind the device? And in order to do that, we've basically managed to put together a lot of linkages between device identity and physical identity.
And this is quite a big claim. I don't think there are very many other companies out there that can do what we do. So I'm going to go into a lot of depth throughout the rest of the presentation, just how we go about doing that.
History of Neustar and Number Portability
But first, let me give you a little bit of background on Neustar, because you may not have heard of us. So we were originally spun off from Lockheed Martin to fulfill something called the Number Portability Administration Center contract, or the NPAC. And, basically, this is a service that allows you to take your telephone number from one carrier to another.
So let's say you're on T-Mobile. And you want to switch to Verizon. For a long time, we were in the business of providing you with that service. And it was good business for us.
And as a result, we were able to go out and acquire quite a lot of other companies. And we increasingly moved into many different verticals. So, today, we have services, spanning advertising, cyber security, fraud prevention, and telecommunications. The thing that is common to all of the different services that we offer is that they're all involved in identity resolution.
From Security to Telecom Solutions
And identity resolution is something that's very relevant to fraud prevention. So, for example, on the marketing side, a big portion of our business comes from digital advertising. And, in particular, we're very strong on cross device measurement, and multi-channel attribution of advertising.
So imagine you're watching TV on your smart television. You see a commercial come on. Then you switch over to your tablet to make a purchase. We help brands resolve that string of circumstances, so that they understand, OK, we show this person an ad on their TV. And then they ultimately made a conversion purchase on a different device. Basically, that style of cross device, cross-channel measurement is absolutely essential to our business.
On the security side, we do things like DNS resolution. So you want to figure out how to reach a particular website, you go to your browser, you type in the URL, we resolve that domain to a specific IP so that you can reach that website. And on the telecommunication side, we control the vast majority of the North American caller ID network.
And so when you receive a phone call, and it says, here is Sam Jackson calling, we're oftentimes the ones in the background providing that service. Likewise, if you receive a phone call that says, this is a spam number, that's also part of our business. And, of course, one of the things that underpins many of these different offerings is identity information.
Who is this person? Who owns this device? And so forth. And so those are the kind of signals that I leverage in my job day to day, building new products for fraud prevention.
Under the hood is something that we call the OneID system. It's kind of a marvel of technology. Basically, we take billions of records corresponding to the physical identities of the North American population. And we combine those into profiles of the United States population. And likewise, on the advertising side, we're involved in serving about a quarter of all global ad impressions, primarily for measurement. And so we have an awful lot of insight into who is doing what on their devices. And together, under the umbrella of OneID, we bring all these different signals together to create services that do omnichannel identity resolution.
Offline Identity Resolution
Before I get going into how this works from a digital perspective, I want to give you guys an overview of how offline identity resolution works, since this may be new to you. So a very typical service that many of our customers leverage is basically PII verification. So, today, all of the top 10 largest financial institutions, all the credit card issuers, are our customers.
So when you go and you sign up for a credit card, or you sign up for a new account or do a transfer, oftentimes our data is in the background helping to verify your identity. And specifically, what happens is they'll query us with your name, your address, your email address, and so forth. And we respond back.
We tell them, yeah, we've seen this telephone number with this email. Or, yeah, this phone matches this physical address, or this name matches this email. And we corroborate this with data that comes from over 80 different sources, including from the telecommunications contracts that we have.
So it's very reliable. And it's corroborated across multiple sources. It's not one single source of information. Additionally, we have other data points available that can be very helpful.
So we can tell you about a telephone. Is it prepaid? Has it been ported recently? Was it sim swapped? These are all indicators that could suggest that somebody's identity, while legitimate, has recently been compromised.
So, for example, if your phone was ported 12 hours ago, and now you're initiating a money transfer, there's a very good chance that a fraudster has managed to fool your mobile carrier into issuing your phone number to a new device, which they're now in control of, meaning they can intercept one time pass codes, and that kind of thing.
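The verification response just described could look something like the sketch below. The field names, the 48-hour window, and the flag strings are assumptions made for illustration; they do not reflect Neustar's actual API:

```python
# Hypothetical shape of an offline PII-verification response: match flags
# across identity attributes, plus phone-risk indicators like a recent
# port that could signal a SIM-swap attack.
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class VerificationResult:
    name_matches_address: bool
    phone_matches_address: bool
    email_matches_name: bool
    phone_is_prepaid: bool
    phone_ported_hours_ago: Optional[int]  # None = no recent port on record

    def risk_flags(self) -> List[str]:
        """Translate raw match data into flags a fraud model can consume."""
        flags = []
        if not (self.name_matches_address and self.phone_matches_address):
            flags.append("identity-attributes-do-not-link")
        if self.phone_is_prepaid:
            flags.append("prepaid-phone")
        if self.phone_ported_hours_ago is not None and self.phone_ported_hours_ago < 48:
            flags.append("recent-port-possible-sim-swap")
        return flags


# The scenario from the talk: attributes all link, but the phone number
# was ported to a new device 12 hours before a money transfer.
result = VerificationResult(
    name_matches_address=True,
    phone_matches_address=True,
    email_matches_name=True,
    phone_is_prepaid=False,
    phone_ported_hours_ago=12,
)
print(result.risk_flags())  # ['recent-port-possible-sim-swap']
```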
So understanding all of this, whether or not these identity attributes link together, and whether or not your phone is still in your control, these are all very powerful signals that oftentimes test very well in the models of some of the largest companies in the world. But that's really just offline identity resolution. That's identity verification in the physical world. I've been making some big claims about how we're actually able to do this from a physical identity to a digital identity. So now, I'm going to explain how that works.
Digital Identity Resolution
So in order to resolve identities at scale on the internet, we collect an awful lot of data about devices and about email addresses. So, typically, you may go to a website and log in because you want to see some content, or maybe you sign up for the first time. And you provide your address, and your telephone number, and all that kind of stuff.
We are in the business of buying tens of millions of dollars worth of these linkages that basically connect the device identity to physical identity. So, for example, you might have a cookie ID, which we're able to set via our advertising network. And we will link that to an email address, which we can then tie back to a physical address.
And that's what we call a direct linkage. In other cases, we see these same kinds of linkages, device to email or device to physical address, on a particular IP address. And we see that that IP address is a fairly low volume IP address. So there's not a whole lot of traffic coming from it.
Therefore, we're able to infer the IP address as belonging to a specific household. And by doing this at scale over the course of our entire advertising network, we're able to eventually identify the majority of devices that are out there on the internet. And this is really essential for the kinds of measurement that I was mentioning previously.
When you go and you watch a commercial or something like that on one device, then you switch over to your laptop to make a purchase, these are the types of linkages in the background that major advertisers and brands are using to figure out that these devices were connected, or that an advertisement that you saw at work ties back to the purchase that you made when you got home. So this is really essential to providing the kinds of advertising services I mentioned before.
Before I go on, Brian, do we have any questions from the audience?
Can device identity fingerprinting be cloned?
Yeah, actually there are some pretty interesting questions coming through here. One of them is asking, so basically about, can this information be copied? So here's the question. Is there risk on device identity fingerprinting being cloned? Or is it something that's dynamically calculated every time it is requested?
That's a very good question. So that was about some of the stuff we were talking about previously with device reputation.
Yeah, this is a little bit different than what I'm talking about right now. But, yeah, we'll go back and talk about that really quickly. In fact, I may go back to that slide so that I can make this really clear. Apologies.
OK, so this question is basically in regards to device fingerprinting and device reputation. So you want to monitor these devices, and ultimately assess the risk associated with a device that maybe you've seen before. So in order to do that, you have to have a fingerprint that's really robust.
So what that means is the fingerprint really has to uniquely identify that particular device. And it can't identify like a whole lot of other devices as well, because then you wouldn't be ensured that it's really kind of the same person that it was when we saw them previously. So there are some challenges with this.
When we talk about device fingerprinting, generally, the efficacy of a fingerprint is measured in terms of what's called entropy. So entropy is basically a way of understanding the amount of surprise factor that goes into a bit of information. So something that's very low entropy doesn't really carry much surprise. It's very predictable.
So there's not very much variety. Something that's very high entropy has a high surprise factor, meaning you really don't know what you're going to get. And it seems more random, or more chaotic. So a really good device fingerprint is very high entropy, meaning you're not going to pull another fingerprint out of a whole range of other devices that matches it.
It's going to be a unique identifier. And, likewise, it's going to be fairly constant. It's not going to decay very much over time.
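The entropy idea Sam walks through can be made concrete with Shannon entropy over a fingerprint attribute's observed values. The two attribute distributions below are made-up examples, chosen only to contrast a weak signal with a strong one:

```python
# Shannon entropy as a measure of how well a fingerprint attribute
# distinguishes devices: low entropy = predictable, weak signal;
# high entropy = high "surprise factor", strong signal.
import math
from collections import Counter


def shannon_entropy(values):
    """Entropy in bits of the empirical distribution of `values`."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Low-entropy attribute: almost every device reports the same timezone.
timezone = ["UTC-5"] * 9 + ["UTC-8"]
# High-entropy attribute: installed-font hashes vary device to device.
font_hash = [f"hash-{i}" for i in range(10)]

print(round(shannon_entropy(timezone), 3))   # 0.469 bits: weak signal
print(round(shannon_entropy(font_hash), 3))  # 3.322 bits: strong signal
```

In practice a fingerprint combines many such attributes, and the point in the talk about Apple blocking fingerprinting is exactly that browser vendors are reducing the entropy available from each one.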
Unfortunately, the reality is that a lot of fingerprints don't really have those characteristics. There are certain devices where there just isn't that much information that you can get from it that would uniquely identify it. This is particularly true through browsers.
And as brands like Apple move to kind of block fingerprinting, you're going to get lower and lower levels of entropy, meaning it's going to be more and more rare that you see a device that's really uniquely identified for a long period of time. Maybe it'll be good for a little while. But then some new system fonts or something get installed, or a new browser plug-in.
And the fingerprint is no longer as unique as it was before. Maybe it's changed. Likewise, of course, this stuff could be intercepted. It could be stolen. I think the risk of that is fairly low.
But in order to gather all these different characteristics, brands need to deploy a particular library to capture them. And so if hackers get their hands on that library and have the possibility of reverse engineering it and seeing how it works, then they could go out. And they could gather their own fingerprints basically by installing these libraries on websites that are more malicious.
And once they have those fingerprints in hand, they could also reverse engineer the libraries themselves to replay that fingerprint, and thus appear to be a different device. So you have a number of different problems here. You have devices that just can't be super uniquely identified. And then I guess in fringe cases you have the possibility that fingerprints have been intercepted or synthesized in some way that would throw off this approach.
Typically, to overcome these kinds of concerns, it's not just a fingerprint that's used. It's also an IP that's used. But everybody should be aware that there's an increasing concern about privacy with browsers like Firefox or Apple's Safari, where they're starting to block fingerprinting, because there are concerns that brands are using this for pervasive tracking for advertising and things like that.
So today, I personally think that these approaches are pretty robust. They're not perfect. You really have to have seen the device in the past in order to know its riskiness. But they're not completely compromised.
Whether or not that's still true five or 10 years from now, I don't know. But fraudsters are always kind of adapting to these different strategies. And so it's very likely that in the long term, it would be compromised.
Are Neustar identity services available in Europe, Canada, Globally?
That's a fantastic question. I'll let you know the questions that are coming in, and feel free to grab these whenever it's organic. But there are some interesting questions on integration of OneID with other platforms, and also on the regionality of this. What can you do here? Is it available in the EU, Canada, places like that? But you can grab those now or later.
Yeah, I'll answer it quickly. So I'm going to go into a lot of the different types of tracking that we do. And most of the stuff is available in North America. It's more challenging overseas, primarily because of restrictions like GDPR.
So our ability to offer these services, and this is true for most identity brokers, really is dependent on the specific region, because just like financial institutions, and brands, and whatnot, we have to comply with these regulations. So the amount of data that we can collect, especially collect without consent from the end user, varies greatly from one region to another. The United States tends to be fairly forgiving, although that's starting to change with the adoption of the California Consumer Privacy Act. But, certainly, the European Union with the GDPR is generally much more hostile to the kinds of data collection and tracking that powers many of these kinds of solutions.
OK, so going back to digital advertising and our ability to reconcile identity to devices, as I said before, we buy a lot of these linkages from different companies, effectively data that they get when people log in. So you want to see a video, you want to download a white paper, you're entering your personal information to access that content.
And then, in turn, many of these brands are turning around and selling that information to various advertising entities, such as ourselves. Because of our massive scale, and the fact that we're so involved in measurement, as well as other things like delivering targeted advertising, we certainly buy up a lot more of these linkages than almost anybody else out there.
We have some of the widest coverage, in terms of our advertising network. And in order to make that possible, as I mentioned before, we have to spend tens of millions of dollars on an annual basis basically buying these signals so that we can figure out who the person behind the device is. So I think this diagram is good.
But maybe it doesn't go deep enough in terms of explaining how this works, kind of at the level of the individual. So I'm going to talk about a couple of different types of linkages that are very relevant to us. So we have the direct linkage. I mentioned this before.
Example of Digital Identity and Direct Linkages
So imagine that you have this guy John. He likes to watch sports videos. So he goes to the website of some advertising publisher that has a lot of sports on it.
Maybe he wants to watch the highlights of the game. So in order to do that, he has to either sign up or log in. And when he does so, he provides his email address.
At that time, typically the advertising publisher in question will turn around and fire a tracking pixel to us, or execute device fingerprinting, or in some cases send us a mobile advertising ID that's associated with his mobile phone, in conjunction with that particular email. And so that allows us to set an ID for that particular device, and connect it back to the email that he's using to get access to their content.
In turn, we take that email address. And we look it up in OneID. And because we have over a billion email addresses in OneID, and they're connected to other things like a physical address, or IP address, or telephone number, and so forth, we can oftentimes link that email to a much broader set of identifiers that allow us to kind of confidently say that this particular device is linked to John.
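The resolution chain just described, device ID to email to fuller offline profile, could be sketched like this. The graph contents and function names are invented for illustration and do not represent OneID's actual structure:

```python
# Sketch of a "direct linkage": a login event ties a device ID to an
# email, and an identity graph expands that email into a fuller profile.
from typing import Optional

# Hypothetical identity graph: email -> known offline identifiers.
identity_graph = {
    "john@example.com": {
        "name": "John",
        "address": "123 Broadway",
        "phone": "+1-555-0100",
    }
}

# Device-to-email linkages observed via login/tracking-pixel events.
device_links = {}


def record_login(device_id: str, email: str) -> None:
    """Called when a publisher reports a login with this device + email."""
    device_links[device_id] = email


def resolve_device(device_id: str) -> Optional[dict]:
    """Return the offline profile directly linked to a device, if any."""
    email = device_links.get(device_id)
    return identity_graph.get(email) if email else None


# John logs in to watch sports highlights; the publisher fires a pixel.
record_login("cookie-4f2a", "john@example.com")
print(resolve_device("cookie-4f2a")["name"])  # John
```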
So that's what we call a direct linkage, also referred to in the advertising industry as a deterministic linkage. Separately from that, you can imagine that John lives with somebody else. We'll call her Meredith for this exercise. And maybe she's very privacy conscious.
She uses an iPhone to browse the internet. And so we're not able to set third party cookies. Maybe she uses a fake email in order to get to certain content that doesn't require her to validate her signup. So maybe she uses something like MickeyMouse@disney.com.
So you would think that we wouldn't really have any purview into who she is. But the thing is, she shares a Wi-Fi router with John. And we have a pretty good idea as to who John is, and where John lives. And we see these two devices over and over again on the same Wi-Fi router together.
And as a result, we're able to infer that the IP address belonging to that particular Wi-Fi router is associated with John's information. So we see that he lives at 123 Broadway. And we figure out that the IP address for this particular Wi-Fi router is, therefore, also at 123 Broadway.
And not only that, but we see their devices together on a regular basis. And so, therefore, we infer that they're a household. So even though we don't have a direct linkage for Meredith, tied to her particular name or her particular email, we can still figure out that her device is frequently at the same address as John's, and that she probably lives there. So that's what we call an inferred linkage, which basically gets you to the level of the household.
So how do we use this for identity resolution? Well, basically, you can query us with the personally identifiable information of these people. And we can return to you a bunch of different data points, or a score, that basically tells you how much we're able to corroborate the information they've provided against what we've historically seen with their device, or their IP address, on the internet.
So, for example, we can say, yes, this device has historically been associated with this particular email address, or this device has been associated with this physical address. We have seen it at 123 Broadway. Or maybe we don't have an exact match for those identifiers.
But we have seen it in other locations. And we can tell you, well, we saw it within half a kilometer of where this person says they live. Or we've only seen it 4,000 or 5,000 miles away.
Likewise, it may be that we can transitively link it to these identifiers. So we've seen these two or three devices together. And while we don't have any information for this particular device, if we look at another device it's been seen with in the past, we can link it back to a physical address, or a particular email address, or a particular name.
And, of course, we can also do all this with IP address. So, yes, we've seen this particular email used on this IP address before. Or, yes, we've seen this physical address used to buy things, and send them to this particular IP address before. So in all of these different ways, we're able to provide similar verification flags that you would see normally when you do the sort of offline verification, the verification of a physical identity. So before I go on, Brian, any more questions on the phone?
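To make the shape of this concrete, here is a minimal sketch of how the match flags and score described above might fit together. This is purely illustrative; the field names, weights, and scoring logic are assumptions, not the actual OneID API:

```python
from dataclasses import dataclass

@dataclass
class DeviceVerification:
    """Hypothetical flags of the kind a device-to-identity query might return."""
    email_match: bool        # device historically associated with the submitted email
    address_match: bool      # device seen at the submitted physical address
    min_distance_km: float   # closest observed location to the claimed address
    transitive_match: bool   # linked via another device this one has been seen with

def corroboration_score(v: DeviceVerification) -> float:
    """Roll the individual flags up into a rough 0-to-1 corroboration score."""
    score = 0.0
    if v.email_match:
        score += 0.4
    if v.address_match:
        score += 0.4
    elif v.min_distance_km < 0.5:  # seen within half a kilometer of the claim
        score += 0.25
    if v.transitive_match:
        score += 0.2
    return min(score, 1.0)

print(round(corroboration_score(DeviceVerification(True, False, 0.3, True)), 2))  # 0.85
```

In practice a service would return many more flags (name, phone, IP history) and feed them into a calibrated model rather than fixed weights; the point is simply that each linkage contributes independent corroboration.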
Does OneID integrate with other platforms?
Yeah, actually there's quite a few questions, and really good ones. So one person asked kind of an integration question. So does OneID integrate with other platforms, like Meridian, for example? I apologize. I'm not as familiar with these.
I don't know about Meridian but we certainly are in the business of working with other platforms. If you think about what we do as a service, we think of ourselves as the information service provider. So we give back data. And in some cases, we give back scores.
On the other hand, there's many different fraud platforms out there that do more than that, but maybe don't have the same scale of information that we have. And so we work with many of them. So a typical integration would be one where a brand is using a fraud platform to basically manage their manual review, or assess particular orders for fraud risk, that kind of thing.
And they're able to get a lot of value out of the first party data. So have they ever seen this person make an order before? Or did this person just make a couple orders for a few bucks over the last week, and now, suddenly, they're ordering 10 laptops?
Brands can get a lot of value out of their first party data with these kinds of fraud platforms that do predictive modeling and then help with manual review. But then, at some point, maybe they still have a fraud problem that they can't address with those data points alone. And so they'll turn to data service providers, or information service providers such as Neustar, to get extra lift for their fraud models. And so to that end, we work with many of the largest fraud platforms. We sell them data. And we also have integrations that their customers can leverage, if they want to get that information into their platform.
Are there identifiers using retina scan and voice recognition?
This is kind of similar, kind of talking about combining what you have with other pieces of data. But this person says, OK, the speaker is now talking about device fingerprinting. Is there a combined identifier using retina scan and voice recognition as kind of additional levels of enhancing the device reputation approach?
Yeah, very interesting question. So there are certainly things that you combine device fingerprinting, which is really separate from device reputation, with other deeper levels of authentication. So for a long time now, device fingerprinting has been used outside of device reputation as just part of the solution for identity and access management. And so, certainly, companies that have major enterprise risk, or governmental organizations, they're going to use a multi-point approach to authentication that could very well combine device fingerprinting with other forms of biometrics, like retinal scanning, or even behavioral biometrics, like the way you hold your phone, or the kinds of signals it gives off when it's in your pocket and you're walking.
So that's a big kind of rapidly evolving area. Typically, you see that more around identity and access management than like account origination or transaction scoring, which are those use cases primarily used around device reputation. But increasingly, I think we're going to see more of an emphasis on things like facial recognition and biometrics, like fingerprint scanning in your phone, at the stage of checkout for a merchant, or interacting with your bank, or so forth. So I think these different types of solutions are rapidly collapsing into kind of common services that can be deployed across a lot of different use cases.
Does OneID include browser fingerprint in identifying device reputation?
And then I'll just ask one last question and let you dive back into the presentation, which I'm really enjoying. OK, so this person asks, do you also include browser fingerprint in identifying device reputation? If yes, do you have a precision of browser fingerprint versus cookie fingerprint in terms of identifying an identity?
So that's a really interesting question. So we have developed browser fingerprinting solutions. We don't use them in our ad networks, primarily because the exchanges that we run on forbid us from doing so. We do partner with a range of other companies that sell standalone device fingerprinting solutions.
And so, certainly, some of our customers are leveraging a combination of our technologies and those types of technologies. Generally speaking, device fingerprinting is much lower precision than a cookie. But it may be more persistent than a cookie. So a cookie, we give you a completely random ID. There's basically not going to be any collisions with any other devices.
Technically, you could copy it from your browser manually and move over to a different browser. But that would only be if you're already in control of the device, meaning that you've already more or less owned the person's device. Otherwise, a fraudster is never going to get their hands on somebody else's cookie because it's all happening over HTTPS, meaning they don't have any way of getting that unless, again, they're already kind of in a man in the middle position where pretty much all bets are off.
So the cookie ID, the benefit of that is that it really is like one in a trillion. There's no collisions. It's completely precise.
The downside, on the other hand, is that people sometimes clear their cookies. Certain browsers don't accept third party cookies anymore. So there are some coverage gaps there.
On the other hand, with browser fingerprints, most of the time it's going to work. But for certain types of devices and certain types of browsers, there are going to be collisions. So maybe it's high precision 85% of the time, and there's another 15% where it's not high precision, and it doesn't work so well.
Similarly, it may be that the fingerprint relies on certain characteristics that are necessary to get to that level of precision, but that are ultimately not very stable. So common things that device fingerprinting libraries look at would be the particular photographs that you have on your phone, or the fonts that you have installed in your browser, or the plugins that you have installed in your browser. All of these different things can change over time, meaning that they can add to instability in the fingerprint.
So while it will give you precision, it'll mean that the fingerprint actually decays and ultimately expires after some period of time. So they're very different technologies. And they have different characteristics, and different kind of strengths and weaknesses.
I don't mean to set up our approach as being better or worse than device reputation. We think we have a very strong approach. We also think that other vendors have a very strong approach.
But both of us have gaps in our coverage. We can resolve an identity to a particular device at the level of the name, or the phone number, or something like that, about 55% of the time. So that's considered to be pretty good coverage on the internet.
I would say device reputation players are going to have similar coverage levels. They're not going to be able to identify everybody, because they're only going to have purview into the devices that they've seen before, and whether or not the fingerprint has decayed or changed in some way. And so we think of these different solutions as being orthogonal to each other, meaning that if you're a company that has the budget to buy both, you're probably going to get a lot of lift out of having both good solutions running side by side.
That said, if you only have the budget to buy one, this isn't really a pitch for the product insofar as I'm not going into all the other things that we do, but we have put together a pretty comprehensive menu of different data points beyond just the stuff that we're talking about here that collectively make for a lot of predictive power. So in the interest of time, I'm going to move forward.
There's definitely going to be some more room at the end for questions. So I was just talking about these online to offline linkages, and how we use them for identity verification. We actually can go even further than this, and do what I call a graph search.
Range of Data Sources for Digital Identity
So you have a particular device that belongs to a person. There are multiple ways of getting to identities from that device. Certainly, we have a browser history built up of different websites that they've been to, and so forth. And we've probably seen that device on many different IP addresses.
And all of those different IP addresses could conceivably be linked to particular households with different individuals living at them. Likewise, that device may have some deterministic or direct linkages that tie it back to a specific household or particular identity. Furthermore, it may be linked to other devices via device graphs, where we've seen these people traveling together.
Or they're constantly kind of on the same Wi-Fi routers. And so we think they're part of a family. Or maybe they're dating, or they're otherwise related. So there's a lot of different ways to get to household identities.
And once you're at the level of the household identity, you can pull up all these different physical records that tie to the people who have lived in this household. And what we'll do is we'll actually search over all those different individuals to see if we can find any match. We kind of think of it as looking for a needle in a haystack.
If a particular device has been seen in conjunction with a particular personal identity, then that's a really strong corroborating signal that this person is, in fact, who they say they are, or they're trustworthy. Maybe I'm borrowing my girlfriend's telephone or something like that. And so we'll give you match flags, similar to typical PII verification.
So does this email match the device graph? Does the address match the IP history? Have we found a match for the first name, or the last name, and so forth?
And all of these different signals can basically corroborate whether or not a person is who they say they are. But, again, sometimes you find an identity via this graph search that just happens to have some of the same identifiers as the person in question. So, for example, my name is Sam Jackson.
Obviously, there is another very famous person out there who shares my name. He goes by Samuel Jackson. So there is the possibility that you would just inadvertently find a device that's linked to a similar identity. And, therefore, you wouldn't be able to trust that corroboration as much. We actually can overcome that, though, by thinking about the rarity of the identifier.
So let's say that we find that this device has historically been seen in conjunction with somebody else who shares my name, Sam Jackson. Not only will we tell you that we've found a match. But we'll actually tell you the rarity of those identifiers in the overall population of the United States, as well as the number of individuals that we search.
And so taken together, those two numbers, the rarity and the number of individuals searched, can be combined to give you the probability of a false positive. So in addition to telling you whether or not we saw a match for these identifiers with the device, we'll also tell you the likelihood that it's just somebody else who happens to share that same identity.
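That rarity calculation can be sketched as a simple probability: if an identifier is shared by a fraction `rarity` of the population and `n_searched` individuals were searched, the chance of a purely coincidental match follows from the complement rule. This is an illustrative back-of-the-envelope model, not Neustar's actual formula:

```python
def false_positive_probability(rarity: float, n_searched: int) -> float:
    """
    Probability that at least one of n_searched unrelated individuals
    coincidentally carries an identifier whose population frequency is
    `rarity`, assuming independence between individuals.
    """
    return 1.0 - (1.0 - rarity) ** n_searched

# A name shared by roughly 1 in 50,000 people, searched across 8
# individuals tied to a household (both numbers hypothetical):
p = false_positive_probability(1 / 50_000, 8)
print(f"{p:.6f}")  # about 0.000160
```

So a match on a rare name across a small search set is strong corroboration, while a match on a common name across many candidates carries a meaningfully higher false-positive probability.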
So, collectively, this allows us to boost the overall scale that we would have if we were just trying to deterministically link a particular device to an identity. And it allows you to evaluate the kind of overall value of those linkages, when taken together. Beyond the stuff we get from internet browsing history, we also do what's called MNO header identification.
So we're basically able to query the carriers without the user knowing, since they've consented to this already when they signed up for their mobile contract, and basically get back the telephone number tied to the device. This assumes that the device is currently on a carrier network. In those cases, we can take the telephone number. And we can see if there's a match between the user's name, or the user's email, or the user's address, as well as other signals, like is the phone prepaid?
Which could tell you that it's a cheap phone, a throwaway device, they don't want to be tracked. And you know it's like a burner phone, and they're trying to use it for fraud. Likewise, we can tell you if the phone's been ported recently.
So is the fact that this number is associated with this individual still a good indicator? Or should you maybe be a little bit distrustful of this phone, since it could be with a new carrier, and it could therefore be in the hands of a fraudster, or another individual? So this is a very powerful technology, when it's available. Unfortunately, it's not universally available, because it doesn't work over Wi-Fi. But when it's there, it provides a lot of boost to fraud models.
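As a hedged sketch, the carrier signals just described (prepaid status, recent porting, and a name match against the carrier record) might be turned into flags like this. The function name, fields, and the 30-day porting window are all hypothetical:

```python
from datetime import date, timedelta
from typing import List, Optional

def phone_risk_flags(is_prepaid: bool,
                     last_ported: Optional[date],
                     name_matches_carrier: bool,
                     as_of: date) -> List[str]:
    """Illustrative risk flags derived from MNO (carrier) lookup signals."""
    flags = []
    if is_prepaid:
        flags.append("prepaid")            # possible throwaway / burner device
    if last_ported and (as_of - last_ported) < timedelta(days=30):
        flags.append("recently_ported")    # number may have changed hands
    if not name_matches_carrier:
        flags.append("carrier_name_mismatch")
    return flags

print(phone_risk_flags(True, date(2018, 4, 20), True, as_of=date(2018, 5, 1)))
# ['prepaid', 'recently_ported']
```

A fraud model would weigh these flags rather than treat any one as decisive; a prepaid phone on its own is common and perfectly legitimate.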
Finally, in addition to all of these kinds of person-centric linkages between devices and identities, we also do a lot with geospatial information. So, again, going back to the fact that we're very big in the business of advertising, something that's become really hot lately in the advertising space is geospatial information. So I go to some app. Like, let's say I want to get navigation directions, or search for a good restaurant near me, a lot of these free apps that you're using leverage GPS to do that search for you.
They're turning around. And they're selling that geospatial information to the advertising ecosystem. And not only that, but because we may have seen this device on many different IP addresses over the course of several years, we can look up those different IP addresses. And we can look up the geospatial observations that have come from these advertising applications, and tell you, have we ever seen this device or this IP in proximity to the place where somebody said they live, or where they're trying to ship a package?
And so this is another great set of signals that are completely orthogonal to other data that I was mentioning that really use raw signal from the device, or that's captured on a particular IP address, to help to corroborate an identity claim that somebody is making when they're getting ready to sign up for something or transact with your company. So this stuff is super powerful for fraud prevention. In fact, just to give you a little anecdote, this graphic that's on the screen right now, it corresponds to one of my colleagues.
He basically lives in the Bay Area. So we saw a lot of information where he's kind of traveling to and from work from the Berkeley region. But then we also see in this time period, they went down to our offices in San Diego, probably spent some time in the hotel, maybe at a restaurant as well.
And then this is last summer. He actually went up to Portland, Oregon to view the solar eclipse, and then traveled down to Salem, Oregon, which is that dot below, where it was a bit darker, so that he could actually see the eclipse down there. And so this is the kind of information that we're able to collect based on people's browsing habits.
And we don't just treat these data points as a single observation. There are other services that say, OK, for this particular IP address, here's the best latitude and longitude that belong to it. We've found that it's much more powerful to actually treat these data points as a spatial distribution, and give you back information about how all of these collected data points work together to either corroborate or refute a claim of identity.
So if the average distance, or the 25th percentile distance, is 4,000 kilometers away from where a person claims they live, there's good reason to be suspicious. Whereas, if some portion of this distribution, let's say the bottom 10% or 20% of observations, all falls within a quarter kilometer of where the person said they live, well, then it's very obvious that they've been at that location. And there's a pretty damn good chance that they live there. So, again, this is an incredibly powerful source of information for fraud prevention.
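The distribution-versus-single-point idea can be sketched as follows. The haversine great-circle distance is standard; the percentile logic and the quarter-kilometer threshold mirror the example above but are otherwise illustrative:

```python
import math
from typing import List, Tuple

def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometers between two latitude/longitude points."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def distance_percentile(observations: List[Tuple[float, float]],
                        claimed: Tuple[float, float],
                        pct: float) -> float:
    """pct-th percentile of distances from observed points to the claimed location."""
    d = sorted(haversine_km(lat, lon, *claimed) for lat, lon in observations)
    idx = min(int(len(d) * pct / 100), len(d) - 1)
    return d[idx]

home = (40.7128, -74.0060)  # claimed address (illustrative coordinates)
obs = [(40.7129, -74.0061)] * 5 + [(34.05, -118.24)] * 5  # mix of home and travel
print(distance_percentile(obs, home, 10) < 0.25)  # True: bottom of the distribution is at home
```

Even though half the observations here are thousands of kilometers away (travel), the low end of the distance distribution sits at the claimed address, which is exactly the corroborating pattern described above.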
Q & A
So in the interest of time, I'm going to wrap up a little early. I did have some other stuff I was going to talk about. But I think we better take some questions. With that, Brian, anything from the audience?
Yeah, there's been some really just absolutely fantastic questions here. Let me try to get through some of these. So this was a really good question.
Is there enterprise liability regarding customer tracking vs data mining?
So this person asks, this seems to expand outside the borders of data mining into actual tracking, which would be extremely valuable for law enforcement who would definitely need court orders to conduct such tracking. However, what's the liability or legal exposure for the enterprise in terms of tracking, versus data mining? And to what extent is the customer aware of the level of consent that's being given? I know you kind of briefly touched on that now. But I thought this would be a germane question.
Yeah, that's a very good question. So having worked in advertising for a long time, there were certainly major concerns about some of this stuff. It can be very creepy.
The reality is we never give out the actual location of where somebody says they live. We'll just give you data points that are like, OK, yeah, we can corroborate this. You submitted an address. And we can say, yes, at some point we have information that would corroborate that there's some relationship between this device and that address.
Or in the case of the geospatial stuff, we're going to give you a relative distance. Like, oh, we have an observation that is within some radius of where they claim to live. Not, here is the specific address where we've seen them.
On the back end, we do have that data. And I guess that gets into the questions around consent. We're extremely diligent about making sure that all of the data we collect has appropriate privacy policies, or is captured in a way that is friendly to the regulatory conditions in whatever municipality we're operating in. So, for example, if you download an application, let's say to check the weather, and it uses your GPS signal to do that, and then they turn around and sell that to somebody else, the reality is you've actually agreed to terms and conditions that outline the fact that they're going to do that.
You may be uncomfortable with it. But that's how they make their money. That's really the condition of using the different applications.
Now, having said that, we are very involved in a variety of different regulatory groups concerning how data is collected for advertising. We comply with a wide range of different policies around this stuff. And, very importantly, we follow very strict regulations in terms of data governance, in terms of how we actually store this data to make sure that if there ever was a data breach, a hacker couldn't quickly download a database or something that just has all of the information collected together. We go to great lengths to make sure that it's secure, and that various tables that are needed to kind of join this information together to do these kinds of queries are always discrete, and separate, and basically managed in a way that protects the consumer's privacy.
I'm going to get into some of these — I'm going to get to some of these last few questions in a second. But there's one that I thought of, because a lot of the professionals I work with are anti money laundering professionals at banks. And the trend here is kind of convergence, breaking down silos, so that the AML, the fraud, the cyber teams, are kind of working together, rather than kind of working in parallel.
It seems to me that this kind of information could really help in kind of the overall financial crime risk assessment of customers, either kind of getting to see if this person is medium or high risk, or even if you have a customer population that's low risk, you can kind of better prove to the regulators that they're all low risk if you have this information backing up that they've done low-risk things. Like you said, they're not changing phones a lot. They're not going to these high risk areas. It just seems like this might kind of potentially be woven into either the AML side or the fraud side risk assessment to overall justify that customer risk assessment score.
Yeah, I think it's absolutely true. And the way we think about this stuff, we're much more focused on trying to identify safe customers than really tell you that somebody is dangerous. I mean, there are certainly indicators that people maybe are dangerous. Like, we'll see that certain IPs are actually coming from a data center, or an anonymizer proxy, or a VPN network.
We invest enormous amounts of resources into identifying those types of characteristics. But more than anything, we're basically in the business of saying, yeah, this person is who they say they are. And we have data going back years or more that corroborates this identity. And so there isn't much reason to question them, because when you consider the tenure of this particular identity on our network, it's very safe.
It's much more problematic and challenging to say, OK, this person is really, really high risk. We would rather that the organizations interacting with these individuals go in and ascertain that themselves, maybe by flagging the individual for manual review, asking them to provide some documentation, that kind of thing, because we don't want to engage in any sort of discriminatory practices. There's always the possibility that we get it wrong. So we'd much prefer to kind of focus on the cases where somebody is very safe, and provide that kind of information.
No, that's an excellent point. And like you said, it's very nuanced here. It's a very fine line that you have to walk there. But you're clearly aware of the stakes on either side.
Does CPB consider any consumer linkage acquisitions out of scope?
This person asks, and this is related to the linkages you were talking about, and the linkage acquisitions to kind of put all the linkages together. They asked, are any of the linkage acquisitions from consumer protection bureaus, or is that data input strictly outside of their purview?
No, absolutely not. Again, privacy compliance and protecting consumers' rights are really top of mind for us. I mean, that may sound cheesy after I go through this whole presentation, which is effectively about understanding digital identity. But you have to understand that if we get it wrong, there's so much risk of reputational damage that our business would really be on the line. So whenever we're buying data from anybody, or building relationships with publishers, and we work with many of the world's largest advertising publishers, we put them through an enormous amount of scrutiny to validate their practices, and confirm that everything they're doing is above board.
Excellent. We just have a few more questions. I mean, do you have time, Sam, for maybe just one or two more questions? I know we've technically hit the 2:00 PM mark. But we have a very, very engaged audience here. So I would just want to give them as many chances as possible to really engage with you, because this is a very new — again, this is thought leadership that is very difficult to come by.
When I was looking at this presentation, I've been covering specifically financial crimes and compliance for, 8 plus 5, 13 years. And there's things in here that I didn't know about. So that's what I love, is doing presentations where I learn and we get to share that with the rest of the community.
And as he said, there's a lot of things here that can help on the fraud side, on the AML side, on the risk side, on kind of really delving in to see if this person is who they say they are. And that goes specifically right down to financial crime risk. So, OK, so one of the questions here is, so let me get into the ones here.
Which techniques prevent long term attacks?
OK, so, OK, we know that fraudsters, sometimes they'll create these identities and make legitimate purchases to kind of build up a positive reputation before going for some big score. What are kind of the techniques you discussed today to prevent those kinds of long term attacks?
Yeah, that's a really good question. So, certainly, something we see, that merchants in particular suffer from, but also credit bust-out attempts, things like that, is a sort of pattern of behavior where a fraudster will get access to an account, or they'll create a new account. They'll do some very legitimate things that appear entirely fine.
And then at some point, once they've built some trust up, they default. And they run off with a bunch of money. And so merchants suffer from this all the time. Credit card issuers see this a lot.
Somebody will have a week or two of really good purchases before they then go and charge $20,000 on a card. It's a real issue. And it's something that I think the device reputation players really struggle with.
So in our case, we're lucky, because the scale of our network means that, oftentimes, we have information that goes way back in time. And so the way that we look at this problem is that the individual linkages that we provide, the individual data points that we provide, they can show some red flags.
But if they all appear to be good, there's still another dimension that has to be considered, which is the temporal dimension. It's really the tenure on our network. So even in cases where we don't necessarily have the ability to back up a claim of identity corroboration, we can often look at just the tenure of a particular device on our network, and whether or not we've been seeing it for a year or two. And that signal alone is incredibly predictive of fraud.
Because if you think about a fraudster, they're probably clearing their cookies on a regular basis. They're doing things, because they don't want to get caught. They're trying to disguise their identities.
So they're not going to sit around and try to mature an identity by browsing the internet, and just acting normal, and paying their utility bills, and things like that, for like two years before then committing a fraud. They want to be able to open multiple accounts across different service providers, do enough stuff that appears legitimate, and then turn that around as quickly as possible, because that's their business. They're trying to be efficient.
And so the lack of tenure on our network is an indicator of potential risk. And the inverse is also true. The more behavioral data that we have going back, the more likely it is that a given individual is, in fact, who they say they are.
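A minimal sketch of the tenure signal described above, assuming a hypothetical record of when a device was first observed; the 30-day and one-year bands are illustrative, not Neustar's actual thresholds:

```python
from datetime import date
from typing import Optional

def tenure_risk(first_seen: Optional[date], as_of: date) -> str:
    """Rough risk banding by how long a device has been observed on the network."""
    if first_seen is None:
        return "high"    # never seen before: no behavioral history to lean on
    days = (as_of - first_seen).days
    if days < 30:
        return "high"    # fraudsters rarely age an identity for long
    if days < 365:
        return "medium"
    return "low"         # a year or more of normal activity is hard to fake

print(tenure_risk(date(2016, 5, 1), date(2018, 5, 1)))  # low
```

The asymmetry discussed above is worth noting: long tenure is strong evidence of legitimacy, while short tenure is only a weak indicator of risk (every new phone starts with zero tenure).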
That's a very important point of context, everyone. So one thing a bad guy can't do, and this has been brought up by other professionals on webinars where we talk about kind of fraud and digital identities, one thing a bad guy can't do is manufacture years and years of normal things done on the internet — posts from social media, posts from Facebook, things that you've done online that are specifically identified to you. They can't create years and years of that.
How do you understand relationships between devices?
So that's, obviously, a very important contextual point. I guess two last quick questions here, something I just thought of — so you mentioned that before in your example about two people kind of living together, the router and stuff. But I was curious, so your solution has all these different graphs to monitor devices that kind of belong to the same households. But can you go into more detail a little bit about how that works, because I want to know if you see my device with my co-worker, how do you know that we aren't part of the same family?
Yes, that is a very interesting question. So this is all related to kind of the data science of how we construct our identity graphs, and in particular how we do household clustering. So as you mentioned, we work with things that are called device graphs that basically show that particular devices are linked together into what we think of as family units.
And so the key to this is really understanding how humans behave in the real world. So let's say you and your wife and your kids live together. Well, there's a good chance that during the day, you go to work. And so you're active together on the same Wi-Fi router in the morning and the evening, but maybe not so much during the day.
Likewise, you probably travel together. So maybe we see you at the same airport together on the same Wi-Fi network there. Well, that alone wouldn't be a very good signal, because there's thousands of devices going through the airport every day. But if we then see you go to a hotel together, where you're now staying in the same hotel together on the same IP, then that would add additional weight to it.
On the other hand, if it's your co-worker, then we probably see you together on the same enterprise IP. So in my case, it would be the Neustar VPN network. And then maybe let's say you're in sales, you go to some conference, maybe we'd see you together at the same hotel IP.
But what we don't have is all the nights and weekends stuff that would really suggest that you're living the rest of your life together. So parsing out these different signals, and really understanding what a normal lifecycle looks like for a family or for an individual, is one of the really fascinating aspects of the data science that happens behind the scenes in order to make all of this possible.
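The intuition just described, that co-occurrence on a home router at night means far more than co-occurrence on an office or airport IP, might be sketched with context-dependent weights. The categories, weights, and function are made up for illustration:

```python
from typing import List, Tuple

# Illustrative weights: residential co-occurrence at night says far more about a
# shared household than co-occurrence on an enterprise or public network.
CONTEXT_WEIGHTS = {
    ("residential", "off_hours"): 3.0,   # nights and weekends at home
    ("residential", "work_hours"): 1.0,
    ("hotel", "off_hours"): 2.0,         # traveling and staying together
    ("enterprise", "work_hours"): 0.2,   # co-workers on an office network
    ("public", "work_hours"): 0.1,       # airports, cafes: thousands of devices
}

def household_score(cooccurrences: List[Tuple[str, str]]) -> float:
    """Sum weighted co-occurrence events for a pair of devices."""
    return sum(CONTEXT_WEIGHTS.get((ip_type, time_bucket), 0.1)
               for ip_type, time_bucket in cooccurrences)

family = [("residential", "off_hours")] * 20 + [("hotel", "off_hours")] * 3
coworkers = [("enterprise", "work_hours")] * 20 + [("hotel", "off_hours")] * 1
print(household_score(family) > household_score(coworkers))  # True
```

Real household clustering would be a learned model over many more features, but the design choice is the same: the family pair accumulates weight from nights-and-weekends contexts that the co-worker pair never sees.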
What about synthetic identities?
One last question, and then I guess I'll start to wrap things up. And thank you for being so gracious in going past time here, Sam. What about synthetic identities? Is there any way that what you've put together here can ferret out these synthetic identities, which, again, are the bad guys trying to create something that fools people, looks like you, talks like you, and gets access to your money, or money in your name? So that's kind of a big deal.
Yeah, so I think the answer to that really is that it requires a definition of what a synthetic identity is. So there are, conceivably, other types of synthetic identities. But, generally, when people talk about them in the industry, what you're talking about is an individual who has stolen a social security number, typically from a kid, or a deceased person. They've combined that with some made up information, like a fake name, maybe a legitimate address where they're able to capture mail, maybe a legitimate phone number, maybe a legitimate email.
And then they go and they get some loans somewhere. And after getting that loan, or that line of credit, the issuer of that credit line turns around and submits their data to the credit bureau, so that now this made-up identity is suddenly circulating in the creditor files. And because the bread and butter of many of these identity bureaus is basically reselling credit header data, and it's frequently used for fraud prevention, other lenders will query the credit header data when this person then applies for another loan.
And they'll see, OK, yeah, this person was issued credit before. And so they must be a legitimate person. So we'll give them a loan too. And so very quickly, this tends to snowball, to the point where a person has an identity that's fairly well corroborated in the credit header data.
The question is, where else is it corroborated? Are there other signals that would demonstrate that this person is in fact who they say they are? Well, maybe not.
I mean, certainly there are data brokers who are taking some of this stuff and reselling it. And, certainly, some identity bureaus are using these other, maybe weaker, sources in addition to credit header stuff in the data compilation they do that feeds their verification solutions. But, again, going back to what I was talking about with tenure on the network, it's unlikely that somebody is both taking out these fake loans, and then also just going about their daily life reading content on the internet, paying their bills, and so forth, using this fake identity.
And so what we see with synthetic identities is that they tend to have a lot of information that's present in the credit header data, or maybe one or two other sources. But there isn't really a lot of corroboration anywhere else. And there certainly isn't much corroboration on the behavioral side.
And so we're developing models that kind of catch this. Certainly, other vendors are doing the same. And, yeah, we think that the trick is to not take the credit header stuff too seriously. Take it with a grain of salt, and absolutely leverage other sources of identity when you're doing verification or transaction scoring.
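The "breadth of corroboration" idea the speaker describes can be sketched with a toy score. Everything here is hypothetical for illustration: the source names, the weights, and the thresholds are invented, and real models would obviously be far more sophisticated; the only point is that credit header corroboration, which a fraudster can seed with a single loan, gets discounted relative to independent behavioral signals like long tenure.

```python
# Hypothetical source weights: credit header evidence is easy to manufacture,
# so it counts for less than years of ordinary behavioral history.
SOURCE_WEIGHTS = {
    "credit_header": 0.2,   # easily "seeded" via one successful loan
    "data_broker":   0.3,   # often resold from the same upstream data
    "device_tenure": 1.0,   # years of normal activity are hard to fake
    "email_tenure":  0.8,
    "utility_bills": 0.7,
}

def corroboration_score(sources_seen) -> float:
    """Score an identity by the breadth of independent corroboration."""
    return sum(SOURCE_WEIGHTS.get(s, 0.0) for s in set(sources_seen))

# A legitimate identity is corroborated across many unrelated sources.
legit = ["credit_header", "device_tenure", "email_tenure", "utility_bills"]
# A synthetic identity appears mainly in credit header data and its resellers.
synthetic = ["credit_header", "data_broker"]

print(corroboration_score(legit) > corroboration_score(synthetic))  # True
```

In this sketch the synthetic identity can score highly on the credit-header axis and still fail overall, which mirrors the speaker's advice: take the credit header signal with a grain of salt and weight the behavioral side heavily.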
Well, this is — at least for me, this was extremely enlightening, kind of opening my eyes to a lot of things that I hadn't really thought of, just because I talk to a lot of compliance professionals, like I said. And they're in some ways a little bit limited in what they have direct access to for these risk assessments. But what we've talked about today, that's a whole other layer of really being able to find out: is this person who they say they are?
Are there kind of strange things that have happened tied to their name? Which, with all the data breaches and things going on, is more and more a realistic possibility. And that's something you absolutely need to check for across a wide variety of risks.
So everyone, I think we should — we're going to wrap this up here. We owe Sam a huge round of virtual applause for taking his time to share this information with everyone. This is very dense, very interesting, and not an easy subject to talk about.
But I, frankly, think he just killed it, because I really enjoyed it. If you have any more questions, like I said, it's tough to get all the answers in just an hour. And Sam was gracious enough to go over time to help answer some of these.
But if you want to chat with him directly, his email is right there, firstname.lastname@example.org. Now, a last bit of housekeeping before we wrap this up. Several of you have asked how you can get access to the slides and the recording, because this is very dense material and you'll want to go over it yourselves.
If you're an ACFCS member, you get this automatically as part of your member benefits. This usually takes about a week. We'll get them on the site, you'll get a notification, and you can check them out.
If you are not a member, we still want to give you access to this. We're not purely pay-to-play. We care about sharing thought leadership so you can better arm yourself to detect and prevent financial crime.
So, my email address is email@example.com, and you can reach out to me directly. In the chat box, I sent a link for a free 30-day membership, so when you get that membership you have full access to the site for 30 days. You can download not just this presentation.
But you can also listen to our entire catalog of webinars, covering pretty much every area of financial crime. And as I said, this is very, very important: I talked about the survey that we have. Please do fill that out, because we are here as an association to serve you. What are your issues?
What are you worried about? What are the vexing challenges? What are the gray areas where you need guidance, where you need thought leadership? What do the regulators mean? What are the bad guys doing?
That is what drives us as an association to help better arm you to detect and prevent financial crime, not just be in compliance with the regulators, but truly get relevant, timely intelligence to law enforcement to make sure you have a bulletproof compliance program. So everyone, like I said, reach out to me if you want to get the slides and you're not a member.
Again, thank you, Sam. Thank you, team Neustar. This was a fantastic presentation. And I look forward to seeing all of you at the next ACFCS webinar or live event. Thanks again, everyone, and hope you have a great rest of the day.