Digital Identity: Fighting Fraud With Next Generation Device-To-Identity Signals
Is the person behind the device truly who they say they are?
If a business doesn't invest in fraud prevention, it opens itself up to financial losses and reputational damage. Why does this matter? Because fraud has a direct impact on your relationship with your customer, specifically the trust that customers have in your business.
As cybercriminals continue to become more sophisticated, businesses require authoritative identity signals to spot fraud quickly while letting legitimate consumer interactions through faster. In this webinar, industry experts discuss how using next-generation identity linkages allows businesses to intelligently and reliably sort users into high- and low-risk buckets.
Now without further delay, let's begin today's event, sponsored by Neustar and hosted by American Banker. I'd like to introduce your moderator for today, and that is Mike Sisk. Mike, you have the floor.
Thanks very much, Adam. And I would like to welcome the audience once again. We are very grateful that you've chosen to share some of your busy day with us. We know that your time is valuable and we will honor that today with what I'm confident will be an engaging 60 minutes or so of discussion. Once again, our topic today is Digital Identity, Fighting Fraud with Next Generation Device-to-Identity Signals.
And my name is Mike Sisk. I will be your moderator today. I'm a contributing editor at American Banker, and my articles have also appeared in Barron's, Crain's New York Business, Inc, Institutional Investor, strategy+business, and [...]. And I am very pleased to introduce our two speakers today.
Sam Jackson is Director of Product Management, Risk Solutions, at Neustar. Sam is a technologist and entrepreneur with a background in identity services and ad tech. As one of the product leaders for risk solutions at Neustar, he leverages industry leading identity data, real time signals, and machine learning to bring world class anti-fraud solutions to market. And we're very pleased to have Sam with us here today.
And joining him is Merritt Maxim, VP Research Director at Forrester, serving security and risk professionals. Merritt covers identity management and access management, including user provisioning, access governance, customer identity and access management, identity as a service, single sign on, user directory infrastructure, and internet of things security. His research builds on his 20 years of experience helping security leaders at global enterprises drive optimal business value from their identity and access management initiatives for employees, partners, and customers. And he also co-authored the book Wireless Security. And we're very, very pleased to have Merritt here as well.
And just before we jump into things here, I wanted to reiterate one thing that Adam mentioned at the top. And that is we do have time at the end for Q&A. We usually set aside 10 or 15 minutes. You can put those questions in the queue throughout the hour as they occur to you. We encourage you to do that.
And sometimes we don't get everyone's questions, sometimes we run out of time, sometimes it's just a matter of a question being something that's better addressed one on one afterward. But for whatever reason, if we don't get to questions you've asked in the time we have today, we will definitely follow up [afterwards.] So your voice will be heard. Please ask away, we very much want to hear from you.
And with that, I think I'm done with my little opening spiel here. And I will officially turn things over to Merritt to get things rolling. Merritt, the floor is yours.
Hi. Great, thanks, Mike. And good afternoon or good morning to those of you joining.
This is Merritt Maxim from Forrester. And I'm going to spend the next 20 minutes or so just giving you a perspective from Forrester or from our interactions with clients about what are some of the key things that are driving issues around fraud and digital identity and how they're addressing those. And then I'll pass it over to Sam, who'll talk a bit more about how Neustar addresses some of these challenges.
So, it goes without saying that today digital approaches are everywhere in every industry. And digital transformation is a big term, but it's an important concept that companies, particularly in financial services but in other industries too, are rapidly engaging with to help reinvent their businesses and better serve their customers. And if you look at the next slide here, when we actually look at what's driving digital transformation, some of these are kind of interrelated, whether it's cost reduction or customer acquisition. But you do see that customer experience ranks rather high. And this is something that we see quite a bit in the financial sector as companies move to more and more digital interactions with their customers and fewer in-person, branch-type interactions.
How that user experience is designed in a way that is easy to use, but also doesn't compromise on security or fraud, is a real challenge. It's something that companies in banking are investing quite a lot in. And that comes out in this survey question. And it's a trend that we'll continue to see, because customers' expectations are changing and [...] responses as well. There's no single point in time where they're done in terms of what they view as the optimal user experience. Their expectations are always changing and therefore, you always have to be evolving along with those to make sure that you continue to meet the needs and doing your best to interact with your customers.
When we look at this next slide here, we live in an omnichannel world today. And banking is a good example of that, where the days of the physical teller being your primary form of interaction are largely going away now. We have mobile check deposit, and short of going to the ATM to get actual cash, most banking transactions now, even applying for a loan or applying for a credit card, can all be done through digital interactions and touch points. And in a large financial institution that has different lines of business that are run by different people, building that user experience becomes very important.
And the challenge here is that consumers are creating very complex journeys. So maybe they start the loan application on their mobile device during their commute home on the train, and then they want to pick it up maybe later in the evening at home on their home computer. And so it is an omnichannel world, and you can't assume that the customer's journey is only going to be consummated on a single channel.
And so this creates real pressure on making sure that you've got not just an easy-to-use experience but one that is consistent across all channels, because your users are creating different touch points. At the same time, each of those touch points is now a potential fraud vector that can be compromised to conduct fraudulent activity. And so the onus on the financial institution is not just to design optimal experiences but to do so in ways that don't compromise on security or fraud, which is a real challenge.
Digital Privacy Regulations
And at the same time, you may have heard of the EU requirement called GDPR around personal privacy and consent. We now have a version in California called the CCPA, which is an equivalent regulation, and other states in the US are considering similar user privacy initiatives. That creates yet another new requirement: you need to have explicit consent, and you need to give the user the ability to change or adjust their consent at any moment in time.
And in CCPA and GDPR, the consequences for non-compliance can be quite significant. GDPR actually allows for companies to be fined up to 4% of their global annual revenue. Now, we haven't seen fines sanctioned at that level yet, but that's in the law and that is a possibility. So there are consequences if you haven't thought through the privacy and consent angles appropriately as well.
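To make the 4% figure concrete, here is a minimal sketch of the upper-tier ceiling in GDPR Article 83(5), which is the higher of EUR 20 million or 4% of worldwide annual turnover. The EUR 20 million alternative comes from the regulation text rather than from this talk, and the function name and sample figures are purely illustrative. Actual fines are set case by case and are usually far below this ceiling.

```python
# Upper-tier ceiling under GDPR Article 83(5): the higher of EUR 20 million
# or 4% of total worldwide annual turnover. The function name and the sample
# turnover values are hypothetical.
FLAT_CAP_EUR = 20_000_000
TURNOVER_PERCENT = 4


def max_gdpr_fine_eur(annual_turnover_eur: int) -> int:
    """Return the maximum possible upper-tier fine in whole euros."""
    percent_cap = annual_turnover_eur * TURNOVER_PERCENT // 100
    return max(FLAT_CAP_EUR, percent_cap)


large_company_cap = max_gdpr_fine_eur(1_000_000_000)  # 4% exceeds the flat cap
small_company_cap = max_gdpr_fine_eur(100_000_000)    # flat cap applies
```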
And [...] to that point, and hopefully you can see the specifics on the slide, what we also understand as we move through the digital touch points is that the customer identity now comes in many different varieties. In the old days, you had a physical person present themselves in the branch. Now you may have a cookie, you may have the app on a mobile device, and maybe there's some fingerprinting that you do on the device to confirm that it really is the user. And this just gives you a sense of some of the very different types of identity forms that a given institution may collect over time. And the challenge for many companies is how do I do what's essentially called identity resolution.
How do I know that user name jsmith1234 is also John Smith who lives on Garden Street in this town? It's how do I resolve all these different attributes down to a single identity. That can be very challenging in a very diverse environment, but it also has an impact on user experience, because the user has the expectation that their institution should know that jsmith1234 is also John Smith. And so when they call the call center or they authenticate into a new app, they shouldn't be forced to re-register or answer a bunch of questions, because their assumption would be this institution should already know who I am. And so this diversity of the components of customer identity creates a lot of challenges in making sure that you're able to get a single holistic view of the customer.
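The idea of resolving many identifiers down to one identity can be sketched as grouping identifiers that have been observed together. This is only an illustration under simple assumptions: the identifier strings, the `links` data, and the `resolve_identities` function are hypothetical, and a production system would also weigh how trustworthy each link is.

```python
from collections import defaultdict


def resolve_identities(links):
    """Group identifiers that are linked, directly or transitively."""
    parent = {}

    def find(x):
        # Follow parent pointers to the representative of x's group.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # shorten the path as we go
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two groups

    groups = defaultdict(set)
    for identifier in list(parent):
        groups[find(identifier)].add(identifier)
    return list(groups.values())


# Hypothetical pairs of identifiers seen together in the same interaction.
links = [
    ("username:jsmith1234", "email:john.smith@example.com"),
    ("email:john.smith@example.com", "name:John Smith, Garden Street"),
    ("device:abc123", "username:jsmith1234"),
    ("username:mjones", "device:zzz999"),
]
identities = resolve_identities(links)
```

Here the user name, email, postal name, and device all collapse into one identity because each shares a link with another member of the group, while the unrelated user stays separate.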
Corporate Digital Information
At the same time, this applies more on the enterprise side, we are in a BYOD world. And we are in a mobile world, in terms of how users are interacting with all the services they use in the course of not just their personal but also professional lives. And so this creates new challenges in terms of what an enterprise can mandate or require on an actual device.
In the days when the company handed out BlackBerrys to employees, and they owned and paid for those BlackBerrys, you had more control over what the users were able to do. Now, when you're essentially allowing anyone with a properly equipped smartphone to access your systems, that creates new challenges. And [...] will resist if they feel that the requirements placed on them are too onerous, and they may actually decide not to even use their personal device with corporate information. So you have to be aware of the BYOD implications as well.
So when we look here, customer expectations are changing. And when we actually look at what's in the fraud space, we've got a range of different types of payments that are [gone.] And certainly, the newer generation is expecting faster payments processing, but they're also using different payment mechanisms. My children use Venmo. They have bank accounts, but Venmo is their primary mechanism by which they share and disburse money across colleagues or even to purchase items.
And that's a very different dynamic than what people of my generation were brought up on. And so as these expectations are changing, this creates a different challenge as well. And so when we look at the user experience and the implications there, you need to be aware of how this applies, and there are more and more channels now that need to be monitored and tracked from an overall fraud perspective.
So I think when we talk about the user experience, the good news is that consumers do care about security. And part of that is a function of the different breaches that they've encountered over the years. And they're now more aware of security problems than they have been in the past and therefore generally probably practice better online security hygiene and also are more mindful of sites or services that they think do a poor job on security versus those that do a good job. And so the increased caring about security is an advantage for institutions. If they can demonstrate a strong commitment to security, it can actually be a potential differentiator against other institutions, a way to attract customers.
But when we look at fraud, it's ultimately about the balance between the three Fs. You've got the actual fraud management process itself. You have friction, which is the customer satisfaction and user experience side: how do you design this in ways that don't unnecessarily burden the end user? And then you've got the false positives, which is a scenario that you try to minimize because it creates a bunch of headaches from both a customer satisfaction and a support standpoint. And so fraud management is ultimately about balancing these three Fs out.
And it goes without saying, I don't need to tell this audience that fraud affects a bunch of things in terms of profitability, reputation, and compliance. And there are consequences if you're not doing a good job with that, which is why companies continue to invest a lot in fraud and why cyber criminals continue to try to find ways to subvert that. And when we actually look at this survey here, at what some of the higher priorities are within the financial services and insurance sector, fraud management ranks very high. And I think that's a reflection of the continued sophistication of fraudsters and how they continue to try to leverage new attack vectors to better commit fraud, whether to gain access to data that they can use for identity theft down the road or to actually collect money that they can then convert into gift cards or other things that they can use to purchase other items.
Fraud has traditionally always been led by the financial services sector, but fraud affects other industries as well. I certainly interact with lots of clients in other industries, ones that are, say, in retail or e-commerce, and they face many of the similar problems. Whether it's a credential stuffing attack, an account takeover attack, or actual fraudulent attacks at the point of sale, all these things are scenarios that they are aware of.
And so while banks are probably still the prime target, many industries now have a fraud problem and are trying to figure out how to best manage that. And this is a problem that's only going to persist because cyber criminals are sophisticated and they will go where they believe defenses are weak. And if they believe a certain industry or type of use case is vulnerable, they will apply their resources to try to compromise that to the best of their ability.
Understanding the Fraudster Mindset
You also have to think about fraudsters. It's kind of facetious, but they don't have to deal with compliance or rules or regulations. And so their sophistication really knows no bounds in terms of the types of attacks that they can commit and their ability to actually go after things. And of course, in many cases, particularly if it's [...] phishing scheme, where they're trying to capture a credential, they only need one user to click on that link. So they can send out millions of emails, and as long as they get someone somewhere to click on a link, they can load that payload and then create further damage.
And so the bar is set very high to protect [...], whereas in the case of the fraudsters, they only need a very, very minuscule percentage of people to take the inappropriate action and they can get access to systems. And so when we look at fraud, account takeover and new account fraud are increasing, but when you look at survey statistics, card not present continues to be the dominant vector. And, again, that's primarily in the e-commerce space where it's a digital transaction. I don't physically see you, I can't actually verify that you physically are in possession of the card, and I'm just looking at the number that's presented to determine if it's appropriate, and if it is, I'll allow the transaction to proceed. So this is an ongoing challenge.
I think we do see a growing interest in mobile fraud as well, given the push to more transactions over the mobile channel. And this is just a comparison here with traditional desktop fraud, where fraudsters have a heavier involvement. And again, it's just comparing some of the differences in terms of what can or cannot be done in terms of defenses. And this just reflects the challenges there, and it's part of the reason why you see that push to mobile fraud, which will only continue going forward.
Another technique that is emerging here is card skimming, which you've probably seen in the news when people install a skimmer at a physical point of sale like a gas station pump or an ATM, where they're using that to collect the [mag stripe] information that they can then use for later theft. But there now is the equivalent of an online card skimmer. Magecart is an example of that. And last year, several high profile e-commerce vendors were actually victimized by this.
And so you can't just assume that because you only have digital transactions you're immune to a card skimming attack. And in this case, the malware's placed on a checkout page and is used to capture credit card data, just like a skimmer at a physical point of sale terminal. So you need to be aware of that and understand that that will continue to be a problem that will plague organizations going forward.
And again, cross channel fraud, this is the kind of model where it's not necessarily just purely on one channel, it may be a mix. I will tell you that the call center continues to be an area where [...]. I think growing interest or concern in fraud, and that's because, again, the fraudsters have identified that the call center is an interesting target, purely because I can use that, victimize that side, purely with social engineering techniques. I don't need any fancy malware.
All I need is the phone and a kind of soothing voice. If I get a reassuring agent on the phone and tell them a sob story about my sick aunt who is in a coma, and say I need to transfer money from her account to pay the doctor's bills, then if that agent is sympathetic, I may be able to be successful with that.
And again, this isn't necessarily about stealing millions of dollars at a time, but I can do a few thousand dollars here or there, and over the course of a week, I can actually generate significant money doing that. And again, this is social engineering that doesn't require a lot of sophistication but can still be a very successful attack. And that's a vector where, again, we've seen growing incidents, because the criminals have realized that that's a very useful approach, particularly if you've got a good phone demeanor and are able to convince the agent to take a very specific action.
So when we look at what some of the actual challenges are on the data integration side, there's the growing need for accuracy. But with that comes the need for more data sources. And so this means there's a need to create different graphs. And this is where linking device data with the transaction, say, if it's a mobile transaction, whether it's the number or specific hardware information about that device, gives you better insight and confidence that this really is John Smith's phone that he's connecting with, and that it's the same phone he's used to connect with us in the past.
And so building out these profiles helps you better detect both anomalous and fraudulent transactions. And anomalous is often where you'd see that false positive scenario emerge, where the user is in fact still legitimate but, for whatever reason, something is different: let's say they're in a different geography, it's a different time of day, it's a different type of transaction. And in a normal fraud model, that might get denied or flagged as being anomalous.
But if you have other data about that transaction, like the device or other user information, you may be able to say, well, this looks like the same user, and it just so happens that they're in a different area. So we may still review it, but we're not going to block it outright. And that can therefore improve overall customer experience, because if the user really is trying to do that transaction in a remote location, they don't necessarily automatically get blocked.
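The review-instead-of-block logic just described can be sketched as a small decision function. This is a hypothetical policy for illustration only: the `decide` function, its two inputs, and the three outcomes are assumptions, not any vendor's actual rules.

```python
def decide(anomalous: bool, known_device: bool) -> str:
    """Return 'allow', 'review', or 'block' for one transaction.

    anomalous: the transaction differs from the user's usual pattern
               (new geography, odd time of day, unusual transaction type).
    known_device: the device matches one previously linked to this user.
    """
    if not anomalous:
        return "allow"    # consistent with past behavior
    if known_device:
        return "review"   # unusual, but the device ties it to the same user
    return "block"        # unusual and nothing links it to the user


# A traveling customer on their usual phone is reviewed, not blocked outright.
traveling_customer = decide(anomalous=True, known_device=True)
```

The design point is that the device link changes the outcome for the anomalous case only, which is exactly where false positives tend to come from.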
Importance of Machine Learning
We can't really talk about fraud management without talking about machine learning. It is very much a part of where fraud management is headed in the move away from purely rules based models to ones that use machine learning and AI to sift through the large amounts of data to have more predictive and more evaluative approaches to fraud management. And there are obvious benefits here. Just understand, when we talk about [...] that you do need the models, but they do need tuning in real time based on changes to your environment.
And with the continued tuning and moving towards machine learning, you can get to a point where you can start to de-emphasize and [...] traditional static rules based approaches that you may have used in the past. But you also need to be wary of training data, because a lot of these systems require input of training data, which is generally historical data that you have. That can be useful as an initial step, but remember it reflects what's happened in the past, and your user's behavior [...] may not be consistent in the future with what it was in the past. So over-reliance on the training models can create issues in implementation in terms of yielding results that may or may not be consistent with actual behavior. So that's another thing to be on the lookout for.
I'm not going to go through each one of these. This is just a summary of a report that we did early this year in looking at what are some of the top trends shaping fraud management. Some of the things, I've already covered here, whether it's around machine learning, the user experience aspect. We haven't really talked a lot about blockchain.
That is a technology that is not fully in production yet, but it is certainly an interesting technology as a way to potentially alleviate some of these concerns, for example a distributed digital identity where you assert certain attributes and information about you, and it's not repeatable, so it can be verified by a provider to ensure that it truly is you. That's something that banking institutions and others are paying attention to. We're still probably a few years away from seeing that at broader scale, but that is definitely a trend that I think has real impact in fraud management going forward.
When we look at what some of the vertical markets are that are driving fraud management, in addition to the traditional banking sector, which has always been the lead market, we talked about e-commerce sites. Peer-to-peer payment providers have been victimized by fraud as well.
With telcos, your wireless carrier may increasingly support mobile commerce and payments, and so they are now a potential target for fraud. And you can also look within your supply chain, if you're a manufacturer. Procurement fraud or things like that are a real problem. It doesn't get publicized at the same level as the consumer [stuff,] but that is something that companies with large supply chains are trying to get a better handle on as well.
Flexible and Collaborative Fraud Management
And fraud management doesn't need to be thought of purely as a defensive mechanism. It does give you more ability to be agile and responsive to changes in the business environment. So as you have better understanding of your users and their transactions, it allows you to potentially enter new markets that may have been deemed higher risk in the past and not necessarily attractive to you, and it does give you that flexibility. It can also better allow you to demonstrate compliance with regulations, so it's not purely a fraud case. So it's important to think about fraud management not purely as a preventive tool but as something which, if done correctly, can really promote overall agility and flexibility across your organization.
I think it goes without saying that fraud management is a [...] collaborative approach. You certainly need the leaders to drive the fraud management, but you need to have linkages and connections with, say, people in the business, people in marketing, to understand the requirements and how that may affect user experience or other initiatives you're working on that may suddenly change the fraud metrics you're collecting. So you need to be aware of that, and you need to be potentially working with other security pros around the network or even with some of your people who are doing the application development themselves. Whether it's cloud-based or mobile-based, it's about understanding what kinds of potential fraud mechanisms or signals could be leveraged out of that. So again, it improves your overall fraud detection capability once this stuff goes live.
To close it out with some of the recommendations, it goes without saying, data is a big part of it, particularly when it comes to training the systems and understanding how to build these models. The good news is most of you have that data available, so you can actually use that to assess where you're at. And then you need to integrate that data and understand how to leverage that across the business. You should also be looking at other things like risk based authentication, where, again, you're taking characteristics about the device and you're making a decision at runtime.
Essentially, you're generating a risk score, saying, is this a device and an activity that's consistent with what this user has done with us previously? And if the score is consistent with past interactions, you may just allow the transaction to proceed. But if for some reason the interaction shows a slightly elevated risk score, maybe at that point you ask for some secondary response. Maybe you send a text to the device asking the user to confirm some information, send an email, or maybe use knowledge-based authentication. There's a range of options there that risk based authentication can provide that gives you higher confidence that the user who's interacting is in fact authorized or verified and therefore not fraudulent.
And I think the [...] piece I want to close on here is, I think, things that we've touched on. Fraud management is not purely just for banking institutions. Certainly, banking institutions leverage a lot out of it, but there are plenty of other organizations as well. If you are doing online business, fraud management matters to you. And many organizations aren't looking purely for a data feed, but they're actually looking for an overall system that can help them take that data feed and allow them to make better decisions in their business.
Because if they're able to do that, then they're better able to serve and retain their customers because it means they're going to design experiences that are consistent with what the user's expectations are. And at the same time, users are getting savvier about online security and are looking for services that they think are better protecting them and their data from misuse. And so there's an opportunity to take advantage of that growing consideration to make sure they're on board with that.
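A minimal sketch of risk based authentication as described here might look like the following. The signal names, weights, and threshold are all hypothetical values chosen for illustration, and real systems typically derive scores from models rather than fixed weights.

```python
from dataclasses import dataclass


@dataclass
class Signals:
    """Runtime observations about one interaction (all fields hypothetical)."""
    known_device: bool
    usual_location: bool
    usual_hours: bool
    typical_amount: bool


# Points added to the risk score when a signal does NOT match past behavior.
WEIGHTS = {
    "known_device": 40,
    "usual_location": 25,
    "usual_hours": 10,
    "typical_amount": 25,
}
STEP_UP_THRESHOLD = 30  # illustrative cutoff for asking a secondary response


def risk_score(signals: Signals) -> int:
    """Sum the weights of every signal that deviates from the user's history."""
    return sum(w for name, w in WEIGHTS.items() if not getattr(signals, name))


def authenticate(signals: Signals) -> str:
    """Allow low-risk interactions; ask for a secondary factor otherwise."""
    if risk_score(signals) >= STEP_UP_THRESHOLD:
        return "step_up"  # e.g. text message, email, or knowledge-based check
    return "allow"


usual = Signals(True, True, True, True)
new_device = Signals(False, True, True, True)
```

With these assumed weights, an interaction at an unusual hour alone stays below the threshold, while an unrecognized device triggers the secondary check.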
Poor Fraud Management is Costly
And lastly, not to end on a somber note, organizations that struggle to comply with these kinds of requirements do run a risk, not just necessarily of a breach but of negative customer perception about usability and overall ease of use, which over time can lead to reduced customer retention. And that actually costs the company money because that means that valued customers are walking out the door and you're losing the opportunity to engage with them. And so that's the benefit of looking at fraud management and applying those types of approaches to better protect yourself.
I realize I've gone through quite a bit here, but I hope this was helpful. I will be available to entertain questions. But in the interest of keeping us on track, I want to pass on over to Sam at Neustar. He can talk to you a little bit more about the Neustar approach.
But certainly, thanks for your attention and time today. And as I said, we can take some Q&A at the end. So thanks, again.
Great. Thank you, Merritt. My name is Sam Jackson. I am [...] management for risk solutions at Neustar, and in particular I oversee the development of new fraud prevention solutions mainly that are aimed at digital channels.
So Merritt did a great job of giving you an overview of fraud today, specifically the rapidly growing threat of fraud and some of the tactics that are emerging to deal with that threat. I'm going to talk about the evolving landscape of digital solutions and how brands can leverage identity in particular to fight fraud. And so I think all of us are familiar with the underlying key pain point that we address as fraud professionals, which is basically in one form or another the question of whether or not the person behind the device is truly who they say they are. And clearly this has important implications for operational efficiency and customer friction and allowing as many good transactions through as possible so that your bottom line continues to improve.
New Dimensions of Fraud Compliance
But it also has new dimensions for compliance, particularly with the emergence of CCPA in the United States and GDPR in Europe. And of course, as the attacks in digital channels scale up, there's more and more risk that if we don't mitigate fraud our bottom line is going to suffer from it. So as I said, again, I think the key question in digital channels is really identifying the person behind the device or assessing risk that might be associated with them.
Over the last few years, the last decade in particular, a very common approach emerged for dealing with this particular problem, and that's what's known in the industry as device reputation. So today's solutions are more advanced, but [...] time, device reputation really meant deploying device fingerprinting at scale. Which is basically a way of creating long term identifiers for devices so that if you see a device on a recurring basis you can identify it from one interaction to the next. And then using a network of customers, a co-operative, to collect feedback data about which devices have engaged in fraud, and then leveraging that data to yield some sort of risk score so that other customers in the network, if they see that device again, could have some sort of inkling of the threat level around that device or the actor behind that device.
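The device reputation approach described above has two parts, a stable device identifier and a shared pool of fraud feedback, and can be sketched roughly as follows. Everything here is an assumption for illustration: the attribute names, the hashing scheme, and the `ReputationNetwork` class are hypothetical and much simpler than commercial fingerprinting.

```python
import hashlib
from collections import defaultdict


def fingerprint(attrs: dict) -> str:
    """Derive a stable identifier from device attributes (hypothetical scheme)."""
    canonical = "|".join(f"{key}={attrs[key]}" for key in sorted(attrs))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class ReputationNetwork:
    """Shared feedback from cooperating customers about devices they have seen."""

    def __init__(self):
        self.seen = defaultdict(int)
        self.fraud = defaultdict(int)

    def report(self, device_id: str, was_fraud: bool) -> None:
        """Record one interaction and whether it turned out to be fraud."""
        self.seen[device_id] += 1
        if was_fraud:
            self.fraud[device_id] += 1

    def risk(self, device_id: str):
        """Fraction of reported interactions that were fraud, or None if unseen."""
        total = self.seen.get(device_id, 0)
        if total == 0:
            return None
        return self.fraud.get(device_id, 0) / total


device = fingerprint({"os": "Android 9", "screen": "1080x2160", "tz": "UTC-5"})
network = ReputationNetwork()
network.report(device, was_fraud=True)   # feedback from one member
network.report(device, was_fraud=False)  # feedback from another member
```

A member that encounters the same fingerprint later can call `network.risk(device)` to get a rough threat level, while a device nobody has reported yields no score at all.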
And this has been a really popular approach, and in recent years it's driven a lot of consolidation in the fraud prevention space. In particular, last year we saw two of the largest identity bureaus, LexisNexis and TransUnion, go out and purchase device reputation vendors. LexisNexis, for example, bought ThreatMetrix for north of $800 million. Obviously, that's a huge acquisition, and it tells you a lot about where the space is headed.
So if you think about these two companies together combined into one, LexisNexis is really an authority on physical identity — who people are in the real world, name, address, phone number, social security number, that kind of thing — whereas ThreatMetrix has this great network for collecting information about the risk of devices. And so when you bring these companies together, you see that there's probably a real strategic decision to try to leverage those physical identities in conjunction with digital identity to bring together some sort of holistic solution that really unlocks the power of identity in digital channels. And certainly they have their great offerings for this.
Focus on Digital Identity
At Neustar, digital identity has really been a major component of all of our offerings going back for decades now. We do a lot of work in advertising, digital advertising, we do a lot of work in the Telco space, [...] do a lot of work with cyber security. And many of our different offerings really revolve around identity resolution, and that's kind of the thing that helps stitch our different verticals that we operate in together.
And so in the fraud space, because we have all of this identity information because we've been leaders in identity resolution for so long, we are really uniquely positioned to answer the question of whether or not consumer identity data, physical information, can be tied back to a particular device. And so unlike reputational vendors which are more focused on the prior activities associated to a device that may or may not have been fraudulent, we are really trying to answer the question of who owns this device at the scale of the internet. Because we think that if you can get that right, then you can deal with a lot of fraud problems in a friction free, easy way that is silent and invisible for the customers and also mitigates the risks that these new threats [...] to bankers and other companies.
History of Neustar's Security and Risk Solutions
So I'm going to go into a little bit of background on who we are just to give you a bit more insight into how it is that we have these capabilities. As I mentioned, for a long time now, we've been leaders in the field of responsible identity resolution. We were founded in 1996 to fulfill a telecommunications contract, and since then we've gone out and purchased all kinds of different smaller companies in a variety of verticals. And we now operate primarily in digital advertising, cyber security, risk, and telecommunications. And most of the Fortune  brands are our customers, and certainly, the vast majority of the top 100 financial institutions are our customers.
We have offerings in digital advertising that really are oriented around the measurement of physical identity in conjunction with the device. So has the household seen ads on a smart television, which then led them to make a purchase on a tablet? If I have seen this series of ads across a variety of different devices, how can I stitch those together to measure the efficacy of that campaign so that we can help advertisers readjust their spend in the future? And then we also offer a variety of services around targeting and what we call onboarding, where we can actually take a file of physical identity information — so, name, address, phone number, that kind of thing — and match it to a cookie pool so that brands can share their customers advertisements on digital channels even if they don't already have that relationship to them.
On [...] the security front, we offer a lot of different services for things like DDoS mitigation. But importantly, in terms of having a lot of data around identity resolution, we also operate about 10% of the world's DNS networks. Meaning that we have unprecedented insights into the behavior of different IP addresses, which is obviously very valuable for fraud prevention. And on the telecommunications side, we operate the majority of the caller ID network in North America, and so that gives us a good set of insights into the relationship between telephone numbers and physical identities.
All of this is brought together in what we refer to as the OneID System, which is this really groundbreaking, unique repository of both physical and digital identity. It consists of records about the entire population of North America — or rather, the adult population of North America — [those] at the individual level and household level, as well as an enormous amount of impression data that comes in from our advertising network, such that we also see a lot of behaviors of these households and individuals on the internet. And it's all updated in near real time, large portions of it are rebuilt every 15 minutes. And it consists of really vast troves of information that help with a variety of different problems tied to identity resolution.
Improved Identity Resolution
So what does it mean when we talk about identity resolution for fraud prevention? Well, the most traditional and typical form of identity resolution that we see in fraud is what's known as verification. So typically, if we're talking about offline verification — which is, again, verification of the physical identity — a brand will query us with the name and the address, the phone number and email, maybe some other identifiers, and we'll look into the OneID repository and see how much information we have that corroborates the linkage between those different identifiers.
Does this telephone number go with this address, or has this name ever been seen in conjunction with this email? And also other dimensions, too, that are more related to quality. Like, have we seen this name and phone number together from authoritative sources that are unlikely to have been tampered with, or are those relationships maybe a little bit more spurious? Or are there negative relationships such that those identifiers are actually tied to other individuals? Similarly, when we're talking about the physical world, phones are a big part of that, and so we have a lot of metadata about telephones and their behavior.
So is a phone mobile? Has it been prepaid? Is it VoIP? When was it last ported? These are important questions for understanding the risk level associated with the linkage between a telephone number and a broader identity.
So in recent years, I have been particularly focused on trying to produce similar services to this for digital channels. And so I'm going to talk a little bit about how it is that we can actually do this online to offline verification at scale. So in order to answer that question, I need to go into a little bit of detail on how it is that we collect data and a bit about how the advertising ecosystem works in this regard.
One of the things that we do in our interactions with ad tech vendors and our customers and publishers of advertising is [via] large volumes of what we refer to as linkages. Specifically, linkages between physical [identities] and device identities. So it may be that you go to a website to watch some video, you find that it's behind a paywall, or maybe you're just asked to register to see some content.
Oftentimes, the websites that host those types of paywalled content are also in the business of monetizing data, by turning around and sharing the identity of their users with certain respected data aggregators who use that information primarily for the measurement of [...] advertising. So you could imagine watching a sports video, entering your email on a major website to that effect, and then having that email associated with different identifiers that are tied to your device, like a cookie ID or a mobile device ID or, certainly in pretty much every case, an IP address. And furthermore, if we're able to get individual resolution between a particular device and a physical identity, we can often do household resolution between IP address and household address.
Examples of Offline Identity Verification
So I'll give you a couple of scenarios that help to illustrate this. So in the case of a direct linkage, we may have somebody — we'll call them John — who goes and they want to watch some sports programming, so they log into their favorite website. The website's controlled by a major advertising publisher.
The advertising publisher has basically agreed to allow us to set a cookie on their device and collect a [hash] of their email address or maybe other PII. In that way, we're able to establish a direct linkage between their physical identity and their device identity. And furthermore, we have a whole lot of evidence in the OneID system that ties that specific email address to a physical address as well as a phone number and other things. And so over time we're able to build up a relationship between John's complete identity, including physical address and the devices he uses to browse the internet.
Now John may live with somebody else who might be more privacy conscious (we'll call her Meredith). Meredith perhaps uses an iPhone and therefore it's harder to track because she's on Safari browser that doesn't support cookies. Maybe she uses throwaway email addresses when she's accessing content. You would think that she would be largely invisible to us, but because she shares a router with John when they're at their house and because the router has a relatively stable IP address that's not associated with a whole lot of other traffic, over time we're able to infer a linkage between John's household address and her particular device [...] their IP address. And so this is the difference between direct linkages and inferred linkages.
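The John and Meredith scenario can be sketched in a few lines of Python. This is only an illustration of the direct versus inferred distinction, not Neustar's implementation; the record layout, the identifiers, and the rule "infer only when an IP maps to exactly one household" are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical observations: (device_id, ip_address, hashed_email or None).
observations = [
    ("john-laptop", "203.0.113.7", "hash:john"),
    ("meredith-phone", "203.0.113.7", None),
    ("meredith-phone", "203.0.113.7", None),
]

# Hypothetical lookup from a hashed email to a known household address.
email_to_household = {"hash:john": "12 Elm St"}


def build_linkages(observations, email_to_household):
    """Return (direct, inferred) maps of device_id -> household address."""
    direct = {}
    ip_households = defaultdict(set)

    # Direct linkage: the device presented an identifier we can resolve.
    for device, ip, email in observations:
        household = email_to_household.get(email)
        if household is not None:
            direct[device] = household
            ip_households[ip].add(household)

    # Inferred linkage: the device shares an IP that resolves to one household.
    inferred = {}
    for device, ip, _ in observations:
        if device in direct:
            continue
        households = ip_households.get(ip, set())
        if len(households) == 1:  # assumed rule: skip ambiguous, busy IPs
            inferred[device] = next(iter(households))
    return direct, inferred


direct, inferred = build_linkages(observations, email_to_household)
```

Here John's laptop ends up in the direct map and Meredith's phone in the inferred map, purely because of the shared, low-traffic IP address.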
By collecting all this data, we're then able to turn around and start to do device based identity resolution. So you can give us an email and we can tell you if it's historically been associated to a device. We can do that with an address as well. It may be that the device has been linked to several addresses, either through sign-ups or logins or through being seen on a variety of different IP addresses that have been resolved to households. In those cases, you can provide an address where a person says they live or where they work and we can tell you the closest observed distance that we've seen between that device and that location.
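As a rough sketch of the "closest observed distance" idea, assuming each prior sighting of a device has already been resolved to a latitude and longitude: the function names and the input format below are hypothetical, and the haversine formula is a standard great-circle approximation rather than anything the speaker specified.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def closest_observed_distance_km(claimed, sightings):
    """Smallest distance between a claimed (lat, lon) and any device sighting.

    Returns None when the device has never been resolved to a location.
    """
    if not sightings:
        return None
    return min(haversine_km(claimed[0], claimed[1], lat, lon)
               for lat, lon in sightings)
```

A caller would pass the geocoded address the applicant provides as `claimed` and the device's historical locations as `sightings`, then compare the result against whatever distance threshold the business considers corroborating.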
Similarly, there are other characteristics that we have available, like the [tenure] on [our] network, which is really great for understanding signs of life and other behavioral signals that are helpful for sifting fraudsters from non fraudsters. And of course we can do this through IP address. And because the device may have been seen on a variety of different IP addresses over a period of time, we can often link that device to multiple households transitively and see if any of those households actually corroborate where a person says that they live.
So this is pretty interesting, and because we have this advertising network, it's an alternative to a lot of the sources that we see powering device reputation. Fingerprinting is another mechanism for ensuring some sort of identity persistence tied to the device. Over time, though, fingerprinting has become more challenging. Many fingerprints decay as aspects of a device change.
So for example, if you're using like plug-ins in a browser or fonts or things like that to sort of ascertain a fixed signal as to what a particular device is, it's possible that those characteristics will change over time. Meaning that the fingerprints will no longer be exactly the same. In this graph, you can see that, even on day 10, a lot of fingerprints — maybe as many as the majority of them, like 60% of them — have actually changed from when they were first collected.
Likewise, increasingly fingerprints are being defeated by browsers like Safari that are making it harder to get really unique signals. And as such, their utility is on the decline because they're oftentimes [less] [...] [than] they used to be. So it's harder to persistently identify one person uniquely, which means that oftentimes cookies in the old school technology actually have greater viability than fingerprints.
But fingerprints aren't without their challenges. And this is all getting harder due to the sorts of privacy headwinds that I'm sure many people are aware of. Browser fingerprinting is going away, but tracking cookies are as well. Safari and Firefox have moved to block third party cookies. But thankfully, Google has basically said that Chrome is going to keep them around, and Chrome currently controls about 65% of global browser share. So it appears that to some extent fingerprints are going to remain in play and to some extent tracking cookies are going to remain in play.
You might think that with all of this getting more challenging at the browser level, we can instead rely on mobile apps. But we're also seeing that, at least in banking, mobile adoption is starting to level off and plateau. So in the United States, for example, in [2018,] [...] [there] wasn't really any significant increase in the adoption of customers using mobile apps, and certainly a significant proportion of them choose to continue to use the browser.
So in light of all these challenges, one might ask, what else can be done at this point to overcome these things, through doing identity resolution. Well, at least in our case, we think the answer is graph search. Basically no single identifier — fingerprint, cookie, mobile advertising ID, or IP — is going to provide universal coverage.
And so really you need an approach that blends them all together. And that means large scale data collection — of course CCPA and GDPR compliant — device graphs — you can see how devices are related — and then the ability to fan out and search all of these different linkages at once in order to overcome coverage and ultimately build precise models that do resolve devices to identities.
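To make the "fan out and search all of these linkages at once" idea concrete, here is a minimal breadth-first search over a toy identity graph. The node naming scheme (`cookie:`, `maid:`, `ip:`, `household:`), the sample edges, and the hop limit are all assumptions for illustration; a production device graph would also carry edge weights, timestamps, and source quality, none of which are modeled here.

```python
from collections import defaultdict, deque

# Hypothetical linkages between identifiers (undirected edges).
edges = [
    ("cookie:abc", "ip:203.0.113.7"),
    ("maid:xyz", "ip:203.0.113.7"),
    ("ip:203.0.113.7", "household:12 Elm St"),
    ("cookie:zzz", "ip:198.51.100.9"),
]


def build_graph(edges):
    """Build an adjacency map from a list of undirected edges."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def find_identities(graph, start, max_hops=3, prefix="household:"):
    """Breadth-first search from one identifier to reachable identities.

    Returns {identity_node: hops}, where hops is the shortest path length.
    """
    found = {}
    visited = {start}
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        if node.startswith(prefix):
            found[node] = hops
        if hops == max_hops:
            continue  # do not expand beyond the hop budget
        for neighbor in graph.get(node, ()):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, hops + 1))
    return found


graph = build_graph(edges)
```

In this toy graph, a mobile advertising ID with no direct identity record still reaches a household in two hops through the shared IP, while an identifier seen only on an unresolved IP returns nothing.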
There are some other approaches that are also viable in certain contexts. For example, there's a technology called header identification, where browsers and apps that are on a carrier network can actually be identified by the mobile network operators, the cellular carrier, such that a semi-anonymous browser session can actually be associated to a telephone number, which can then be used for similar sorts of device-based verification. So, phone to name, phone to address, indicators like whether or not the device is prepaid or whether or not it's been ported.
Geospatial Data Plays a Role
Geospatial data also has a significant role to play here. In our case, because we have one of the largest advertising networks in the world, we often see devices on numerous different IP addresses. And so we can overlay that with GPS data that's been collected from monetized mobile apps (imagine like, your weather app or something like that) to try to see if either the individual has themselves been in a location that corroborates where they claim to live, or maybe it's been an IP address that has collected data at a location where they've said they live. And so this is another set of signals that are oftentimes viable in cases where the device identity actually can't be resolved.
And finally, on this note, I'll say that oftentimes there are anomalous indicators that are present in cases where none of these other approaches work. So maybe there's no cookie, the fingerprint is not unique, you're not in a mobile app, you don't really know much about the device. And so it's really an ambiguous [...]. But nonetheless, because fraudsters are highly motivated to disguise their identity, there may be other indicators around as well around the anonymity of the IP address that actually give away the fact that something is wrong.
So in our case, we have indicators about the realness of an IP, which is basically a score as to whether or not the IP is likely to be associated to a real human organization, like a company or a family, or whether it's something more anomalous like a bot farm or server farm. Other characteristics of the IP are useful, like whether or not it's tied to a hosting facility, like a data center, or [co-location.] Clearly, that's not typical of a real world user. And other things, too, like whether or not the user's sitting behind a proxy like a VPN network or the Tor network or something like that.
In banking scenarios, we tend to see that when these flags are present, fraud rates are as high as 1 in 4 or 1 in 5. And then, in our case, we also see a lot of interesting browsing behavior tied to IP addresses. And so, for example, if we see IP addresses constantly going to online gambling websites or frequenting lots of different payments companies, those may be indicators that the person is not really typical of a low volume household but it's somebody who is maybe trying to launder money or do identity fraud at scale.
So to summarize, I'll try to put this together for you in terms of how these play out as good and bad scenarios. A safe scenario is probably one where the personal identity verifies in terms of offline resolution. There's maybe some linkage between the digital identity and the physical identity. There's a long tenure on our network, meaning that they had a cookie or some sort of relationship that dates back a while, so they're probably not trying to disguise their identity.
Maybe there's geolocation that corroborates where they say they live, within a short distance like a quarter or a half kilometer. And then the phone is also a good indicator. Like, if it's a more expensive phone and it's not prepaid, fraud rates just tend to be lower because very few people are turning to like an iPhone X to do something where anonymity is crucial and you may have to dispose of the device at some point.
A high risk scenario would be one where the PII elements don't verify, there's no ability to corroborate the identity either through actual device identity or [geo] [...]. Maybe there are IP anomalies present like a proxy or VPN network. Or maybe the phone itself is linked to the identity but it's been ported recently, so you no longer know that there's any certainty behind that relationship between the phone number and the physical identity. And then certainly cases where the phone is prepaid and cheaper oftentimes indicate that something is amiss and maybe an individual should be appended for manual review, since they have the kind of characteristics that fraudsters do when they go out of their way to disguise their identity.
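The safe and high risk scenarios above can be expressed as a simple rule-based triage. This is a hedged sketch only: the field names, the 30-day tenure cutoff, the 0.5 km distance cutoff, and the "two or more flags" rule are assumptions chosen for the example, not thresholds stated in the talk, and a real system would more likely feed these signals into a trained model.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Signals:
    pii_verified: bool                 # offline identity verification passed
    device_linked: bool                # device resolves to the claimed identity
    tenure_days: int                   # how long the device has been observed
    geo_distance_km: Optional[float]   # closest sighting to claimed address
    ip_proxy_or_vpn: bool
    phone_prepaid: bool
    phone_recently_ported: bool


def risk_flags(s: Signals) -> List[str]:
    """Collect the risk indicators that are present (thresholds are assumed)."""
    flags = []
    if not s.pii_verified:
        flags.append("pii_not_verified")
    geo_ok = s.geo_distance_km is not None and s.geo_distance_km <= 0.5
    if not s.device_linked and not geo_ok:
        flags.append("identity_not_corroborated")
    if s.tenure_days < 30:
        flags.append("short_tenure")
    if s.ip_proxy_or_vpn:
        flags.append("ip_anomaly")
    if s.phone_prepaid:
        flags.append("prepaid_phone")
    if s.phone_recently_ported:
        flags.append("recent_port")
    return flags


def triage(s: Signals, review_threshold: int = 2) -> str:
    """Route to manual review when enough independent flags are present."""
    return "manual_review" if len(risk_flags(s)) >= review_threshold else "pass"


safe = Signals(True, True, 400, 0.3, False, False, False)
risky = Signals(False, False, 2, None, True, True, True)
```

Counting flags keeps the example readable; it treats every indicator as equally important, which is a simplification.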
So we're running out of time, and I want to be cognizant of room for questions. So I'm going to conclude there. And Mike, I'll go back to you at this point if you want to bring up some questions from the audience.
Excellent, excellent. Thank you, Sam. And, yeah, we've got some good questions in here, and wanted to just encourage anyone that's got something on their mind to get that question in. Now would definitely be the time.
How do we reconcile new device purchases?
But let's start with this first question. It seems like a lot of what Neustar is doing is tying someone's identity to a physical person's device, a phone or a tablet. How do we reconcile new device purchases, when Apple customers get a new phone every year?
Yeah, that's a good question. So certainly the case of Apple is one where it's much harder to do identity resolution. And that's by design. The one good thing I'll say is, again, reiterate the point that fraudsters rarely use very expensive devices when they're going out and perpetrating fraud. And that's because they need to cover their identities and have the ability to [dispose] of something after the fact.
But just a device alone is not the only way to do this resolution. Again, IP address is really useful. If they're at home, there's a good chance they're also using other devices to connect, and that we've seen a low volume of activity tied to that particular IP address and other linkages that have come in.
If you're on a cable connection or something like that, your IP address tends to be relatively constant. It may rotate every month or two, but oftentimes they'll rotate between the same IP addresses. And so if your kids or your loved ones are also browsing the internet and using their emails or other identifiers to access content, there's a good chance that we've been able to resolve that particular IP to a home location. Likewise, if you're using mobile apps that rely on GPS, and those are effectively leaking your location out to the advertising ecosystem, there's a good chance that we can corroborate the location of that particular IP address even in the absence of any kind of device identity.
Finally, if you're on the go and you're linked up to, like, a carrier network, mobile header identification is in play. And typically regardless of the privacy controls that Apple has in place, we can still get back a telephone number associated with the browser or the mobile app, such that we can then look that up in the OneID repository, get back a bunch of physical identity information, and ultimately verify that identity. So certainly, despite these privacy headwinds, there's still a lot in play that we find useful.
Excellent, excellent. Thank you, Sam. All right, here, next question.
How can we monitor devices with device graphs?
Sam, you mentioned that your solution uses device graphs to monitor devices that belong to the same households. Can you go into a bit more detail about how that works? If you see my device with my co-worker, how do you know we aren't part of the same family?
Yes. So this is one of the most interesting aspects of how identity resolution [...] in the fraud ecosystem or for digital advertising. The key to linking devices together is really the co-occurrence on specific IP addresses. So if we see the same device identifiers repeatedly on an IP, we can soft link them together and say that these people are in the same vicinity. That's assuming that the IP is not, like, tied to a mobile gateway or something like that.
Now if the IP is a low volume IP, where we don't see a lot of traffic, then there's a good chance it corresponds to a household or a small office. Now, let's say that these devices are linked together at a small office, how would that differ from a home? Well, typically it's going to be different because of the hours in which the devices are active together. So an office typically is going to have 9:00 to 5:00, or it's going to have a lot less activity overnight. Whereas at home, you're probably going to be most active in the mornings and in the evenings and on the weekends.
So that's one way of distinguishing home IP from other types of networks. Obviously, looking at routing info, like who the IP block resolves to, is also useful for that. Is it Comcast or Cox or Verizon or so forth?
And then finally there's the co-occurrence of devices on different IPs over time. So if you're traveling with your loved ones, you may check into the same hotels, you may be in the same locations. And so building up this body of observations that tie these things together and then overlaying the temporal analysis that maps back to patterns of human life, these are really kind of the keys to the data science that goes into identity resolution for household based device graphs.
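The temporal part of this answer, telling a home IP from an office IP by when its devices are active, can be illustrated as follows. The weekday 9:00 to 5:00 window comes from the discussion above; the 0.7 cutoff, the function names, and the assumption that timestamps are already in the IP's local time are hypothetical.

```python
from datetime import datetime


def business_hours_share(timestamps):
    """Fraction of activity falling on weekdays between 9:00 and 17:00."""
    if not timestamps:
        return None
    in_business = sum(
        1 for t in timestamps if t.weekday() < 5 and 9 <= t.hour < 17
    )
    return in_business / len(timestamps)


def classify_ip(timestamps, office_threshold=0.7):
    """Label a low-volume IP as 'office' or 'home' from its activity hours."""
    share = business_hours_share(timestamps)
    if share is None:
        return "unknown"
    return "office" if share >= office_threshold else "home"


# 2024-01-01 is a Monday; 2024-01-06 is a Saturday.
office_activity = [
    datetime(2024, 1, 1, 10), datetime(2024, 1, 2, 14), datetime(2024, 1, 3, 11),
]
home_activity = [
    datetime(2024, 1, 1, 7), datetime(2024, 1, 1, 20), datetime(2024, 1, 6, 13),
]
```

In practice this signal would be combined with the routing information and cross-IP co-occurrence mentioned above rather than used on its own.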
How does this approach combat synthetic identities?
Thank you. All right. All right, we're getting really close to top of the hour here, but I think I can sneak in one more question. Does this approach do anything to combat synthetic identities?
Well, I'll touch on this really quickly, and then maybe Merritt has something to say about this. I think the thing that's great about our data is that it really is behavioral. We're looking at people's activities that they just do, normal lives on the internet.
And so that gets to the question of signs of life. Like, is there something about a particular identity that really is corroborated behaviorally in a way that wouldn't be true if the identity was completely virtual? As a quick reminder, synthetic identities typically are made up people that circulate in credit header data, and so therefore you wouldn't really expect much in the way of signs of life associated with them. But Merritt, you may have more of a perspective on this based on all the work that you do.
Yeah, sure. And for those who aren't familiar, synthetic identity is a concept of, as Sam indicated, an identity that contains a mix of legitimate and made up information. So it could be a legitimate Social Security number and other information that's erroneous. And again, this is a reflection of the fact that fraudsters have access to greater and greater data sets now.
So the ability to create these synthetic identities is much easier. And it also makes it harder potentially to detect because you're combining real and fake data into a single entity, which may put a real pressure on a system's ability to identify that. And you can even think of, even as individuals, you can end up in scenarios where you yourself essentially have a synthetic entity.
Take an example, where you move. So suddenly, some parts of your identity are the same, like your social security number, date of birth. But other components, like your home address, zip code, home phone number, have changed. That's a perfectly legitimate scenario that happens to citizens all the time.
The challenge there is that also can look like a synthetic identity. So now you have the challenge of combating that. In your efforts to stop that, you don't want to create scenarios where people who actually have moved are now blocked out from being able to use your systems because they're flagged now as data not looking legitimate. So this puts a real premium on making sure that the way you assess and evaluate these data is done in a way that you hopefully have high accuracy, high fidelity, and therefore can eliminate that kind of user friction that would occur against [value users] who somehow have managed to have parts of their identity attributes change over a period of time.
Excellent, excellent. Thank you, Merritt. I think that's a perfect note to wrap up on here. We're right just about at the top of the hour, and we always do try to stick to our schedule, knowing that so many folks are running off to something else at the top of the hour.
So as I suspected, we had so much good material to get through here today that we weren't able to get to everyone's questions. But as I said, we will definitely be following up with folks afterward that asked questions we didn't get to today. So thank you so much for asking those questions and being here with us today.
And also please keep an eye out. We will be sending you a follow up email in the next day or two that will have a link to a recording of this event, as well as a link to the slides themselves so you can review all this material at your leisure in the future. And we encourage you to do that.
And so let me just also thank Sam and Merritt both. Really great discussion, nice content, and really just, as I said, great, great material here to get through. So everyone, have a wonderful rest of the day. And I hope you'll join us again here very soon. Thanks so much.