The Data Layer for Fraud: A Better Mousetrap
Calling modern fraud prevention ‘a game of cat and mouse’ hardly captures the full scope of the problem. The mouse of today is waging a campaign of asymmetric information warfare, often turning our tools against us. Data breaches have rendered yesteryear’s mousetrap obsolete. Need a fake ID? Anyone can buy highly realistic documents using alternative currencies like … Amazon gift cards. IPs can be anonymized, mail intercepted, money laundered, and knowledge-based authentication (KBA) questions answered, all with the ease and convenience of online shopping.
As threats grow in sophistication and prevalence, organizations must evolve new strategies to cope. Fraud prevention operations span many levels: decision systems, predictive analytics, manual review processes and, most crucially, real-time information services.
Enter the data layer
A truly holistic view of potential customers allows organizations to cut through the anonymity of the internet and sort low-risk users from high-risk individuals with ease and without discernible friction for the end user. The key is information: enough to successfully corroborate a claimed identity while simultaneously detecting anomalous behaviors that indicate an attempt at deception.
To build a well-rounded data layer, organizations should focus their strategic investments on datasets that mirror who we are in the real world. One or two datasets are not enough. Each methodology has weaknesses or coverage gaps.
Leverage the following categories of intelligence
While simple relationships (between a name and an address, or a name and a phone number) are no longer enough to verify an identity, it is still true that most people live in a particular place, use a particular phone, and maintain a variety of email addresses. These relationships form a strong bedrock of identity in the real world. Of course, the identity may be stolen. That’s why this particular view of a customer is not enough on its own. Supplemental signals, like Synthetic ID risk, help round out this strategy.
It is now possible to verify that a given physical identity has been seen online using a particular device. This data is typically aggregated from merchants or other service providers who have transacted with a given individual on a particular device and can thus corroborate a pre-existing relationship between the two. You don’t have to treat all unfamiliar devices as anonymous, because their identities can be confirmed by other parties. That said, these solutions struggle with coverage. With browsers and operating systems moving towards greater privacy controls, many solution providers’ match rates languish below 50%. Furthermore, relationships can be spoofed. Going forward, you’ll have to leverage solutions that can characterize the quality of the device-to-identity relationship.
Additional qualitative signals include the tenure of a given identity on a device and the association of a device to other members of a family. Faking a complex, well-aged profile of an online life is too costly for most fraudsters, especially since even that may not be enough to get them around fraud controls.
IP Address Characteristics
IP addresses are dangerous for fraudsters. While they seem anonymous, they are often used by law enforcement to unmask identities. For this reason, criminals mask their true IPs with commercial VPNs and proxy networks like Tor. Less common but no less dangerous are IPs tied to known data centers or server colocations. After all, normal people don’t live on server racks.
The key is to leverage threat-detection solutions that map out these techniques at the IP level. Treat anomalies with extreme caution.
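As a rough illustration of IP-level mapping, the sketch below checks an address against lists of known anonymizer and data-center ranges. The two range lists are placeholders built from reserved documentation networks; a real deployment would load them from a threat-intelligence feed, and the function name and labels are assumptions made for this example.

```python
import ipaddress

# Placeholder ranges (reserved documentation networks), standing in for
# a real threat-intelligence feed of VPN/proxy exits and data centers.
ANONYMIZER_NETS = [ipaddress.ip_network("198.51.100.0/24")]
DATACENTER_NETS = [ipaddress.ip_network("192.0.2.0/24")]


def classify_ip(ip: str) -> str:
    """Return a coarse label for an IP based on the range lists above."""
    addr = ipaddress.ip_address(ip)
    if any(addr in net for net in ANONYMIZER_NETS):
        return "anonymizer"
    if any(addr in net for net in DATACENTER_NETS):
        return "datacenter"
    return "unclassified"


if __name__ == "__main__":
    for ip in ("198.51.100.7", "192.0.2.15", "203.0.113.9"):
        print(ip, classify_ip(ip))
```

An "unclassified" result only means the address is absent from the lists, not that it is safe.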
The GPS capabilities in your phone that allow you to search for nearby restaurants or look up directions also generate location data that is sold to data aggregators. That means this data is available to your fraud practice. Look to these geospatial observations to see if a customer’s device, IP, or IP history shows them in the vicinity of their purported home, or near a package’s shipping address.
Today, granular observational data can enable geospatial corroboration within a matter of meters. Combine that with a device’s IP history, and more than 50% of all customers can be corroborated within a very narrow radius of their homes.
This approach isn’t perfect. For example, fraudsters may target their neighbors for mail interception. That said, more often than not, you’ll find that legitimate customers’ devices orbit in the vicinity of their homes, while fraudsters’ devices are often hundreds of miles away.
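The proximity check described above can be sketched with a great-circle distance calculation. This is a minimal example, assuming device observations and the home address have already been converted to latitude/longitude pairs; the 25 km radius and the function names are illustrative choices, not recommended values.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def near_home(device_observations, home, radius_km=25.0):
    """True if any observed device location lies within radius_km of home.

    device_observations: iterable of (lat, lon) tuples
    home: (lat, lon) tuple for the claimed home or shipping address
    radius_km: assumed threshold, to be tuned on real data
    """
    return any(
        haversine_km(lat, lon, home[0], home[1]) <= radius_km
        for lat, lon in device_observations
    )
```

A device seen only hundreds of miles from the claimed address would return False here, which is a reason for closer review rather than proof of fraud.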
Given how many companies rely on text-based one-time passcodes (OTPs) for multi-factor authentication, it is essential to understand the current ownership of a telephone number. Fraudsters who successfully reassign a phone number or swap a victim’s SIM card will receive the OTP meant to protect consumers’ accounts. Account takeover tied to these tactics has led to some of the costliest fraud incidents in recent years. You can fight back by incorporating information about the current ownership status of a telephone number into your fraud engine.
Details about the phone can be very predictive of identity risk. “Burner” phones, pre-paid accounts, and numbers associated with VoIP and MVNO virtual lines are all high risk. They are more disposable than phones tied to traditional carrier contracts and allow for greater anonymity. It pays to monitor for indicators of these vectors.
Fraudsters gravitate toward cheap devices because they may need to dispose of them frequently. In general, the price of a phone used in a transaction is inversely correlated with the fraud rate (we rarely see much fraud on an iPhone X or other expensive models). Similarly, servers used in place of consumer devices are big red flags to watch out for. In many cases, device characteristics concerning make and model can be obtained within a web session, simply by parsing the “user-agent” that comes with each request from a customer’s browser.
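The phone signals discussed above (line type and recent ownership changes) could feed a fraud engine as simple flags. The sketch below assumes a phone-intelligence provider returns a record with `line_type` and `last_ownership_change` fields; those field names, the flag labels, and the seven-day window are all hypothetical.

```python
from datetime import datetime, timedelta

# Line types the article calls out as higher risk (labels are assumed).
HIGH_RISK_LINE_TYPES = {"voip", "prepaid", "mvno"}
# Assumed look-back window for a SIM swap or number reassignment.
RECENT_CHANGE_WINDOW = timedelta(days=7)


def phone_risk_flags(record: dict, now: datetime) -> list:
    """Return risk flags for a phone record from a hypothetical provider."""
    flags = []
    if str(record.get("line_type", "")).lower() in HIGH_RISK_LINE_TYPES:
        flags.append("high_risk_line_type")
    changed = record.get("last_ownership_change")
    if changed is not None and now - changed <= RECENT_CHANGE_WINDOW:
        flags.append("recent_ownership_change")
    return flags
```

A number flagged with a recent ownership change is a reasonable trigger for skipping SMS-based OTP in favor of another verification step.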
Given how much the user-agent can tell us about a device, fraudsters have taken to forging it to impersonate different devices. We can turn this problem to our advantage by looking at discrepancies between the impersonated user-agent field and the device’s underlying attributes: for example, installed system fonts, clock speed, or rendering algorithms. Many device-fingerprinting packages include these sorts of mechanisms for detecting bots or impersonation attacks.
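The discrepancy check can be sketched as a comparison between the platform a user-agent string claims and the platform inferred from lower-level attributes. In this example the `observed_platform` value is assumed to come from a device-fingerprinting package; the substring rules are a deliberately crude stand-in for a proper user-agent parser.

```python
def claimed_platform(user_agent: str) -> str:
    """Guess the platform a user-agent string claims to be."""
    ua = user_agent.lower()
    # Order matters: iOS strings contain "like Mac OS X" and
    # Android strings contain "Linux".
    if "iphone" in ua or "ipad" in ua:
        return "ios"
    if "android" in ua:
        return "android"
    if "windows" in ua:
        return "windows"
    if "mac os x" in ua:
        return "macos"
    if "linux" in ua:
        return "linux"
    return "unknown"


def user_agent_mismatch(user_agent: str, observed_platform: str) -> bool:
    """True if the claimed platform contradicts the observed one.

    observed_platform is assumed to be derived from fonts, rendering
    behavior, or similar attributes by a fingerprinting tool.
    """
    claimed = claimed_platform(user_agent)
    return claimed != "unknown" and claimed != observed_platform
```

For instance, a request that presents an iPhone user-agent while its underlying attributes look like a Linux server would be flagged as a mismatch.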
Offsite Behavioral Data
Fraudsters tend to behave differently online. While we pay bills, read the news, dabble in some online dating, and check email more often than is healthy, fraudsters might visit gambling websites to launder money. Likewise, people engaged in identity fraud are often involved in other criminal pursuits, like advertising fraud. For this reason, insights into a device’s offsite-visitation patterns can provide a significant lift to your fraud models.
Offsite behavioral signals tend to be highly abstracted. Most service providers with access to this kind of behavioral data are very motivated to maintain the online privacy of internet users. Typically, these signals are available as risk scores or anonymized vectors representative of browsing activity. That’s fine. We don’t need to know the exact sites our customers visit as long as we have a way of classifying the risk levels of the underlying activities.
Ignore the inscrutable name; header identification is a powerful tool for verifying consumer identities. Most mobile carriers offer commercial services that silently verify the association between a telephone number and a mobile device. The data is entirely deterministic, and the service is unobtrusive to the end user. It is a great signal when it is available. Sadly, this capability only works when devices are on a carrier network, not Wi-Fi. Coverage isn’t great.
During the last decade, merchants have banded together with data service providers to form anti-fraud consortiums. Fraud platforms act as intermediaries so that merchants and other organizations can anonymously share information about devices tied to fraud incidents. Consortium members’ feedback data is aggregated into risk scores. Even though sophisticated fraudsters readily spin up new virtual devices or hide their IPs, device reputation works often enough to warrant investment. The approach is also useful for detecting first-party fraud. Some consumers may consistently claim that packages—all ordered from the same device—are lost in the mail, driving up costs associated with chargebacks.
In recent years, behavioral biometrics have emerged as one of the most predictive indicators of fraudulent behavior. All of the data comes from your organization’s web properties or mobile apps. These techniques flag anomalous behavior by watching UI interactions from users in the run-up to a transaction. For example, a fraudster might target a list of individuals she finds in a data breach. She will copy-paste personally identifiable information into a web form, something legitimate customers rarely do with their own details.
Criminals try to commit fraud at scale. They cut corners for the sake of efficiency and repeatability. Their subtle but significant differences in online behavior often serve as the canary in the coal mine.
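The copy-paste example can be sketched as a scan over UI events captured from a form. The event shape (`type` and `field` keys) and the field names are assumptions for illustration; real behavioral-biometrics products expose richer signals such as typing cadence and pointer movement.

```python
# Hypothetical names of form fields that hold personal information.
PII_FIELDS = {"full_name", "ssn", "date_of_birth", "address"}


def pasted_pii_fields(events):
    """Return the sorted PII field names that received a paste event.

    events: iterable of dicts such as {"type": "paste", "field": "ssn"}
    """
    pasted = {
        e["field"]
        for e in events
        if e.get("type") == "paste" and e.get("field") in PII_FIELDS
    }
    return sorted(pasted)
```

A session that pastes into several identity fields in quick succession is a candidate for step-up verification, not an automatic decline.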
Uncover more fraud in the data layer
Fraudsters exploit a remarkable range of tools to target our businesses. If you can see the intruder ahead of time, you can shut the door in his face. Stay ahead of the curve. Invest in building out a healthy, multifaceted data layer. The more data attributes your fraud engine digests, the more differences it will uncover, and the more fraud it will stop. Armed with a complete view of customer identity, you can ensure that intruders have little place to hide.