The Clearing House - Corporate Changes in Payment Practices - September 2023

Synthetic Data Gives Firms Real Results in Fighting Fraud


Imagine being able to train your defense systems on cyberattacks that haven’t even happened yet.

Far from being farfetched, that reality is now fast becoming a necessity for today’s firms.

The future of cybersecurity is increasingly being shaped by generative adversarial networks (GANs), which generate synthetic data that is then used to train fraud detection models and other systems on entirely new types of attacks, fine-tuning them against criteria they have never encountered before.

In the face of a rising next generation of highly sophisticated fraud attacks and financial scams that deploy novel methods leveraging generative artificial intelligence (AI) and deepfake manipulations, GANs can be chalked up as a win for the good guys.

And it is a needed one. Today’s attack vectors are more complex, more dynamic and instantly scalable, which is why many organizations find themselves on the back foot when defending against modern cybercrime, particularly if they are stuck with legacy defenses or rely too heavily on manual, human-first processes.

So how do GANs work — and where can they be applied?

The short answer is nearly everywhere there is an existing vulnerability and limited historical data.

GANs Represent the Cutting Edge of a Strong Defense

GANs are deep learning neural networks that consist of two parts, a generator and a discriminator. These two parts work in competition with each other, and that’s where the “generative” and “adversarial” nomenclature of the technology comes from.

The generator creates synthetic data, and the discriminator tries to tell that synthetic data apart from real data. The two continually chase each other, improving in step over time until the discriminator can no longer reliably identify the synthetic data as fake. The result is that GANs allow organizations to create usable simulations of data sets that did not exist before.
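To make the generator-versus-discriminator loop concrete, here is a minimal sketch in plain Python. It is a toy under stated assumptions, not a production model: the "real" data is a hypothetical one-dimensional feature drawn from a normal distribution (`REAL_MEAN`, `REAL_STD`), the generator learns only a single offset (`shift`) added to random noise, and the discriminator is a one-feature logistic model (`w`, `c`). All names and values are illustrative.

```python
import math
import random

# Hypothetical "real" data: a single standardized feature (assumption).
REAL_MEAN, REAL_STD = 1.0, 1.0


def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def train_toy_gan(steps=2000, batch=64, lr=0.1, seed=0):
    """Alternate discriminator and generator updates on 1-D data."""
    rng = random.Random(seed)
    shift = -2.0      # generator parameter: G(z) = z + shift, z ~ N(0, 1)
    w, c = 0.0, 0.0   # discriminator parameters: D(x) = sigmoid(w * x + c)

    for _ in range(steps):
        real = [rng.gauss(REAL_MEAN, REAL_STD) for _ in range(batch)]
        fake = [rng.gauss(0.0, 1.0) + shift for _ in range(batch)]

        # Discriminator step: gradient ascent on
        # log D(real) + log(1 - D(fake)), i.e. get better at spotting fakes.
        d_real = [sigmoid(w * x + c) for x in real]
        d_fake = [sigmoid(w * x + c) for x in fake]
        grad_w = (sum((1 - d) * x for d, x in zip(d_real, real))
                  - sum(d * x for d, x in zip(d_fake, fake))) / batch
        grad_c = (sum(1 - d for d in d_real) - sum(d_fake)) / batch
        w += lr * grad_w
        c += lr * grad_c

        # Generator step: gradient ascent on log D(fake), i.e. move the
        # synthetic samples toward what the discriminator calls "real".
        # d/dshift log D(z + shift) = (1 - D) * w
        d_fake = [sigmoid(w * x + c) for x in fake]
        grad_shift = sum((1 - d) * w for d in d_fake) / batch
        shift += lr * grad_shift

    return shift, w, c
```

Calling `train_toy_gan()` returns the learned `shift` along with the discriminator's parameters. In this setup the offset should drift from its starting value toward `REAL_MEAN` as the two sides trade updates. Real systems use deep networks over many features, typically built with a library such as PyTorch or TensorFlow.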

The synthetic output from a GAN can be purpose-built to train anti-fraud and other key models, leaving them better equipped to handle new and forecasted scenarios that they otherwise could not have been trained on.

After all, the more data a model has at its disposal, and the better structured that data is, the better it will be at doing its job.

GANs can also be used to help financial institutions and lenders train credit scoring models on a much broader set of data than would typically be available, helping lead to more accurate credit decisions; boost anti-money laundering (AML) detection programs; and help businesses assess risk in areas where they have limited historical data.

“In the payments security space, just as in the consumer space, you’re seeing a massive investment in AI. It’s a buzzword — but it has to be, as bad actors evolve in the ways they are attacking businesses,” Jeff Hallenbeck, head of financial partnerships at Forter, told PYMNTS.

GANs Gaining Ground

Synthetic data generated by GANs helps improve a model’s ability to recognize fraudulent patterns by providing it with more examples of both legitimate and fraudulent transactions.
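As a rough illustration of what "more examples" can mean in practice, the sketch below tops up a scarce fraud class with synthetic rows until fraud makes up a chosen share of the training set. Everything here is an assumption for illustration: the function `augment_with_synthetic`, the 30% target share, the single-feature rows, and `sample_synthetic_fraud`, which merely stands in for a trained GAN generator. Each row carries a flag marking it as synthetic so that a model can still be evaluated on real data only.

```python
import math
import random


def augment_with_synthetic(legit, fraud, make_synthetic, target_fraud_share=0.3):
    """Label real rows and top up the fraud class with synthetic rows.

    Each output row is (features, label, is_synthetic); label 1 means fraud.
    """
    if not 0.0 < target_fraud_share < 1.0:
        raise ValueError("target_fraud_share must be between 0 and 1")
    # Fraud count F needed so that F / (F + len(legit)) >= target_fraud_share.
    needed = math.ceil(target_fraud_share * len(legit) / (1.0 - target_fraud_share))
    n_synthetic = max(0, needed - len(fraud))

    rows = [(x, 0, False) for x in legit]
    rows += [(x, 1, False) for x in fraud]
    rows += [(make_synthetic(), 1, True) for _ in range(n_synthetic)]
    return rows


# Illustrative data: one hypothetical feature (a standardized amount) per row.
rng = random.Random(42)
legit = [rng.gauss(0.0, 1.0) for _ in range(950)]
fraud = [rng.gauss(1.0, 1.0) for _ in range(50)]


def sample_synthetic_fraud():
    # Stand-in for a trained GAN generator; here it just samples a Gaussian.
    return rng.gauss(1.0, 1.0)


training_rows = augment_with_synthetic(legit, fraud, sample_synthetic_fraud)
# Keep only real rows when measuring how well a model performs.
real_only_rows = [row for row in training_rows if not row[2]]
fraud_share = sum(label for _, label, _ in training_rows) / len(training_rows)
```

Keeping the `is_synthetic` flag on every row is a deliberate choice: synthetic rows can widen what a model sees during training, while accuracy is still measured against real transactions.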

GANs can also be used to generate fraud patterns by training the generator to create synthetic fraud scenarios. This can be helpful in understanding how fraudsters operate and devising countermeasures to prevent such attacks.

This is increasingly important, because PYMNTS Intelligence shows that fraud at financial institutions is on the rise.

Tobias Schweiger, CEO and co-founder of Hawk AI, told PYMNTS last month that, “the application of [AI] isn’t just reserved for the good guys … and bad actors are accelerating what I would call an arms race, using all of those technologies. As a financial institution, one has to be aware of that accelerated trend and make sure your organization has enough technology on the good side of the equation to fight back.”

Another application of GANs is to generate synthetic credit profiles to help train credit scoring models on a broader set of data than just a financial institution’s own traditionally accessible information. This can help lenders appeal to more borrower segments, including individuals and businesses with thin credit files, while at the same time reducing lending risk and helping inform more personalized credit solutions.

Still, using synthetic data generated by GANs must be done carefully to avoid privacy and ethical concerns, especially when handling sensitive payment information.

And while GANs can be a valuable tool in the fight against payment fraud, they should be part of a broader fraud detection strategy that includes other machine learning and traditional methods to ensure comprehensive protection against evolving fraud techniques.