FICO: Industry Needs More Data To Stop Cybercrime

As lawmakers consider measures for fighting payment data breaches, they need to consult with the people who are already in the fraud-fighting loop — and make sure they get the data they need, according to FICO VP of product management Doug Clare in a post on the credit-scoring company’s blog.

“It’s a positive sign that lawmakers recognize this need. Information sharing has become one of the fundamental tenets of cybersecurity being discussed by regulators on Capitol Hill,” Clare wrote. “But the data we need is more than what is being proposed in some regulatory plans.”

That means using large quantities of essentially raw transaction data — stripped of personally identifiable information, but still including times, frequencies, and at least some metadata about traffic flows — so antifraud researchers can recognize patterns and establish baseline behaviors.

Those patterns and baselines, in turn, can be used to build models of normal, non-fraudulent behavior, so that suspicious activity stands out. “We don’t need all of the raw data to do our work, but if the data is too processed the models built on it will suffer,” Clare wrote.
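To make the baseline idea concrete, here is a minimal sketch in Python. It is not FICO's method, which the post does not describe in detail; it only shows the general technique of summarizing normal behavior and flagging values that fall far outside it. The function names, the made-up transaction amounts and the three-standard-deviation threshold are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def build_baseline(amounts):
    """Summarize normal behavior as (mean, sample standard deviation)."""
    if len(amounts) < 2:
        raise ValueError("need at least two observations to estimate spread")
    return mean(amounts), stdev(amounts)


def is_suspicious(amount, baseline, threshold=3.0):
    """Flag an amount that lies more than `threshold` deviations from the mean."""
    mu, sigma = baseline
    if sigma == 0:
        # No variation in the history: anything different from it stands out.
        return amount != mu
    return abs(amount - mu) / sigma > threshold


# Hypothetical, de-identified purchase amounts for one account (illustrative only).
history = [20.0, 25.0, 22.0, 19.0, 24.0, 21.0, 23.0, 20.0]
baseline = build_baseline(history)

# Check one ordinary-looking amount and one that is far outside the baseline.
flags = {amount: is_suspicious(amount, baseline) for amount in (22.5, 400.0)}
print(flags)
```

A sketch like this also hints at why over-processed data hurts: if the amounts had already been rounded into a few broad buckets, the estimated spread would carry much less information and unusual activity would be harder to separate from normal activity.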

The good news is that Congress is hearing that message from FICO and others. At a March 3 House Commerce subcommittee hearing on cyberthreats, Greg Shannon, chief scientist for CERT, Carnegie Mellon University’s security program, told lawmakers that “researchers and developers need access to everyday data so that they can begin to recognize what datasets are important. If the research community were able to successfully determine which features in datasets were essential to combating the cyberthreat, then in effect, over time less data would need to be shared to productively handle cyber risks.”

One way to accomplish that would be with a consortium approach, like the one FICO uses for its own fraud detection models, using data from 9,000 banks worldwide, according to FICO’s Clare. “Contributions to the consortium will allow patterns to be determined based on current cyber attack attempts, so that we can quantify, qualify and rank threats in terms of impact,” he wrote. “This creates a continuous learning loop — actionable insights go to our clients via the updated analytics, and more data flows back to the models via the consortium. This provides a broader and more detailed view than any one installation could obtain on its own.”

As much of an improvement as that would be, it is a serious question whether Congress would approve that kind of data gathering in the political environment that has followed the NSA leaks. At the moment, legislators are still trying to agree on a 30-day requirement for reporting data breaches.