
NIST Says Defending AI Systems From Cyberattacks ‘Hasn’t Been Solved’


New technologies bring with them new opportunities — and new threats.

As we enter the new year, many eyes are on the transformative potential of artificial intelligence (AI).

Some of those eyes may belong to bad actors looking to take advantage of the attack vectors endemic to new generative AI systems. 

The concern is timely: the U.S. Department of Commerce’s National Institute of Standards and Technology (NIST) on Thursday (Jan. 4) released a “Trustworthy and Responsible AI” report identifying four types of cyberattacks that can manipulate the behavior of AI systems, as well as key mitigation strategies and their limitations.

The government agency has been charged with developing domestic guidelines for the evaluation of AI models and red-teaming; facilitating the development of consensus-based standards; and providing testing environments for the evaluation of AI systems, among other duties. NIST has found that AI systems are increasingly vulnerable to attacks from bad actors that can evade security and even prompt data leaks.

“Despite the significant progress AI and machine learning have made, these technologies are vulnerable to attacks that can cause spectacular failures with dire consequences,” NIST computer scientist Apostol Vassilev, one of the publication’s authors, said. “There are theoretical problems with securing AI algorithms that simply haven’t been solved yet. If anyone says differently, they are selling snake oil.” 

“We are providing an overview of attack techniques and methodologies that consider all types of AI systems,” Vassilev added. “But these available defenses currently lack robust assurances that they fully mitigate the risks. We are encouraging the community to come up with better defenses.”  

Read also: How Year 1 of AI Impacted the Financial Fraud Landscape

The Threats of Adversarial Machine Learning

The NIST report sorts potential adversarial machine learning attackers into three categories based on how much they know about the target system: white-box hackers, black-box hackers and gray-box hackers.

White-box hackers have full knowledge of an AI system, black-box hackers have minimal access, and gray-box hackers are somewhat informed about an AI system but lack access to its training data.

All three types of bad actors can do serious damage, NIST warned.

“Fraud is growing, and the recipes are getting slicker,” Gerhard Oosthuizen, chief technology officer at Entersekt, told PYMNTS. “At this stage, the technology has led to more challenges in the fraud space than potential wins.”

The challenge — and risks — are only growing as AI continues to permeate more elements of our connected economy.

See also: How AI Firms Plan to Build, Then Control, Superhuman Intelligence

The NIST report found that AI systems can malfunction when exposed to untrustworthy data, and that attackers are increasingly exploiting this problem through both “poisoning” and “abuse” attacks.

AI system poisoning involves introducing corrupted data into an AI system during its training phase. NIST gives as an example a bad actor slipping numerous instances of inappropriate language into conversation records, so that a chatbot interprets these instances as common enough parlance to use in its own customer interactions.
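To make the mechanism concrete, here is a minimal toy sketch (not from the NIST report) of how a frequency-based training rule can be poisoned; the function `learn_vocabulary`, its threshold, and the nonsense word "zorgle" are all invented for illustration:

```python
from collections import Counter

def learn_vocabulary(training_records, min_count=3):
    # Toy training rule: any word seen at least `min_count` times across
    # the conversation records is treated as common-enough parlance for
    # the chatbot to reuse in its own replies.
    counts = Counter(word for record in training_records for word in record.split())
    return {word for word, n in counts.items() if n >= min_count}

clean_records = [
    "hello how can i help you",
    "hello what can i do for you",
    "hello happy to help you",
]

# Poisoning: the attacker slips the same inappropriate phrase into
# just a handful of training records.
poisoned_records = clean_records + ["zorgle you later"] * 3

clean_vocab = learn_vocabulary(clean_records)
poisoned_vocab = learn_vocabulary(poisoned_records)
# "zorgle" now clears the frequency threshold and enters the bot's vocabulary.
```

A real chatbot's training pipeline is far more complex, but the failure mode is the same: a small number of repeated malicious samples can shift what the model treats as normal.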

AI system abuse attacks, per the report, involve the insertion of incorrect information into a source, such as a webpage or online document, that an AI then absorbs. Unlike poisoning attacks, abuse attacks feed the AI incorrect pieces of information through a legitimate but compromised source, subverting the system’s intended use.
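A hedged toy sketch of the idea, assuming a simplistic retrieval-style assistant that quotes whatever its source page says; the page name, phone numbers and matching logic are invented for the example:

```python
# Toy "retrieval-augmented" assistant: it trusts and repeats the text of
# whatever legitimate source page matches the question.
web_pages = {"vendor-faq": "Support phone: 1-800-555-0100"}

def assistant_answer(query, pages):
    # Minimal retrieval step: find a page mentioning the query topic
    # and quote it verbatim, with no verification of its contents.
    for text in pages.values():
        if "phone" in query.lower() and "phone" in text.lower():
            return text
    return "No source found."

honest = assistant_answer("What is the support phone number?", web_pages)

# Abuse attack: the attacker edits the legitimate page the AI ingests,
# so the model repeats the attacker's number with the source's authority.
web_pages["vendor-faq"] = "Support phone: 1-900-555-0199"
hijacked = assistant_answer("What is the support phone number?", web_pages)
```

The point of the sketch: the model itself is untouched; compromising the trusted source it reads from is enough to redirect its output.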

“Most of these attacks are fairly easy to mount and require minimum knowledge of the AI system and limited adversarial capabilities,” said NIST report co-author Alina Oprea, a professor at Northeastern University. “Poisoning attacks, for example, can be mounted by controlling a few dozen training samples, which would be a very small percentage of the entire training set.” 

One of the main problems with AI defense is that it is notoriously difficult to make an AI model unlearn a taught behavior — even if that behavior is malicious or damaging.

The other two adversarial machine learning attacks to be aware of are privacy attacks and evasion attacks.

Privacy attacks, as the name implies, are attempts to learn sensitive information about the AI or the data it was trained on in order to misuse it. Per the NIST report, bad actors can ask a chatbot numerous legitimate questions, and then use the answers to reverse engineer the model so as to find its weak spots — or guess at its sources.
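A minimal sketch of the reverse-engineering idea, assuming a toy model with a single hidden decision threshold; the names and the 0.37 cutoff are invented for illustration:

```python
def secret_model(x):
    # Deployed model with a hidden decision threshold (0.37 here)
    # that the attacker wants to learn.
    return "approve" if x >= 0.37 else "deny"

def extract_threshold(model, lo=0.0, hi=1.0, queries=30):
    # Binary search using only the model's public answers: each
    # "legitimate" query narrows down where the hidden cutoff sits.
    for _ in range(queries):
        mid = (lo + hi) / 2
        if model(mid) == "approve":
            hi = mid
        else:
            lo = mid
    return hi

estimated = extract_threshold(secret_model)
# After 30 queries the estimate is within about a billionth of the secret.
```

Real models have vastly more parameters than one threshold, but the principle scales: enough benign-looking queries can map out a model's decision boundaries.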

Evasion attacks occur after an AI system has already been deployed and attempt to change how the system responds and reacts to ordinary inputs. As highlighted by NIST, this can include things like adding markings to stop signs to make an autonomous vehicle misinterpret them as speed limit signs, or creating confusing lane markings to make the vehicle veer off the road.
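NIST's stop-sign example can be sketched with a deliberately crude toy classifier; the red-fraction rule and two-pixel "sticker" below are invented for illustration, not how real vision models work:

```python
def classify_sign(pixels):
    # Toy sign classifier: an image that is mostly red reads as a stop sign.
    red = sum(1 for p in pixels if p == "red")
    return "stop" if red / len(pixels) > 0.5 else "speed-limit"

sign = ["red"] * 6 + ["white"] * 4   # a clean stop sign: 60% red pixels

# Evasion attack at inference time: repaint just two pixels, analogous
# to slapping small stickers on a physical sign. The model is unchanged;
# only the input is perturbed, yet the label flips.
tampered = list(sign)
tampered[0] = tampered[1] = "white"
```

Real adversarial examples against image models use far subtler perturbations, but the structure is the same: a small, targeted change to the input pushes it across the model's decision boundary.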

While no foolproof method exists yet for protecting AI, following basic cyber hygiene practices can help avoid potential abuse. After all, keeping a tidy house never goes out of style.
