AI systems are susceptible to malicious actors introducing them to corrupt data, a tactic referred to as ‘poisoning attacks,’ as outlined by one of the authors of a recent study conducted by the U.S. government.
The analysis by the National Institute of Standards and Technology delved into the cybersecurity risks posed to AI systems, particularly amidst mounting apprehensions regarding the security and dependability of generative AI in light of the upcoming 2024 election cycle.
Alina Oprea, a professor at Northeastern University and co-author of the study, emphasized that these attacks are relatively straightforward to execute, necessitating only minimal familiarity with the AI system and limited adversarial capabilities. For instance, poisoning attacks can be initiated by manipulating a small number of training samples, constituting a tiny fraction of the overall training dataset.
By corrupting AI systems utilized in news aggregation or social media platforms, adversaries could effectively disseminate misinformation or propaganda, as highlighted by Eyal Benishti, the CEO of cybersecurity firm Ironscales, in an interview unrelated to the report.
Furthermore, adversaries could compromise AI systems to generate unreliable or detrimental outcomes, eroding trust in these systems. This could have severe repercussions in critical domains such as finance, healthcare, or government services.
AI systems, ranging from autonomous vehicles to medical diagnostics and chatbots, are increasingly integrated into our daily routines. These systems acquire their functionalities through the analysis of extensive datasets. For instance, self-driving cars are trained using images of roads and traffic signals, while chatbots leverage vast datasets of online dialogues to formulate appropriate responses in diverse scenarios.
Nonetheless, the integrity of the data employed to train these AI systems is a significant point of concern. Often sourced from websites and user interactions, this data is vulnerable to manipulation by malicious entities.
This risk persists during the initial training phase of AI as well as during its ongoing adaptation and learning from real-world interactions. Such interference can lead to undesired AI behaviors; for example, inundating chatbots with malicious data may increase the likelihood of generating erroneous responses.
Even minor undetectable inaccuracies deliberately introduced during the model’s training can disrupt calculations or projections, cautioned Arti Raman, the CEO of Portal26. Intentionally training the model inadequately for minor tasks could significantly alter outcomes. Large datasets containing intentionally erroneous information could cause even more harm, albeit being relatively easier to identify.
This underscores the potential chaos that malicious individuals or state actors could incite by injecting false information and data into the AI programs of foreign entities or political adversaries, potentially resulting in catastrophic or even fatal consequences across defense systems, response mechanisms, communications, workflows, supply chains, finance, and other critical areas.
Safeguarding AI Against Corrupt Data
Mitigating data poisoning attacks poses a formidable challenge, experts contend. Many defense strategies against these attacks rely on extensive language models that serve as pre-filters for prompts, observed Jason Keirstead, the vice president of Collective Threat Defense at Cyware.
He further stressed the importance of verifying the authenticity of AI outputs by referring back to the original sources before proceeding with publication or assessment if concerns arise regarding their validity. The quality of the training data is paramount, yet solving this issue is complex due to the vast amount of information required to train these models effectively.
Nicole Carignan, the vice president of Strategic Cyber AI at Darktrace, emphasized the necessity of embedding AI security measures throughout every stage of an AI system’s development and implementation. Organizations should establish red teaming plans to evaluate models, access points, APIs, and vulnerabilities in training data.
Additional considerations encompass enhancing data storage security, enforcing data privacy controls, implementing access controls for data and models, defining security policies for AI interactions, deploying technology to detect and respond to policy breaches, and devising strategies for continuous Testing, Evaluation, Verification, and Validation (TEV and V).
“Comprehending the evolving threat landscape and the tactics adversaries employ to manipulate AI is crucial for defenders to assess these scenarios against their own models effectively and secure their AI systems,” Carignan added.