The U.S. government has unveiled new security guidelines aimed at bolstering critical infrastructure against artificial intelligence (AI)-related threats.
“These guidelines are informed by the whole-of-government effort to assess AI risks across all sixteen critical infrastructure sectors, and address threats both to and from, and involving AI systems,” the Department of Homeland Security (DHS) said Monday.
In addition, the agency said it’s working to facilitate safe, responsible, and trustworthy use of the technology in a manner that does not infringe on individuals’ privacy, civil rights, and civil liberties.
The new guidance concerns the use of AI to augment and scale attacks on critical infrastructure, adversarial manipulation of AI systems, and shortcomings in such tools that could result in unintended consequences, necessitating the need for transparency and secure by design practices to evaluate and mitigate AI risks.
Specifically, this spans four different functions such as govern, map, measure, and manage all through the AI lifecycle –
- Establish an organizational culture of AI risk management
- Understand your individual AI use context and risk profile
- Develop systems to assess, analyze, and track AI risks
- Prioritize and act upon AI risks to safety and security
“Critical infrastructure owners and operators should account for their own sector-specific and context-specific use of AI when assessing AI risks and selecting appropriate mitigations,” the agency said.
“Critical infrastructure owners and operators should understand where these dependencies on AI vendors exist and work to share and delineate mitigation responsibilities accordingly.”
The development arrives weeks after the Five Eyes (FVEY) intelligence alliance comprising Australia, Canada, New Zealand, the U.K., and the U.S. released a cybersecurity information sheet noting the careful setup and configuration required for deploying AI systems.
“The rapid adoption, deployment, and use of AI capabilities can make them highly valuable targets for malicious cyber actors,” the governments said.
“Actors, who have historically used data theft of sensitive information and intellectual property to advance their interests, may seek to co-opt deployed AI systems and apply them to malicious ends.”
The recommended best practices include taking steps to secure the deployment environment, review the source of AI models and supply chain security, ensure a robust deployment environment architecture, harden deployment environment configurations, validate the AI system to ensure its integrity, protect model weights, enforce strict access controls, conduct external audits, and implement robust logging.
Earlier this month, the CERT Coordination Center (CERT/CC) detailed a shortcoming in the Keras 2 neural network library that could be exploited by an attacker to trojanize a popular AI model and redistribute it, effectively poisoning the supply chain of dependent applications.
Recent research has found AI systems to be vulnerable to a wide range of prompt injection attacks that induce the AI model to circumvent safety mechanisms and produce harmful outputs.
“Prompt injection attacks through poisoned content are a major security risk because an attacker who does this can potentially issue commands to the AI system as if they were the user,” Microsoft noted in a recent report.
One such technique, dubbed Crescendo, has been described as a multiturn large language model (LLM) jailbreak, which, like Anthropic’s many-shot jailbreaking, tricks the model into generating malicious content by “asking carefully crafted questions or prompts that gradually lead the LLM to a desired outcome, rather than asking for the goal all at once.”
LLM jailbreak prompts have become popular among cybercriminals looking to craft effective phishing lures, even as nation-state actors have begun weaponizing generative AI to orchestrate espionage and influence operations.
Even more concerningly, studies from the University of Illinois Urbana-Champaign has discovered that LLM agents can be put to use to autonomously exploit one-day vulnerabilities in real-world systems simply using their CVE descriptions and “hack websites, performing tasks as complex as blind database schema extraction and SQL injections without human feedback.”