AI Systems as Attack Surfaces
AI systems deployed in Canadian government and critical infrastructure are targets for adversarial attacks — prompt injection, data poisoning, model tampering, supply chain compromise — that can manipulate their behaviour and compromise the decisions they support.
AI systems deployed in Canadian government, critical infrastructure, and commercial services are themselves targets for adversarial attacks. Unlike cyberattacks that use AI as a tool, this hazard concerns attacks directed at AI systems to manipulate their behaviour, extract sensitive information, or cause them to produce harmful outputs.
The attack surface of AI systems includes several distinct vectors, each affecting different system architectures:
Prompt injection is the most immediate and widely demonstrated threat to LLM-based systems and AI agents. Attackers embed malicious instructions in content that AI systems process — hidden text in websites, documents, or databases — causing the AI to act against the user's intentions. AI agents that browse the web, process emails, or access external databases are especially vulnerable because they encounter attacker-controlled content as a normal part of their operation. This vector is most relevant to the growing adoption of LLM-based tools across Canadian government for document processing, citizen services, and internal workflows. NIST has begun evaluating agent-hijacking risks through prompt injection.
Data poisoning involves corrupting the data that AI systems rely on, and threatens any machine learning system — including traditional classifiers, scoring models, and LLM-based systems alike. Poisoning can occur during initial training or during retrieval-augmented generation (RAG), where systems consult external databases to inform their responses. Poisoned data can introduce systematic biases, factual errors, or hidden behaviours that are difficult to detect and may affect all downstream users. Existing government AI systems such as IRCC's immigration triage and CBSA's border risk scoring are susceptible to this vector regardless of their underlying architecture.
Model tampering — interfering with an AI system during development to alter its deployed behaviour — represents a more sophisticated threat applicable to any machine learning model. Researchers have demonstrated that AI systems can be trained to harbour hidden objectives or "backdoors" — triggers that cause specific behaviours under certain conditions. The feasibility of tampering in real-world deployments has not been established at scale, but the theoretical risk is that a small group could gain covert influence over the behaviour of widely deployed AI models.
Supply chain compromise involves manipulating AI components — model weights, training data, software libraries, or hardware — before deployment. Given the concentration of AI development among a small number of providers and the complexity of AI supply chains, a single compromised component could affect many downstream systems. This risk applies to any Canadian deployment that relies on third-party AI models or components, which includes most government AI systems.
These threats are particularly significant in Canada because AI systems are already deployed in consequential government functions. IRCC uses AI for immigration application triage; CBSA uses AI for border risk scoring. These existing deployments are susceptible to data poisoning and supply chain compromise regardless of their architecture. As federal departments increasingly adopt LLM-based tools and AI agents, prompt injection becomes an additional and growing attack vector. The TBS Directive on Automated Decision-Making governs AI use across federal departments but does not require adversarial security testing. If any of these systems were compromised, the consequences could affect the rights and entitlements of large numbers of Canadians.
As AI systems take on more autonomous roles — processing sensitive data, making or recommending decisions, and interacting with other systems — the consequences of successful attacks grow. An AI agent compromised through prompt injection that is embedded in an organization's cyber defences could leave that organization vulnerable to further attacks. An AI system used for healthcare triage that has been subject to data poisoning could systematically misclassify patient risk levels.
Harms
Prompt injection attacks can hijack LLM-based systems and AI agents by embedding malicious instructions in external content. Agents that browse the web, process documents, or read emails can be redirected to exfiltrate data or take unauthorized actions without the user's knowledge.
Data poisoning and model manipulation attacks can corrupt AI systems during training or fine-tuning, causing models to produce biased or harmful outputs in targeted contexts while appearing to function normally otherwise.
Evidence
6 reports
- International AI Safety Report 2026 — Box 2.1: AI Systems as Targets, Box 2.4: Deliberate Attacks Primary source
Comprehensive evidence review of attacks on AI systems including prompt injection, data poisoning, model tampering, and supply chain compromise. Primary source for framing this hazard.
-
Foundational research demonstrating indirect prompt injection attacks against LLM-integrated applications, showing how malicious instructions in external content can hijack AI agents.
-
Research demonstrating that poisoning large-scale training datasets used by AI models is practically feasible, not just a theoretical concern.
-
Demonstration that AI models can be trained to harbour hidden behaviours (backdoors) that persist through standard safety training, showing feasibility of model tampering.
-
NIST risk management framework including evaluation of agent-hijacking risks, prompt injection, and other adversarial threats to AI systems.
-
Canadian cyber threat landscape assessment covering emerging AI-related threats including AI supply chain risks and adversarial attacks.
Record details
Policy Recommendationsassessed
Mandatory adversarial security evaluation of AI systems before deployment in government decision-making, covering prompt injection, data poisoning, and supply chain integrity
International AI Safety Report 2026 (Jun 1, 2026)Establish AI supply chain integrity standards for government procurement, requiring provenance verification for model weights, training data, and software dependencies
International AI Safety Report 2026 (Jun 1, 2026)Require ongoing monitoring for adversarial attacks on deployed AI systems in critical infrastructure and government services, with mandatory incident reporting
International AI Safety Report 2026 (Jun 1, 2026)Develop and adopt standards for AI agent communication protocols that include security properties (authentication, authorization, integrity) to prevent agent hijacking
International AI Safety Report 2026 (Jun 1, 2026)Editorial Assessment assessed
Canadian government agencies already use AI for immigration triage and border risk scoring — decisions that directly affect people's rights and entitlements. These systems, and the growing number of AI agents being deployed across government and critical infrastructure, are vulnerable to adversarial attacks that current security practices do not adequately address. A compromised AI system in government could systematically misdirect decisions affecting thousands of Canadians. No comprehensive AI adversarial security standard governs Canadian government AI deployments.
Entities Involved
AI Systems Involved
AI risk scoring system at Canadian borders; deployed in security-sensitive context and a potential target for adversarial attacks.
AI system used for immigration application triage; deployed in consequential government decision-making and a potential target for adversarial manipulation.
Related Records
- AI-Enhanced Cyberattacks Against Canadian Critical Infrastructurerelated
- IRCC Machine-Learning Triage Sorts Millions of Visa Applications Using Models Trained on Historical Decisionsrelated
- CBSA Machine Learning System Scores All Border Entrants with No Independent Auditrelated
- AI in Canadian Government Automated Decision-Makingrelated
Taxonomyassessed
Changelog
| Version | Date | Change |
|---|---|---|
| v1 | Mar 12, 2026 | Initial publication. Hazard identified through gap analysis against IASR 2026 Chapter 2 — attacks ON AI systems, distinct from existing hazard ai-enabled-cyberattacks-critical-infrastructure which covers attacks USING AI. |
| v2 | Mar 12, 2026 | Revised for precision: distinguished which attack vectors (prompt injection, data poisoning, model tampering, supply chain) apply to which system architectures (LLM-based vs traditional ML). Downgraded confidence from high to medium reflecting limited direct evidence of attacks on Canadian government AI systems. Completed FR narrative (added missing final paragraphs). Fixed Carlini et al. arXiv reference. Added CSE entity linkage and concentration_of_power systemic risk factor. |