AI Systems as Attack Surfaces — Canadian AI Incident Monitor

Escalating Severe Confidence: medium

AI systems deployed in Canadian government and critical infrastructure are targets for adversarial attacks — prompt injection, data poisoning, model tampering, supply chain compromise — that can manipulate their behaviour and compromise the decisions they support.

Identified: January 1, 2023 Last assessed: March 12, 2026

AI systems deployed in Canadian government, critical infrastructure, and commercial services are themselves targets for adversarial attacks. Unlike cyberattacks that use AI as a tool, this hazard concerns attacks directed at AI systems to manipulate their behaviour, extract sensitive information, or cause them to produce harmful outputs.

The attack surface of AI systems includes several distinct vectors, each affecting different system architectures:

Prompt injection is the most immediate and widely demonstrated threat to LLM-based systems and AI agents. Attackers embed malicious instructions in content that AI systems process — hidden text in websites, documents, or databases — causing the AI to act against the user's intentions. AI agents that browse the web, process emails, or access external databases are especially vulnerable because they encounter attacker-controlled content as a normal part of their operation. This vector is most relevant to the growing adoption of LLM-based tools across Canadian government for document processing, citizen services, and internal workflows. NIST has begun evaluating agent-hijacking risks through prompt injection.

Data poisoning involves corrupting the data that AI systems rely on, and threatens any machine learning system — including traditional classifiers, scoring models, and LLM-based systems alike. Poisoning can occur during initial training or during retrieval-augmented generation (RAG), where systems consult external databases to inform their responses. Poisoned data can introduce systematic biases, factual errors, or hidden behaviours that are difficult to detect and may affect all downstream users. Existing government AI systems such as IRCC's immigration triage and CBSA's border risk scoring are susceptible to this vector regardless of their underlying architecture.

Model tampering — interfering with an AI system during development to alter its deployed behaviour — represents a more sophisticated threat applicable to any machine learning model. Researchers have demonstrated that AI systems can be trained to harbour hidden objectives or "backdoors" — triggers that cause specific behaviours under certain conditions. The feasibility of tampering in real-world deployments has not been established at scale, but the theoretical risk is that a small group could gain covert influence over the behaviour of widely deployed AI models.

Supply chain compromise involves manipulating AI components — model weights, training data, software libraries, or hardware — before deployment. Given the concentration of AI development among a small number of providers and the complexity of AI supply chains, a single compromised component could affect many downstream systems. This risk applies to any Canadian deployment that relies on third-party AI models or components, which includes most government AI systems.

These threats are particularly significant in Canada because AI systems are already deployed in consequential government functions. IRCC uses AI for immigration application triage; CBSA uses AI for border risk scoring. These existing deployments are susceptible to data poisoning and supply chain compromise regardless of their architecture. As federal departments increasingly adopt LLM-based tools and AI agents, prompt injection becomes an additional and growing attack vector. The TBS Directive on Automated Decision-Making governs AI use across federal departments but does not require adversarial security testing. If any of these systems were compromised, the consequences could affect the rights and entitlements of large numbers of Canadians.

As AI systems take on more autonomous roles — processing sensitive data, making or recommending decisions, and interacting with other systems — the consequences of successful attacks grow. An AI agent compromised through prompt injection that is embedded in an organization's cyber defences could leave that organization vulnerable to further attacks. An AI system used for healthcare triage that has been subject to data poisoning could systematically misclassify patient risk levels.

Harms

Prompt injection attacks can hijack LLM-based systems and AI agents by embedding malicious instructions in external content. Agents that browse the web, process documents, or read emails can be redirected to exfiltrate data or take unauthorized actions without the user's knowledge.

Cyber IncidentPrivacy & Data ExposureSeverePopulation

Data poisoning and model manipulation attacks can corrupt AI systems during training or fine-tuning, causing models to produce biased or harmful outputs in targeted contexts while appearing to function normally otherwise.

Cyber IncidentSignificantSector

Evidence

6 reports

International AI Safety Report 2026 — Box 2.1: AI Systems as Targets, Box 2.4: Deliberate Attacks Primary source
Official — International AI Safety Report (Jun 1, 2026)
Comprehensive evidence review of attacks on AI systems including prompt injection, data poisoning, model tampering, and supply chain compromise. Primary source for framing this hazard.
Not what you have signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection
Academic — arXiv (Greshake et al.) (Feb 1, 2023)
Foundational research demonstrating indirect prompt injection attacks against LLM-integrated applications, showing how malicious instructions in external content can hijack AI agents.
Poisoning Web-Scale Training Datasets is Practical
Academic — arXiv (Carlini et al.) (Dec 1, 2023)
Research demonstrating that poisoning large-scale training datasets used by AI models is practically feasible, not just a theoretical concern.
Sleeper Agents: Training Deceptive LLMs That Persist Through Safety Training
Academic — arXiv (Anthropic) (Jan 1, 2024)
Demonstration that AI models can be trained to harbour hidden behaviours (backdoors) that persist through standard safety training, showing feasibility of model tampering.
NIST AI 600-1: AI Risk Management Framework — Generative AI Profile
Official — NIST (Jul 1, 2024)
NIST risk management framework including evaluation of agent-hijacking risks, prompt injection, and other adversarial threats to AI systems.
National Cyber Threat Assessment 2025-2026
Official — Canadian Centre for Cyber Security (Oct 1, 2024)
Canadian cyber threat landscape assessment covering emerging AI-related threats including AI supply chain risks and adversarial attacks.

Record details

Policy Recommendationsassessed

Mandatory adversarial security evaluation of AI systems before deployment in government decision-making, covering prompt injection, data poisoning, and supply chain integrity

International AI Safety Report 2026 (Jun 1, 2026)

Establish AI supply chain integrity standards for government procurement, requiring provenance verification for model weights, training data, and software dependencies

International AI Safety Report 2026 (Jun 1, 2026)

Require ongoing monitoring for adversarial attacks on deployed AI systems in critical infrastructure and government services, with mandatory incident reporting

International AI Safety Report 2026 (Jun 1, 2026)

Develop and adopt standards for AI agent communication protocols that include security properties (authentication, authorization, integrity) to prevent agent hijacking

International AI Safety Report 2026 (Jun 1, 2026)

Editorial Assessment assessed

Canadian government agencies already use AI for immigration triage and border risk scoring — decisions that directly affect people's rights and entitlements. These systems, and the growing number of AI agents being deployed across government and critical infrastructure, are vulnerable to adversarial attacks that current security practices do not adequately address. A compromised AI system in government could systematically misdirect decisions affecting thousands of Canadians. No comprehensive AI adversarial security standard governs Canadian government AI deployments.

Entities Involved

Canada Border Services Agency

deployer

Canadian Centre for Cyber Security

regulator

Communications Security Establishment

regulator

Immigration, Refugees and Citizenship Canada

deployer

Treasury Board of Canada Secretariat

regulator

AI Systems Involved

Traveller Compliance Indicator (TCI)

AI risk scoring system at Canadian borders; deployed in security-sensitive context and a potential target for adversarial attacks.

IRCC Advanced Analytics Triage System

AI system used for immigration application triage; deployed in consequential government decision-making and a potential target for adversarial manipulation.

Related Records

Taxonomyassessed

Domain

Public ServicesCritical InfrastructureDefence & SecurityImmigration

Harm type

Cyber IncidentPrivacy & Data ExposureDiscrimination & RightsService Disruption

AI pathway

Adversarial InputSupply Chain OriginSystem Integration ContextSafety Mechanism Ineffective

Lifecycle phase

DeploymentMonitoringIncident Response

Changelog

Changelog
Version	Date	Change
v1	Mar 12, 2026	Initial publication. Hazard identified through gap analysis against IASR 2026 Chapter 2 — attacks ON AI systems, distinct from existing hazard ai-enabled-cyberattacks-critical-infrastructure which covers attacks USING AI.
v2	Mar 12, 2026	Revised for precision: distinguished which attack vectors (prompt injection, data poisoning, model tampering, supply chain) apply to which system architectures (LLM-based vs traditional ML). Downgraded confidence from high to medium reflecting limited direct evidence of attacks on Canadian government AI systems. Completed FR narrative (added missing final paragraphs). Fixed Carlini et al. arXiv reference. Added CSE entity linkage and concentration_of_power systemic risk factor.

Version 2

← All hazards