Systèmes d'IA comme surfaces d'attaque — Moniteur canadien des incidents en IA

En escalade Grave Confiance: medium

Les systèmes d'IA déployés dans l'administration canadienne et les infrastructures essentielles sont des cibles d'attaques adversariales — injection de requêtes, empoisonnement de données, falsification de modèles, compromission de la chaîne d'approvisionnement — qui peuvent manipuler leur comportement et compromettre les décisions qu'ils soutiennent.

Identifié: 1 janvier 2023 Dernière évaluation: 12 mars 2026

AI systems deployed in Canadian government, critical infrastructure, and commercial services are themselves targets for adversarial attacks. Unlike cyberattacks that use AI as a tool, this hazard concerns attacks directed at AI systems to manipulate their behaviour, extract sensitive information, or cause them to produce harmful outputs.

The attack surface of AI systems includes several distinct vectors, each affecting different system architectures:

Prompt injection is the most immediate and widely demonstrated threat to LLM-based systems and AI agents. Attackers embed malicious instructions in content that AI systems process — hidden text in websites, documents, or databases — causing the AI to act against the user's intentions. AI agents that browse the web, process emails, or access external databases are especially vulnerable because they encounter attacker-controlled content as a normal part of their operation. This vector is most relevant to the growing adoption of LLM-based tools across Canadian government for document processing, citizen services, and internal workflows. NIST has begun evaluating agent-hijacking risks through prompt injection.

Data poisoning involves corrupting the data that AI systems rely on, and threatens any machine learning system — including traditional classifiers, scoring models, and LLM-based systems alike. Poisoning can occur during initial training or during retrieval-augmented generation (RAG), where systems consult external databases to inform their responses. Poisoned data can introduce systematic biases, factual errors, or hidden behaviours that are difficult to detect and may affect all downstream users. Existing government AI systems such as IRCC's immigration triage and CBSA's border risk scoring are susceptible to this vector regardless of their underlying architecture.

Model tampering — interfering with an AI system during development to alter its deployed behaviour — represents a more sophisticated threat applicable to any machine learning model. Researchers have demonstrated that AI systems can be trained to harbour hidden objectives or "backdoors" — triggers that cause specific behaviours under certain conditions. The feasibility of tampering in real-world deployments has not been established at scale, but the theoretical risk is that a small group could gain covert influence over the behaviour of widely deployed AI models.

Supply chain compromise involves manipulating AI components — model weights, training data, software libraries, or hardware — before deployment. Given the concentration of AI development among a small number of providers and the complexity of AI supply chains, a single compromised component could affect many downstream systems. This risk applies to any Canadian deployment that relies on third-party AI models or components, which includes most government AI systems.

These threats are particularly significant in Canada because AI systems are already deployed in consequential government functions. IRCC uses AI for immigration application triage; CBSA uses AI for border risk scoring. These existing deployments are susceptible to data poisoning and supply chain compromise regardless of their architecture. As federal departments increasingly adopt LLM-based tools and AI agents, prompt injection becomes an additional and growing attack vector. The TBS Directive on Automated Decision-Making governs AI use across federal departments but does not require adversarial security testing. If any of these systems were compromised, the consequences could affect the rights and entitlements of large numbers of Canadians.

As AI systems take on more autonomous roles — processing sensitive data, making or recommending decisions, and interacting with other systems — the consequences of successful attacks grow. An AI agent compromised through prompt injection that is embedded in an organization's cyber defences could leave that organization vulnerable to further attacks. An AI system used for healthcare triage that has been subject to data poisoning could systematically misclassify patient risk levels.

Préjudices

Les attaques par injection de prompt peuvent détourner les systèmes basés sur des LLM et les agents IA en intégrant des instructions malveillantes dans du contenu externe. Les agents qui naviguent sur le web, traitent des documents ou lisent des courriels peuvent être redirigés pour exfiltrer des données ou prendre des actions non autorisées.

Incident cyberVie privée et donnéesGravePopulation

Les attaques par empoisonnement de données et manipulation de modèles peuvent corrompre les systèmes d'IA pendant l'entraînement ou le réglage fin, causant des résultats biaisés ou nuisibles dans des contextes ciblés tout en semblant fonctionner normalement autrement.

Incident cyberImportantSecteur

Preuves

6 rapports

International AI Safety Report 2026 — Box 2.1: AI Systems as Targets, Box 2.4: Deliberate Attacks Source principale
Officiel — International AI Safety Report (1 juin 2026)
Comprehensive evidence review of attacks on AI systems including prompt injection, data poisoning, model tampering, and supply chain compromise. Primary source for framing this hazard.
Not what you have signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection
Académique — arXiv (Greshake et al.) (1 févr. 2023)
Foundational research demonstrating indirect prompt injection attacks against LLM-integrated applications, showing how malicious instructions in external content can hijack AI agents.
Poisoning Web-Scale Training Datasets is Practical
Académique — arXiv (Carlini et al.) (1 déc. 2023)
Research demonstrating that poisoning large-scale training datasets used by AI models is practically feasible, not just a theoretical concern.
Sleeper Agents: Training Deceptive LLMs That Persist Through Safety Training
Académique — arXiv (Anthropic) (1 janv. 2024)
Demonstration that AI models can be trained to harbour hidden behaviours (backdoors) that persist through standard safety training, showing feasibility of model tampering.
NIST AI 600-1: AI Risk Management Framework — Generative AI Profile
Officiel — NIST (1 juill. 2024)
NIST risk management framework including evaluation of agent-hijacking risks, prompt injection, and other adversarial threats to AI systems.
National Cyber Threat Assessment 2025-2026
Officiel — Canadian Centre for Cyber Security (1 oct. 2024)
Canadian cyber threat landscape assessment covering emerging AI-related threats including AI supply chain risks and adversarial attacks.

Détails de la fiche

Recommandations de politiqueévalué

Mandatory adversarial security evaluation of AI systems before deployment in government decision-making, covering prompt injection, data poisoning, and supply chain integrity

International AI Safety Report 2026 (1 juin 2026)

Establish AI supply chain integrity standards for government procurement, requiring provenance verification for model weights, training data, and software dependencies

International AI Safety Report 2026 (1 juin 2026)

Require ongoing monitoring for adversarial attacks on deployed AI systems in critical infrastructure and government services, with mandatory incident reporting

International AI Safety Report 2026 (1 juin 2026)

Develop and adopt standards for AI agent communication protocols that include security properties (authentication, authorization, integrity) to prevent agent hijacking

International AI Safety Report 2026 (1 juin 2026)

Évaluation éditoriale évalué

Les agences gouvernementales canadiennes utilisent déjà l'IA pour le triage de l'immigration et l'évaluation des risques aux frontières — des décisions qui affectent directement les droits et prestations des personnes. Ces systèmes sont vulnérables aux attaques adversariales que les pratiques de sécurité actuelles ne traitent pas adéquatement. Un système d'IA compromis dans le gouvernement pourrait systématiquement fausser des décisions affectant des milliers de Canadiens.

Entités impliquées

Agence des services frontaliers du Canada

deployer

Centre canadien pour la cybersécurité

regulator

Centre de la sécurité des télécommunications

regulator

Immigration, Réfugiés et Citoyenneté Canada

deployer

Secrétariat du Conseil du Trésor du Canada

regulator

Systèmes d'IA impliqués

Traveller Compliance Indicator (TCI)

Système d'évaluation des risques IA aux frontières canadiennes; déployé dans un contexte sensible à la sécurité et cible potentielle d'attaques adversariales.

IRCC Advanced Analytics Triage System

Système d'IA utilisé pour le triage des demandes d'immigration; déployé dans des prises de décision gouvernementales conséquentes et cible potentielle de manipulation adversariale.

Fiches connexes

Taxonomieévalué

Domaine

Services publicsInfrastructures essentiellesDéfense et sécuritéImmigration

Type de préjudice

Incident cyberVie privée et donnéesDiscrimination et droitsInterruption de service

Voie de contribution de l'IA

Entrée adversarialeOrigine de la chaîne d'approvisionnementContexte d'intégration systèmeMécanisme de sécurité inefficace

Phase du cycle de vie

DéploiementSurveillanceRéponse aux incidents

Historique des modifications

Historique des modifications
Version	Date	Modification
v1	12 mars 2026	Initial publication. Hazard identified through gap analysis against IASR 2026 Chapter 2 — attacks ON AI systems, distinct from existing hazard ai-enabled-cyberattacks-critical-infrastructure which covers attacks USING AI.
v2	12 mars 2026	Revised for precision: distinguished which attack vectors (prompt injection, data poisoning, model tampering, supply chain) apply to which system architectures (LLM-based vs traditional ML). Downgraded confidence from high to medium reflecting limited direct evidence of attacks on Canadian government AI systems. Completed FR narrative (added missing final paragraphs). Fixed Carlini et al. arXiv reference. Added CSE entity linkage and concentration_of_power systemic risk factor.

Version 2

← Tous les risques