Methodology
Scope
CAIM's subject is AI hazards — conditions in which AI systems create a realistic pathway to harm. The monitor documents these hazards through two types of records:
- Incident records document discrete events where an AI hazard produced harm or near-harm. An incident is direct evidence that a hazard is real.
- Hazard records document structural conditions that create realistic pathways to harm — whether or not harm has already occurred. A hazard record makes the underlying risk visible, not just its consequences.
Hazards and incidents are different kinds of things: a hazard is an ongoing condition; an incident is a discrete event. A single hazard can produce many incidents over time, and the hazard persists even after incidents occur — just as a dangerous intersection remains a hazard after each collision. The materialized_from link on incident records captures this evidentiary relationship. This follows the model used in aviation safety, where crash investigations and voluntary hazard reports feed the same safety objective.
What systems are in scope
An AI system is one that uses machine learning, neural networks, or foundation models, or that is built on such components. This includes statistical models trained on data, deep learning, generative models, and hybrid systems incorporating these techniques.
Systems whose behaviour is fully specified by human-authored rules (deterministic scoring instruments, structured questionnaires, rule-based automation, data extraction tools) are out of scope, even when described as "algorithmic" or "AI."
This definition will be revised as the technology evolves.
What is out of scope
- Rule-based systems, deterministic scoring instruments, structured questionnaires, and data extraction tools — even when deployed at scale in consequential decisions
- Simple automation (e.g., mail merge, spreadsheet macros, basic workflow triggers) where the system has no decision-making function and no plausible pathway to harm
- Purely theoretical risks with no documented evidence of a precursor condition
- Events where AI is mentioned incidentally but played no material role in the pathway to harm
Incidents
An event or series of events in which an AI system's development, deployment, or use is plausibly implicated in harm or a near-harm outcome. This includes materially AI-enabled misuse.
Hazards
A credible risk condition, precursor failure, or near-miss pattern indicating a realistic pathway to harm, even if harm was prevented or has not yet been observed. Hazards are included because near-misses are often the most informative cases for prevention.
Hazard records require documented evidence of the precursor condition — a regulatory finding, an investigation, a published technical assessment, or equivalent. A policy gap alone is not sufficient; there must be evidence that the gap has created conditions where harm is plausible and proximate.
Material AI involvement
A case is in scope when an AI system is a meaningful factor in the pathway to harm, not merely incidental. The test is whether the AI system's behaviour, design, deployment, or governance meaningfully shaped the outcome.
Borderline cases — worked examples:
| Scenario | In scope? | Reasoning |
|---|---|---|
| AI-generated deepfake used to impersonate a CEO and authorize a fraudulent wire transfer | Yes | AI capability (voice cloning / image synthesis) is the enabling factor — the fraud could not have occurred at this fidelity without it |
| A phishing email written with ChatGPT | Generally no | AI improved the email's grammar but a human conceived and executed the fraud. AI is incidental to the pathway to harm |
| A hospital deploys an AI triage tool that delays care for a patient who is later harmed | Yes | The AI system's classification directly influenced the clinical decision pathway |
| A hospital's electronic health record system crashes, delaying care | No | Software failure, but no AI or automated decision-making component in the pathway to harm |
| An employer uses an AI resume screener that systematically disadvantages candidates with disabilities | Yes | The AI system's learned biases are the mechanism of discrimination |
| An employer's HR department applies a manual policy that disadvantages candidates with disabilities | No | Discrimination occurred but no AI or automated decision-making system was involved |
| A government agency uses a fixed-score questionnaire to assess risk, and the tool produces biased outcomes | No | A deterministic scoring instrument with human-authored rules. The harm is real, but the system is not AI: its behaviour is fully specified by its design |
Canada nexus
A case has a Canada nexus if one or more of the following applies:
- The incident occurred in Canada
- People or institutions in Canada were affected
- A Canadian organization developed, deployed, operated, or materially enabled the system
- There was material Canadian impact (economic, safety, rights, infrastructure, or governance)
- The case has direct regulatory or policy relevance to Canadian jurisdictions
- The case is an international event with documented implications for Canadian systems, populations, or governance
Severity calibration
CAIM uses an ordinal severity scale. To ensure consistency across editors and over time, each level is anchored with operational criteria and reference examples.
| Level | Criteria | Reference examples |
|---|---|---|
| Minor | Limited, easily reversible harm affecting a small number of individuals. No lasting consequences. Quickly corrected. | A chatbot gives incorrect but non-dangerous information; a recommendation system briefly shows irrelevant results |
| Moderate | Meaningful harm that is recoverable but requires effort to correct. Affects a defined group or creates measurable costs. | AI hiring tool screens out qualified candidates from a batch; autonomous vehicle testing proceeds without a comprehensive safety framework |
| Significant | Substantial harm that is difficult to reverse. Affects a large group, creates systemic risks, or triggers regulatory intervention. | Facial recognition deployed covertly at population scale; AI-generated deepfake disinformation targets election integrity |
| Severe | Serious harm to many individuals or institutions. Documented financial, psychological, or rights impacts at scale. Requires a major institutional response. | AI-generated CSAM at volume requiring law enforcement response; AI chatbot failures causing documented psychological harm at scale |
| Critical | Widespread, potentially irreversible harm. Loss of life, large-scale rights violations, or systemic institutional failure. | Autonomous weapons causing civilian casualties; AI system failure causing critical infrastructure collapse (no Canadian examples to date) |
When severity is uncertain, records use "unknown" rather than guessing. Severity may be upgraded or downgraded as new information emerges; all changes are documented in the changelog.
Reach
Reach describes the scale of people or entities affected. Like severity, it uses an ordinal scale with "unknown" permitted.
| Level | Criteria | Reference examples |
|---|---|---|
| Individual | One or a small number of identified individuals directly affected. | A single person denied a benefit by an automated system; a chatbot giving harmful advice to one user |
| Group | A defined group of people affected — typically dozens to hundreds sharing a common characteristic or context. | Applicants screened out by a biased hiring tool in a single recruitment round; patients at one hospital affected by a diagnostic AI error |
| Organization | An entire organization's operations or workforce materially affected. | A company's AI system breach exposing all employee data; an agency's automated workflow failing department-wide |
| Sector | Systemic impact across an industry or government sector, affecting multiple organizations or the sector's operational norms. | AI hiring tools creating sector-wide discrimination patterns; regulatory gaps affecting all health AI deployments nationally |
| Population | Impact that is society-wide or affects a large portion of the Canadian population, or conditions that could produce such impact. | Mass AI surveillance without legal authority; AI-generated election disinformation at national scale |
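Because both scales are ordinal with an explicit "unknown," they map naturally onto closed value sets. A minimal TypeScript sketch follows; the value and identifier names are assumptions, not CAIM's published schema.

```typescript
// Illustrative sketch only: value and identifier names are assumptions,
// not CAIM's published schema. "unknown" is an explicit, first-class value.
type Severity = "minor" | "moderate" | "significant" | "severe" | "critical" | "unknown";
type Reach = "individual" | "group" | "organization" | "sector" | "population" | "unknown";

// Ordinal ranking helper for analysis. "unknown" deliberately has no rank,
// so comparisons against it yield undefined rather than a guess.
const SEVERITY_ORDER: readonly Severity[] = ["minor", "moderate", "significant", "severe", "critical"];

function severityRank(s: Severity): number | undefined {
  const i = SEVERITY_ORDER.indexOf(s);
  return i === -1 ? undefined : i; // "unknown" maps to undefined
}
```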
The pipeline
CAIM operates through a structured pipeline with six stages.
1. Intake
Reports enter through three channels:
- Public sources: media reporting, official documents, court records, regulator notices, vendor disclosures, and academic publications.
- Structured submissions: organizations or individuals submit details through a form covering timeline, impact, AI system context, and mitigations attempted.
- Confidential channel: for sensitive cases requiring redaction, source protection, or coordinated disclosure.
2. Triage
Each report is assessed for:
- Scope: Is there material AI involvement and a Canada nexus?
- Classification: Is it best treated as an incident or a hazard?
- Verification path: What sources are available, and what verification status is appropriate?
- Sensitivity: Does the report contain security-sensitive or privacy-sensitive details requiring special handling?
- De-duplication: Does this relate to an existing record?
3. Documentation
The editorial team produces a compact, factual record including:
- A short neutral narrative of what happened or what the hazard entails
- Key dates and jurisdiction(s), including the level of government with regulatory authority
- Affected domain and stakeholders (classes of people, not identities)
- AI system context, to the extent it can be responsibly described
- Observed harms or near-harms
- A transparent source list
- Structured taxonomy tags
- A mitigation note: 3-6 controls that would plausibly reduce likelihood or impact, tied to the specific pathway to harm
4. Review
A verification editor evaluates source quality and factual accuracy. A safety reviewer handles redaction decisions and coordinated disclosure for security-sensitive cases. No record is published without both editorial and safety review.
5. Publication
Records are published with a verification label, sources, taxonomy tags, and a version number. Version 1 is archived. All subsequent changes produce new versions with a visible changelog.
Where responsible publication requires delay — for example, while a vulnerability is being addressed — CAIM withholds the record until publication is safe, and publishes high-level defensive guidance in the interim where possible.
6. Corrections and synthesis
Corrections: If a record is materially inaccurate, it is corrected promptly and transparently. If core claims cannot be supported, records may be retracted with an explanation and preserved tombstone metadata. Appeals focus on factual accuracy and responsible publication.
Synthesis: CAIM periodically reviews accumulated records to produce trend briefings and update the mitigation library. This is where case-level documentation becomes institutional learning.
The record format
Every published record has three layers.
Narrative layer
A human-readable, compact account:
- What happened (or almost happened)
- Key dates and jurisdiction(s)
- Who was affected (classes of stakeholders)
- AI system context (what can be responsibly supported)
- Observed harms or near-harms
- What is known vs. alleged vs. uncertain
Evidence layer
Transparent sourcing:
- Source list with dates and type (media, official, court, disclosure, academic)
- Where useful, a claims table mapping specific claims to supporting sources and confidence notes
Structure layer
Taxonomy tags enabling search, filtering, and analysis:
- Domain: finance, health, public services, education, critical infrastructure, elections/information integrity, etc.
- Harm type: fraud/impersonation, privacy/data exposure, discrimination/rights impacts, safety failure, cyber incident, misinformation, operational failure, etc.
- AI involvement type: development flaw, deployment failure, misuse, supply-chain/tooling, human oversight breakdown, monitoring gap, etc.
- Lifecycle phase: design, training, evaluation, deployment, monitoring, incident response
- Severity and reach: ordinal scales calibrated with reference examples (see above); explicit "unknown" permitted
- Jurisdiction level: federal, provincial/territorial, municipal, or multi-level — identifying which level of government has primary regulatory authority over the system or domain
- Canada nexus basis: which nexus criteria are met
Mitigation note
Each record includes 3-6 controls that would plausibly have reduced the likelihood or impact of the event, tied to what went wrong in that specific case.
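Taken together, the three layers and the mitigation note imply a record shape along the following lines. This is an illustrative sketch: every field name is an assumption, and only the layer boundaries come from the description above.

```typescript
// Hypothetical record shape inferred from the three-layer description above.
// Every field name is illustrative; the layer boundaries are the point.
interface CaimRecord {
  // Narrative layer: compact, human-readable account
  narrative: {
    summary: string;                 // what happened (or almost happened)
    keyDates: string[];              // ISO 8601 dates
    jurisdictions: string[];
    affectedStakeholders: string[];  // classes of people, not identities
    observedHarms: string[];
    epistemicNotes: string;          // known vs. alleged vs. uncertain
  };
  // Evidence layer: transparent sourcing
  evidence: {
    sources: Array<{
      url: string;
      date: string;
      type: "media" | "official" | "court" | "disclosure" | "academic";
    }>;
    claimsTable?: Array<{ claim: string; sourceUrls: string[]; confidenceNote: string }>;
  };
  // Structure layer: taxonomy tags. Optional, since a record is publishable
  // with no taxonomy applied (see the data model section below).
  structure?: {
    domain?: string;
    harmType?: string;
    aiInvolvementType?: string;
    lifecyclePhase?: string;
    severity?: string;               // ordinal; explicit "unknown" permitted
    reach?: string;
    jurisdictionLevel?: string;
    nexusBasis?: string[];           // which Canada nexus criteria are met
  };
  // Mitigation note: 3-6 controls tied to the specific pathway to harm
  mitigations: string[];
}
```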
Verification ladder
Records carry a verification status so readers can always assess how certain the information is.
| Status | Meaning |
|---|---|
| Reported | Credible initial reporting; claims not yet independently corroborated |
| Corroborated | Supported by multiple independent credible sources |
| Confirmed | Supported by primary documentation or exceptionally strong corroboration |
| Contested | Credible dispute exists about core claims |
| Retracted | Core claims cannot be supported; withdrawn with explanation |
CAIM distinguishes between what is known, what is alleged, and what remains uncertain. Where information is incomplete, CAIM publishes what is supportable and explicitly marks what is unknown.
Verification of hazard records
For incident records, the verification ladder assesses how well-established the facts of the event are. For hazard records, it assesses how well-documented the precursor condition is:
- Reported: A credible source has identified the risk condition, but it has not been independently examined — e.g., a media report describing a regulatory gap.
- Corroborated: Multiple independent credible sources document the condition — e.g., both a regulatory body and independent researchers have identified the same gap or precursor failure.
- Confirmed: A primary authority has formally documented the condition — e.g., an official investigation, audit, or regulatory finding establishes the precursor condition as fact.
The verification status reflects the strength of evidence for the precursor condition itself, not a prediction of future harm. The risk assessment — how plausible the pathway to harm is and how severe the consequences could be — is a separate editorial judgment, stated transparently in the narrative, grounded in evidence where possible, and revisable as new information emerges. A hazard can be "confirmed" (the underlying condition is well-documented) while the severity of the risk remains uncertain or contested.
Taxonomy
CAIM's taxonomy is designed to be stable, interpretable, and interoperable. Records are coded along the dimensions described above (domain, harm type, AI involvement type, lifecycle phase, severity, jurisdiction level, nexus basis). The taxonomy is published and versioned; changes are documented.
Where feasible, CAIM aligns its fields with international incident-reporting frameworks — particularly the OECD AI Incidents Monitor and the AI Incident Database (AIID) — to support comparability and institutional adoption.
Data model
CAIM's data model separates observations (base-level) from classification (taxonomy layers). A record is publishable with no taxonomy applied. All classification can evolve independently of the underlying observations.
Entity role primitives
Every entity referenced on a record carries one or more role primitives — a small, durable set of organizational relationships:
- Developer — built, trained, or created the AI system
- Deployer — put the system into operational use
- Regulator — investigated, audited, or issued findings
- Affected party — experienced harm or was subject to the system's decisions
- Reporter — disclosed, documented, or reported the incident or hazard
These enable structured queries — "all deployers," "all incidents with regulator involvement" — without relying on taxonomy. Entity pages aggregate records grouped by role.
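A sketch of how the role primitives could support such queries; the record shape, `entitySlug` field, and helper function are hypothetical.

```typescript
// Role primitives from the list above; the record shape and helper are illustrative.
type EntityRole = "developer" | "deployer" | "regulator" | "affected_party" | "reporter";

interface EntityRef {
  entitySlug: string;   // hypothetical identifier linking to an entity page
  roles: EntityRole[];  // an entity may carry more than one role on a record
}

// "All incidents with regulator involvement", answered without any taxonomy.
function withRole<T extends { entities: EntityRef[] }>(records: T[], role: EntityRole): T[] {
  return records.filter(r => r.entities.some(e => e.roles.includes(role)));
}
```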
Response and outcome tracking
Records track the governance feedback loop: what was done in response, by whom, and what resulted. Each response entry includes the actor (linked to an entity), date, action taken, and outcome. On incidents, this tracks investigation, enforcement, policy change, and litigation. On hazards, it tracks governance attention — reports published, consultations launched, legislation introduced.
This makes CAIM useful for policy analysis: which incidents led to regulatory action? What's the response rate? Which sectors have governance gaps?
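A sketch of one response entry and the response-rate question, with hypothetical field names.

```typescript
// One entry in the governance feedback loop, per the fields named above.
// Field names are assumptions.
interface ResponseEntry {
  actorSlug: string;  // linked entity
  date: string;       // ISO 8601 date
  action: string;     // e.g. "investigation opened", "legislation introduced"
  outcome?: string;   // may be absent while the response is still in progress
}

// "What's the response rate?": the share of records with at least one response.
function responseRate(records: Array<{ responses: ResponseEntry[] }>): number {
  if (records.length === 0) return 0;
  return records.filter(r => r.responses.length > 0).length / records.length;
}
```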
Assessment history
Hazards carry a time-ordered history of assessments, rather than a single snapshot. Each assessment records the date, status (active, escalating, mitigated, retired), confidence level, potential severity, potential reach, and evidence summary. The current assessment is always the most recent entry.
This enables temporal analysis: how did this hazard evolve? Which hazards escalated? How quickly are hazards being addressed? Whether a hazard has produced incidents is captured separately through the materialized_from links on incident records — a hazard's assessment status describes the state of the underlying condition, not whether incidents have occurred.
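A sketch of the assessment history shape, assuming hypothetical field names and that entries are stored in chronological order (which the build step validates; see below).

```typescript
// Time-ordered history; the current assessment is always the most recent entry.
// Field names and the confidence levels are assumptions.
interface Assessment {
  date: string;  // ISO 8601 date
  status: "active" | "escalating" | "mitigated" | "retired";
  confidence: "low" | "medium" | "high";
  potentialSeverity: string;
  potentialReach: string;
  evidenceSummary: string;
}

function currentAssessment(history: Assessment[]): Assessment | undefined {
  // Relies on chronological ordering, which the build step validates (see below).
  return history[history.length - 1];
}
```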
One-sided links
All relationships are declared on one side only. When an incident is linked to a hazard, the materialized_from reference is declared on the incident. The build step computes reverse lookups — the hazard page shows its linked incidents without storing them. This eliminates consistency rot as the record corpus grows.
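A sketch of the reverse-lookup computation, assuming a hypothetical `slug` identifier on each record.

```typescript
// Build-time reverse lookup: incidents declare materialized_from; hazard pages
// receive their incident lists computed, never stored. Identifiers are hypothetical.
interface IncidentRecord {
  slug: string;
  materialized_from?: string[]; // slugs of hazards this incident materialized from
}

function incidentsByHazard(incidents: IncidentRecord[]): Map<string, string[]> {
  const index = new Map<string, string[]>();
  for (const incident of incidents) {
    for (const hazardSlug of incident.materialized_from ?? []) {
      const list = index.get(hazardSlug) ?? [];
      list.push(incident.slug);
      index.set(hazardSlug, list);
    }
  }
  return index; // hazard slug -> incidents that materialized from it
}
```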
Build-time integrity
The build step validates the entire record graph: slug references, taxonomy values, bilingual parity, assessment ordering, and relationship consistency. Broken references are build errors. Missing translations are warnings. No record can reference a nonexistent entity, system, or record.
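As an example of one such check, a sketch validating that every materialized_from reference resolves to a known hazard; identifiers and error-message wording are hypothetical.

```typescript
// One validation pass as a sketch: every materialized_from reference must
// resolve to a known hazard slug, and any failure is a build error.
function validateHazardRefs(
  incidents: Array<{ slug: string; materialized_from?: string[] }>,
  knownHazardSlugs: Set<string>,
): string[] {
  const errors: string[] = [];
  for (const incident of incidents) {
    for (const ref of incident.materialized_from ?? []) {
      if (!knownHazardSlugs.has(ref)) {
        errors.push(`${incident.slug}: materialized_from references unknown hazard "${ref}"`);
      }
    }
  }
  return errors; // any entry here fails the build
}
```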
Systemic risk analysis
CAIM's most distinctive analytical contribution is a methodology for connecting deployment-level incident patterns to catastrophic risk trajectories. This is the bridging analysis.
Systemic risk factors
Every record is tagged with zero or more systemic risk factors — structural properties of the failure that are relevant across risk scales. These are the dimensions that connect what happened in a specific Canadian AI deployment to what could happen at higher capability levels:
| Factor | What it reveals |
|---|---|
| Loss of human control | System operated beyond human oversight capacity |
| Unexpected capability | System demonstrated behaviour outside design expectations |
| Resistance to correction | Institutional or technical barriers made correction difficult |
| Opacity | Decision process not interpretable by affected parties or overseers |
| Autonomous scope expansion | System's influence expanded beyond intended boundaries |
| Cascade propagation | Failure triggered failures elsewhere |
| Governance gap | No mechanism existed to prevent, detect, or respond |
| Accountability void | No entity bore clear responsibility |
| Concentration of power | Incident reflected or increased power asymmetry |
| Epistemic degradation | Incident undermined collective ability to assess truth or risk |
The editorial question for each factor is: "Does this record demonstrate this structural property?" — not "Is this the root cause?" Systemic risk factors describe what the record reveals about structural conditions.
Escalation model
Every hazard includes an escalation model — the bridging analysis for that specific hazard:
- Governance dependencies — institutional capacities that must exist to prevent escalation (e.g., "mandatory pre-deployment impact assessment," "independent audit authority")
- Catastrophic bridge — a narrative connecting the hazard to catastrophic risk trajectories: how the same structural properties that cause harm at current capability levels enable catastrophic outcomes at higher levels
- Precursor signals — observable patterns indicating the hazard may be escalating
- Bridge confidence — how strong the connection is (low, medium, high)
Cross-record computations
At build time, CAIM computes aggregate patterns across the entire record corpus:
- Risk factor co-occurrence — which structural failure properties cluster together (e.g., governance_gap + opacity co-occur in the majority of records)
- Governance dependency patterns — which institutional capacities are most frequently absent, ranked by how many domains they span
- Cross-domain patterns — factors appearing in 3+ domains, indicating structural properties that transcend sector-specific governance
- Escalation velocity — how quickly hazards move through status transitions
These computations are exposed through the systemic analysis API endpoint.
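As an illustration, a sketch of the co-occurrence computation over hypothetical record objects; factor identifiers follow the snake_case style used above.

```typescript
// Pairwise co-occurrence of systemic risk factors across the corpus, e.g. how
// often governance_gap and opacity are tagged on the same record. The record
// shape is an assumption.
function factorCooccurrence(records: Array<{ riskFactors: string[] }>): Map<string, number> {
  const counts = new Map<string, number>();
  for (const record of records) {
    const factors = [...new Set(record.riskFactors)].sort();
    for (let i = 0; i < factors.length; i++) {
      for (let j = i + 1; j < factors.length; j++) {
        const key = `${factors[i]}+${factors[j]}`; // e.g. "governance_gap+opacity"
        counts.set(key, (counts.get(key) ?? 0) + 1);
      }
    }
  }
  return counts;
}
```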
Interoperability
OECD alignment
CAIM maintains two classification layers on every record: a CAIM native taxonomy (primary, richer, optimized for Canadian policy users) and an OECD AIM interoperability layer (optional, populated during editorial tagging). The two layers coexist without flattening; neither replaces the other. This follows the pattern used in aviation safety, where national authorities maintain detailed classification systems while mapping to international codes for reporting.
Data exports include an OECD-compatible view that maps CAIM fields to the OECD schema. CAIM's editorial metadata (verification ladder, versioning, redaction flags, bilingual labels) is preserved in an extension namespace.
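A sketch of the export principle only: the OECD-side field names below are placeholders, not the actual OECD schema, and the extension-namespace key is an assumption.

```typescript
// Sketch of the export principle only. The OECD-side field names are
// placeholders, not the real OECD schema, and "ext:caim" is an assumed
// extension-namespace key. The point: map what has a counterpart, and
// preserve CAIM-only editorial metadata instead of dropping it.
interface ExportInput {
  severity: string;
  domain: string;
  verification: string; // CAIM editorial metadata
  version: number;      // CAIM editorial metadata
}

function toOecdCompatibleView(r: ExportInput) {
  return {
    severity: r.severity, // placeholder field mapping
    sector: r.domain,     // placeholder field mapping
    "ext:caim": { verification: r.verification, version: r.version },
  };
}
```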
AIID alignment
CAIM adopts the AIID's conceptual split between incidents (canonical events) and reports (individual source documents). Records include optional AIID cross-reference identifiers where matches exist. CAIM's taxonomy provides a crosswalk to AIID taxonomy sets, with local tags (bilingual labels, Canada nexus, editorial metadata) maintained separately.
API
CAIM provides machine-readable access to the full record corpus, aggregate statistics, systemic risk analysis, taxonomy definitions, and a JSON Feed. All endpoints are static JSON, CORS-enabled, with no authentication required.
For full endpoint documentation, see the API reference.
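A hypothetical usage sketch: the base URL and endpoint path are placeholders, and the real paths are documented in the API reference.

```typescript
// Hypothetical usage sketch: the base URL and endpoint path are placeholders;
// the real paths are documented in the API reference. Static JSON with CORS
// enabled means a plain fetch works from any origin, with no auth headers.
const BASE_URL = "https://example.org/api"; // placeholder

async function fetchRecords(): Promise<unknown> {
  const res = await fetch(`${BASE_URL}/records.json`); // hypothetical endpoint
  if (!res.ok) throw new Error(`HTTP ${res.status}`);
  return res.json();
}
```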
Privacy and responsible publication
CAIM follows strict safeguards:
- Personal data about victims is redacted or excluded
- Cases involving minors receive heightened protection
- Records avoid doxxing, avoid reproducing harmful content unnecessarily, and use victim-centered language
- For security-sensitive cases, CAIM follows coordinated disclosure norms: it prioritizes mitigation and safety, publishes high-level learning and defensive guidance, and withholds enabling details until risk is reduced
- CAIM does not become a platform for harassment or reputational attacks; records are sourced, cautious in language, and focused on what happened and what can be learned