Methodology
Scope
CAIM's subject is AI hazards — conditions in which AI systems create a realistic pathway to harm. The monitor documents these hazards through two types of records:
- Incident records document discrete events where an AI hazard produced harm or near-harm. An incident is direct evidence that a hazard is real.
- Hazard records document structural conditions that create realistic pathways to harm — whether or not harm has already occurred. A hazard record makes the underlying risk visible, not just its consequences.
Hazards and incidents are different kinds of things: a hazard is an ongoing condition; an incident is a discrete event. A single hazard can produce many incidents over time, and the hazard persists even after incidents occur — just as a dangerous intersection remains a hazard after each collision. The materialized_from link on incident records captures this evidentiary relationship. This follows the model used in aviation safety, where crash investigations and voluntary hazard reports feed the same safety objective.
What systems are in scope
An AI system is one that uses machine learning, neural networks, or foundation models, or that is built on such components. This includes statistical models trained on data, deep learning, generative models, and hybrid systems incorporating these techniques.
Systems whose behaviour is fully specified by human-authored rules (deterministic scoring instruments, structured questionnaires, rule-based automation, data extraction tools) are out of scope, even when described as "algorithmic" or "AI."
This definition will be revised as the technology evolves.
What is out of scope
- Rule-based systems, deterministic scoring instruments, structured questionnaires, and data extraction tools — even when deployed at scale in consequential decisions
- Simple automation (e.g., mail merge, spreadsheet macros, basic workflow triggers) where the system has no decision-making function and no plausible pathway to harm
- Purely theoretical risks with no documented evidence of a precursor condition
- Events where AI is mentioned incidentally but played no material role in the pathway to harm
Incidents
An event or series of events in which an AI system's development, deployment, or use is plausibly implicated in harm or a near-harm outcome. This includes materially AI-enabled misuse.
Hazards
A credible risk condition, precursor failure, or near-miss pattern indicating a realistic pathway to harm, even if harm was prevented or has not yet been observed. Hazards are included because near-misses are often the most informative cases for prevention.
Hazard records require documented evidence of the precursor condition — a regulatory finding, an investigation, a published technical assessment, or equivalent. A policy gap alone is not sufficient; there must be evidence that the gap has created conditions where harm is plausible and proximate.
Material AI involvement
A case is in scope when an AI system is a meaningful factor in the pathway to harm, not merely incidental. The test is whether the AI system's behaviour, design, deployment, or governance meaningfully shaped the outcome.
Borderline cases — worked examples:
| Scenario | In scope? | Reasoning |
|---|---|---|
| AI-generated deepfake used to impersonate a CEO and authorize a fraudulent wire transfer | Yes | AI capability (voice cloning / image synthesis) is the enabling factor — the fraud could not have occurred at this fidelity without it |
| A phishing email written with ChatGPT | Generally no | AI improved the email's grammar but a human conceived and executed the fraud. AI is incidental to the pathway to harm |
| A hospital deploys an AI triage tool that delays care for a patient who is later harmed | Yes | The AI system's classification directly influenced the clinical decision pathway |
| A hospital's electronic health record system crashes, delaying care | No | Software failure, but no AI or automated decision-making component in the pathway to harm |
| An employer uses an AI resume screener that systematically disadvantages candidates with disabilities | Yes | The AI system's learned biases are the mechanism of discrimination |
| An employer's HR department applies a manual policy that disadvantages candidates with disabilities | No | Discrimination occurred but no AI or automated decision-making system was involved |
| A government agency uses a fixed-score questionnaire to assess risk, and the tool produces biased outcomes | No | A deterministic scoring instrument with human-authored rules. The harm is real, but the system is not AI: its behaviour is fully specified by its design |
Canada nexus
A case has a Canada nexus if one or more of the following applies:
- The incident occurred in Canada
- People or institutions in Canada were affected
- A Canadian organization developed, deployed, operated, or materially enabled the system
- There was material Canadian impact (economic, safety, rights, infrastructure, or governance)
- The case has direct regulatory or policy relevance to Canadian jurisdictions
- The case is an international event with documented implications for Canadian systems, populations, or governance
Severity calibration
CAIM uses an ordinal severity scale. To ensure consistency across editors and over time, each level is anchored with operational criteria and reference examples.
| Level | Criteria | Reference examples |
|---|---|---|
| Minor | Limited, easily reversible harm affecting a small number of individuals. No lasting consequences. Quickly corrected. | A chatbot gives incorrect but non-dangerous information; a recommendation system briefly shows irrelevant results |
| Moderate | Meaningful harm that is recoverable but requires effort to correct. Affects a defined group or creates measurable costs. | AI hiring tool screens out qualified candidates from a batch; autonomous vehicle testing proceeds without a comprehensive safety framework |
| Significant | Substantial harm that is difficult to reverse. Affects a large group, creates systemic risks, or triggers regulatory intervention. | Facial recognition deployed covertly at population scale; AI-generated deepfake disinformation targets election integrity |
| Severe | Serious harm to many individuals or institutions. Documented financial, psychological, or rights impacts at scale. Requires a major institutional response. | AI-generated CSAM at volume requiring law enforcement response; AI chatbot failures causing documented psychological harm at scale |
| Critical | Widespread, potentially irreversible harm. Loss of life, large-scale rights violations, or systemic institutional failure. | Autonomous weapons causing civilian casualties; AI system failure causing critical infrastructure collapse (no Canadian examples to date) |
When severity is uncertain, records use "unknown" rather than guessing. Severity may be upgraded or downgraded as new information emerges; all changes are documented in the changelog.
Reach
Reach describes the scale of people or entities affected. Like severity, it uses an ordinal scale with "unknown" permitted.
| Level | Criteria | Reference examples |
|---|---|---|
| Individual | One or a small number of identified individuals directly affected. | A single person denied a benefit by an automated system; a chatbot giving harmful advice to one user |
| Group | A defined group of people affected — typically dozens to hundreds sharing a common characteristic or context. | Applicants screened out by a biased hiring tool in a single recruitment round; patients at one hospital affected by a diagnostic AI error |
| Organization | An entire organization's operations or workforce materially affected. | A company's AI system breach exposing all employee data; an agency's automated workflow failing department-wide |
| Sector | Systemic impact across an industry or government sector, affecting multiple organizations or the sector's operational norms. | AI hiring tools creating sector-wide discrimination patterns; regulatory gaps affecting all health AI deployments nationally |
| Population | Impact that is society-wide or affects a large portion of the Canadian population, or conditions that could produce such impact. | Mass AI surveillance without legal authority; AI-generated election disinformation at national scale |
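Because both scales are ordinal with an explicit "unknown," they map naturally onto closed value sets. A minimal TypeScript sketch follows; the value and identifier names are assumptions, not CAIM's published schema.

```typescript
// Illustrative sketch only: value and identifier names are assumptions,
// not CAIM's published schema. "unknown" is an explicit, first-class value.
type Severity = "minor" | "moderate" | "significant" | "severe" | "critical" | "unknown";
type Reach = "individual" | "group" | "organization" | "sector" | "population" | "unknown";

// Ordinal ranking helper for analysis. "unknown" deliberately has no rank,
// so comparisons against it yield undefined rather than a guess.
const SEVERITY_ORDER: readonly Severity[] = ["minor", "moderate", "significant", "severe", "critical"];

function severityRank(s: Severity): number | undefined {
  const i = SEVERITY_ORDER.indexOf(s);
  return i === -1 ? undefined : i; // "unknown" maps to undefined
}
```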
The pipeline
CAIM operates through a structured pipeline with six stages.
1. Intake
Reports enter through three channels:
- Public sources: media reporting, official documents, court records, regulator notices, vendor disclosures, and academic publications.
- Structured submissions: organizations or individuals submit details through a form covering timeline, impact, AI system context, and mitigations attempted.
- Confidential channel: for sensitive cases requiring redaction, source protection, or coordinated disclosure.
2. Triage
Each report is assessed for:
- Scope: Is there material AI involvement and a Canada nexus?
- Classification: Is it best treated as an incident or a hazard?
- Verification path: What sources are available, and what verification status is appropriate?
- Sensitivity: Does the report contain security-sensitive or privacy-sensitive details requiring special handling?
- De-duplication: Does this relate to an existing record?
3. Documentation
The editorial team produces a compact, factual record including:
- A short neutral narrative of what happened or what the hazard entails
- Key dates and jurisdiction(s), including the level of government with regulatory authority
- Affected domain and stakeholders (classes of people, not identities)
- AI system context, to the extent it can be responsibly described
- Observed harms or near-harms
- A transparent source list
- Structured taxonomy tags
- A mitigation note: 3-6 controls that would plausibly reduce likelihood or impact, tied to the specific pathway to harm
4. Review
A verification editor evaluates source quality and factual accuracy. A safety reviewer handles redaction decisions and coordinated disclosure for security-sensitive cases. No record is published without both editorial and safety review.
5. Publication
Records are published with a verification label, sources, taxonomy tags, and a version number. Version 1 is archived. All subsequent changes produce new versions with a visible changelog.
Where responsible publication requires delay — for example, while a vulnerability is being addressed — CAIM withholds the record until publication is safe, and publishes high-level defensive guidance in the interim where possible.
6. Corrections and synthesis
Corrections: If a record is materially inaccurate, it is corrected promptly and transparently. If core claims cannot be supported, records may be retracted with an explanation and preserved tombstone metadata. Appeals focus on factual accuracy and responsible publication.
Synthesis: CAIM periodically reviews accumulated records to produce trend briefings and update the mitigation library. This is where case-level documentation becomes institutional learning.
The record format
Every published record has three layers.
Narrative layer
A human-readable, compact account:
- What happened (or almost happened)
- Key dates and jurisdiction(s)
- Who was affected (classes of stakeholders)
- AI system context (what can be responsibly supported)
- Observed harms or near-harms
- What is known vs. alleged vs. uncertain
Evidence layer
Transparent sourcing:
- Source list with dates and type (media, official, court, disclosure, academic)
- Where useful, a claims table mapping specific claims to supporting sources and confidence notes
Structure layer
Taxonomy tags enabling search, filtering, and analysis:
- Domain: finance, health, public services, education, critical infrastructure, elections/information integrity, etc.
- Harm type: fraud/impersonation, privacy/data exposure, discrimination/rights impacts, safety failure, cyber incident, misinformation, operational failure, etc.
- AI involvement type: development flaw, deployment failure, misuse, supply-chain/tooling, human oversight breakdown, monitoring gap, etc.
- Lifecycle phase: design, training, evaluation, deployment, monitoring, incident response
- Severity and reach: ordinal scales calibrated with reference examples (see above); explicit "unknown" permitted
- Jurisdiction level: federal, provincial/territorial, municipal, or multi-level — identifying which level of government has primary regulatory authority over the system or domain
- Canada nexus basis: which nexus criteria are met
Mitigation note
Each record includes 3-6 controls that would plausibly have reduced the likelihood or impact of the event, tied to what went wrong in that specific case.
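Taken together, the three layers and the mitigation note imply a record shape along the following lines. This is an illustrative sketch: every field name is an assumption, and only the layer boundaries come from the description above.

```typescript
// Hypothetical record shape inferred from the three-layer description above.
// Every field name is illustrative; the layer boundaries are the point.
interface CaimRecord {
  // Narrative layer: compact, human-readable account
  narrative: {
    summary: string;                 // what happened (or almost happened)
    keyDates: string[];              // ISO 8601 dates
    jurisdictions: string[];
    affectedStakeholders: string[];  // classes of people, not identities
    observedHarms: string[];
    epistemicNotes: string;          // known vs. alleged vs. uncertain
  };
  // Evidence layer: transparent sourcing
  evidence: {
    sources: Array<{
      url: string;
      date: string;
      type: "media" | "official" | "court" | "disclosure" | "academic";
    }>;
    claimsTable?: Array<{ claim: string; sourceUrls: string[]; confidenceNote: string }>;
  };
  // Structure layer: taxonomy tags. Optional, since a record is publishable
  // with no taxonomy applied (see the data model section below).
  structure?: {
    domain?: string;
    harmType?: string;
    aiInvolvementType?: string;
    lifecyclePhase?: string;
    severity?: string;               // ordinal; explicit "unknown" permitted
    reach?: string;
    jurisdictionLevel?: string;
    nexusBasis?: string[];           // which Canada nexus criteria are met
  };
  // Mitigation note: 3-6 controls tied to the specific pathway to harm
  mitigations: string[];
}
```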
Verification ladder
Records carry a verification status so readers can always assess how certain the information is.
| Status | Meaning |
|---|---|
| Reported | Credible initial reporting; claims not yet independently corroborated |
| Corroborated | Supported by multiple independent credible sources |
| Confirmed | Supported by primary documentation or exceptionally strong corroboration |
| Contested | Credible dispute exists about core claims |
| Retracted | Core claims cannot be supported; withdrawn with explanation |
CAIM distinguishes between what is known, what is alleged, and what remains uncertain. Where information is incomplete, CAIM publishes what is supportable and explicitly marks what is unknown.
Verification of hazard records
For incident records, the verification ladder assesses how well-established the facts of the event are. For hazard records, it assesses how well-documented the precursor condition is:
- Reported: A credible source has identified the risk condition, but it has not been independently examined — e.g., a media report describing a regulatory gap.
- Corroborated: Multiple independent credible sources document the condition — e.g., both a regulatory body and independent researchers have identified the same gap or precursor failure.
- Confirmed: A primary authority has formally documented the condition — e.g., an official investigation, audit, or regulatory finding establishes the precursor condition as fact.
The verification status reflects the strength of evidence for the precursor condition itself, not a prediction of future harm. The risk assessment — how plausible the pathway to harm is and how severe the consequences could be — is a separate editorial judgment, stated transparently in the narrative, grounded in evidence where possible, and revisable as new information emerges. A hazard can be "confirmed" (the underlying condition is well-documented) while the severity of the risk remains uncertain or contested.
Taxonomy
CAIM's taxonomy is designed to be stable, interpretable, and interoperable. Records are coded along the dimensions described above (domain, harm type, AI involvement type, lifecycle phase, severity, jurisdiction level, nexus basis). The taxonomy is published and versioned; changes are documented.
Where feasible, CAIM aligns its fields with international incident-reporting frameworks — particularly the OECD AI Incidents Monitor and the AI Incident Database (AIID) — to support comparability and institutional adoption.
Data model
CAIM's data model separates observations (base-level) from classification (taxonomy layers). A record is publishable with no taxonomy applied. All classification can evolve independently of the underlying observations.
Entity role primitives
Every entity referenced on a record carries one or more role primitives — a small, durable set of organizational relationships:
- Developer — built, trained, or created the AI system
- Deployer — put the system into operational use
- Regulator — investigated, audited, or issued findings
- Affected party — experienced harm or was subject to the system's decisions
- Reporter — disclosed, documented, or reported the incident or hazard
These enable structured queries — "all deployers," "all incidents with regulator involvement" — without relying on taxonomy. Entity pages aggregate records grouped by role.
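A sketch of how the role primitives could support such queries; the record shape, `entitySlug` field, and helper function are hypothetical.

```typescript
// Role primitives from the list above; the record shape and helper are illustrative.
type EntityRole = "developer" | "deployer" | "regulator" | "affected_party" | "reporter";

interface EntityRef {
  entitySlug: string;   // hypothetical identifier linking to an entity page
  roles: EntityRole[];  // an entity may carry more than one role on a record
}

// "All incidents with regulator involvement", answered without any taxonomy.
function withRole<T extends { entities: EntityRef[] }>(records: T[], role: EntityRole): T[] {
  return records.filter(r => r.entities.some(e => e.roles.includes(role)));
}
```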
Response and outcome tracking
Records track the governance feedback loop: what was done in response, by whom, and what resulted. Each response entry includes the actor (linked to an entity), date, action taken, and outcome. On incidents, this tracks investigation, enforcement, policy change, and litigation. On hazards, it tracks governance attention — reports published, consultations launched, legislation introduced.
This makes CAIM useful for policy analysis: which incidents led to regulatory action? What's the response rate? Which sectors have governance gaps?
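A sketch of one response entry and the response-rate question, with hypothetical field names.

```typescript
// One entry in the governance feedback loop, per the fields named above.
// Field names are assumptions.
interface ResponseEntry {
  actorSlug: string;  // linked entity
  date: string;       // ISO 8601 date
  action: string;     // e.g. "investigation opened", "legislation introduced"
  outcome?: string;   // may be absent while the response is still in progress
}

// "What's the response rate?": the share of records with at least one response.
function responseRate(records: Array<{ responses: ResponseEntry[] }>): number {
  if (records.length === 0) return 0;
  return records.filter(r => r.responses.length > 0).length / records.length;
}
```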
Assessment history
Hazards carry a time-ordered history of assessments, rather than a single snapshot. Each assessment records the date, status (active, escalating, mitigated, retired), confidence level, potential severity, potential reach, and evidence summary. The current assessment is always the most recent entry.
This enables temporal analysis: how did this hazard evolve? Which hazards escalated? How quickly are hazards being addressed? Whether a hazard has produced incidents is captured separately through the materialized_from links on incident records — a hazard's assessment status describes the state of the underlying condition, not whether incidents have occurred.
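A sketch of the assessment history shape, assuming hypothetical field names and that entries are stored in chronological order (which the build step validates; see below).

```typescript
// Time-ordered history; the current assessment is always the most recent entry.
// Field names and the confidence levels are assumptions.
interface Assessment {
  date: string;  // ISO 8601 date
  status: "active" | "escalating" | "mitigated" | "retired";
  confidence: "low" | "medium" | "high";
  potentialSeverity: string;
  potentialReach: string;
  evidenceSummary: string;
}

function currentAssessment(history: Assessment[]): Assessment | undefined {
  // Relies on chronological ordering, which the build step validates (see below).
  return history[history.length - 1];
}
```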
One-sided links
All relationships are declared on one side only. When an incident is linked to a hazard, the materialized_from reference is declared on the incident. The build step computes reverse lookups — the hazard page shows its linked incidents without storing them. This eliminates consistency rot as the record corpus grows.
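A sketch of the reverse-lookup computation, assuming a hypothetical `slug` identifier on each record.

```typescript
// Build-time reverse lookup: incidents declare materialized_from; hazard pages
// receive their incident lists computed, never stored. Identifiers are hypothetical.
interface IncidentRecord {
  slug: string;
  materialized_from?: string[]; // slugs of hazards this incident materialized from
}

function incidentsByHazard(incidents: IncidentRecord[]): Map<string, string[]> {
  const index = new Map<string, string[]>();
  for (const incident of incidents) {
    for (const hazardSlug of incident.materialized_from ?? []) {
      const list = index.get(hazardSlug) ?? [];
      list.push(incident.slug);
      index.set(hazardSlug, list);
    }
  }
  return index; // hazard slug -> incidents that materialized from it
}
```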
Build-time integrity
The build step validates the entire record graph: slug references, taxonomy values, bilingual parity, assessment ordering, and relationship consistency. Broken references are build errors. Missing translations are warnings. No record can reference a nonexistent entity, system, or record.
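As an example of one such check, a sketch validating that every materialized_from reference resolves to a known hazard; identifiers and error-message wording are hypothetical.

```typescript
// One validation pass as a sketch: every materialized_from reference must
// resolve to a known hazard slug, and any failure is a build error.
function validateHazardRefs(
  incidents: Array<{ slug: string; materialized_from?: string[] }>,
  knownHazardSlugs: Set<string>,
): string[] {
  const errors: string[] = [];
  for (const incident of incidents) {
    for (const ref of incident.materialized_from ?? []) {
      if (!knownHazardSlugs.has(ref)) {
        errors.push(`${incident.slug}: materialized_from references unknown hazard "${ref}"`);
      }
    }
  }
  return errors; // any entry here fails the build
}
```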
Systemic risk analysis
CAIM's most distinctive analytical contribution is a methodology for connecting deployment-level incident patterns to catastrophic risk trajectories. This is the bridging analysis.
Systemic risk factors
Every record is tagged with zero or more systemic risk factors — structural properties of the failure that are relevant across risk scales. These are the dimensions that connect what happened in a specific Canadian AI deployment to what could happen at higher capability levels:
| Factor | What it reveals |
|---|---|
| Loss of human control | System operated beyond human oversight capacity |
| Unexpected capability | System demonstrated behaviour outside design expectations |
| Resistance to correction | Institutional or technical barriers made correction difficult |
| Opacity | Decision process not interpretable by affected parties or overseers |
| Autonomous scope expansion | System's influence expanded beyond intended boundaries |
| Cascade propagation | Failure triggered failures elsewhere |
| Governance gap | No mechanism existed to prevent, detect, or respond |
| Accountability void | No entity bore clear responsibility |
| Concentration of power | Incident reflected or increased power asymmetry |
| Epistemic degradation | Incident undermined collective ability to assess truth or risk |
The editorial question for each factor is: "Does this record demonstrate this structural property?" — not "Is this the root cause?" Systemic risk factors describe what the record reveals about structural conditions.
Escalation model
Every hazard includes an escalation model — the bridging analysis for that specific hazard:
- Governance dependencies — institutional capacities that must exist to prevent escalation (e.g., "mandatory pre-deployment impact assessment," "independent audit authority")
- Catastrophic bridge — a narrative connecting the hazard to catastrophic risk trajectories: how the same structural properties that cause harm at current capability levels enable catastrophic outcomes at higher levels
- Precursor signals — observable patterns indicating the hazard may be escalating
- Bridge confidence — how strong the connection is (low, medium, high)
Cross-record computations
At build time, CAIM computes aggregate patterns across the entire record corpus:
- Risk factor co-occurrence — which structural failure properties cluster together (e.g., governance_gap + opacity co-occur in the majority of records)
- Governance dependency patterns — which institutional capacities are most frequently absent, ranked by how many domains they span
- Cross-domain patterns — factors appearing in 3+ domains, indicating structural properties that transcend sector-specific governance
- Escalation velocity — how quickly hazards move through status transitions
These computations are exposed through the systemic analysis API endpoint.
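As an illustration, a sketch of the co-occurrence computation over hypothetical record objects; factor identifiers follow the snake_case style used above.

```typescript
// Pairwise co-occurrence of systemic risk factors across the corpus, e.g. how
// often governance_gap and opacity are tagged on the same record. The record
// shape is an assumption.
function factorCooccurrence(records: Array<{ riskFactors: string[] }>): Map<string, number> {
  const counts = new Map<string, number>();
  for (const record of records) {
    const factors = [...new Set(record.riskFactors)].sort();
    for (let i = 0; i < factors.length; i++) {
      for (let j = i + 1; j < factors.length; j++) {
        const key = `${factors[i]}+${factors[j]}`; // e.g. "governance_gap+opacity"
        counts.set(key, (counts.get(key) ?? 0) + 1);
      }
    }
  }
  return counts;
}
```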
Interoperability
OECD alignment
CAIM maintains two classification layers on every record: a CAIM native taxonomy (primary, richer, optimized for Canadian policy users) and an OECD AIM interoperability layer (optional, populated during editorial tagging). The two layers coexist without flattening; neither replaces the other. This follows the pattern used in aviation safety, where national authorities maintain detailed classification systems while mapping to international codes for reporting.
Data exports include an OECD-compatible view that maps CAIM fields to the OECD schema. CAIM's editorial metadata (verification ladder, versioning, redaction flags, bilingual labels) is preserved in an extension namespace.
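A sketch of the export principle only: the OECD-side field names below are placeholders, not the actual OECD schema, and the extension-namespace key is an assumption.

```typescript
// Sketch of the export principle only. The OECD-side field names are
// placeholders, not the real OECD schema, and "ext:caim" is an assumed
// extension-namespace key. The point: map what has a counterpart, and
// preserve CAIM-only editorial metadata instead of dropping it.
interface ExportInput {
  severity: string;
  domain: string;
  verification: string; // CAIM editorial metadata
  version: number;      // CAIM editorial metadata
}

function toOecdCompatibleView(r: ExportInput) {
  return {
    severity: r.severity, // placeholder field mapping
    sector: r.domain,     // placeholder field mapping
    "ext:caim": { verification: r.verification, version: r.version },
  };
}
```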
AIID alignment
CAIM adopts the AIID's conceptual split between incidents (canonical events) and reports (individual source documents). Records include optional AIID cross-reference identifiers where matches exist. CAIM's taxonomy provides a crosswalk to AIID taxonomy sets, with local tags (bilingual labels, Canada nexus, editorial metadata) maintained separately.
API
CAIM provides machine-readable access to the full record corpus, aggregate statistics, systemic risk analysis, taxonomy definitions, and a JSON Feed. All endpoints are static JSON, CORS-enabled, with no authentication required.
For full endpoint documentation, see the API reference.
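A hypothetical usage sketch: the base URL and endpoint path are placeholders, and the real paths are documented in the API reference.

```typescript
// Hypothetical usage sketch: the base URL and endpoint path are placeholders;
// the real paths are documented in the API reference. Static JSON with CORS
// enabled means a plain fetch works from any origin, with no auth headers.
const BASE_URL = "https://example.org/api"; // placeholder

async function fetchRecords(): Promise<unknown> {
  const res = await fetch(`${BASE_URL}/records.json`); // hypothetical endpoint
  if (!res.ok) throw new Error(`HTTP ${res.status}`);
  return res.json();
}
```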
Privacy and responsible publication
CAIM follows strict safeguards:
- Personal data about victims is redacted or excluded
- Cases involving minors receive heightened protection
- Records avoid doxxing, avoid reproducing harmful content unnecessarily, and use victim-centered language
- For security-sensitive cases, CAIM follows coordinated disclosure norms: it prioritizes mitigation and safety, publishes high-level learning and defensive guidance, and withholds enabling details until risk is reduced
- CAIM does not become a platform for harassment or reputational attacks; records are sourced, cautious in language, and focused on what happened and what can be learned