The AI Incident You Have Not Had Yet: Building a Causal Review Process Before You Need It

Every organisation with AI systems in production will, at some point, have an AI incident — an output that causes customer harm, a failure that attracts regulatory attention, a deployment that produces results the governance structure was not designed to catch.

Most organisations respond to that incident by building a review process in the immediate aftermath. The review is typically framed as: “what happened, who was responsible, and what are we doing to prevent it happening again.” The findings go into a remediation plan. The remediation plan goes into a board paper. The board acknowledges it. The governance is documented.

This process is better than nothing. It is substantially worse than building the review process before the incident occurs, for reasons that are structural rather than procedural.


Why post-incident review processes produce weaker governance

A review process built in the aftermath of an incident has three structural limitations that a pre-incident process does not.

Limitation 1: The evidence is contaminated. When an incident occurs and a review is commissioned, the people who were involved in the decisions that led to the incident are aware of the review. Their recollections of what the governance process looked like are inevitably influenced by knowledge of the outcome. The review finds the governance failure, but the picture of the governance as it actually operated — before the incident — is obscured by post-hoc reconstruction.

A pre-existing causal review process has documentation of the governance state before the incident: the approval records, the oversight reports, the escalation logs, the monitoring outputs. When the incident occurs, the evidence base is intact and uncontaminated.

Limitation 2: The review is shaped by the finding. A post-incident review typically begins with the incident and works backwards to find the governance failure. This means the review scope is determined by the visible failure, not by the full range of governance gaps that may exist. A review that finds “the oversight mechanism for Model X was inadequate” typically does not also find “the oversight mechanism for Models Y and Z has the same gap” — because the review scope was defined by the incident, not by a systematic governance assessment.

Limitation 3: The governance fix is reactive, not preventive. A remediation plan built after an incident is, by definition, fixing the governance structure that failed. It is not building the governance structure that prevents the next failure mode, which may be in a different part of the AI deployment architecture and may not be visible until the next incident.


What a pre-incident causal review process looks like

A causal review process built before an incident serves a different function from a post-incident investigation. It is a standing mechanism that operates continuously and surfaces governance gaps before they become incidents.

The four components are:

Component 1: A standing incident register.

Every unexpected AI output above a defined materiality threshold is logged, categorised, and reviewed. Not just the failures that have consequences — all outputs that fall outside the model’s documented expected range. The register is maintained by the AI operations team and reviewed quarterly by the executive AI governance owner. Anything above a higher materiality threshold is escalated to the board in the next reporting cycle.

The governance value of the register is not the incidents it captures. It is the patterns it reveals. A model that produces anomalous outputs at a frequency of one per month may or may not be cause for concern. A model that produces anomalous outputs at increasing frequency over three months is a leading indicator of a governance problem — data drift, model degradation, deployment scope creep — that a standing register will surface before it becomes an incident.
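
As a concrete sketch of what the register can be, the entry schema and trend check below are illustrative: the field names, materiality scale, and strictly-rising test are assumptions, not a prescribed standard.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date

@dataclass
class IncidentEntry:
    logged_on: date
    model_id: str      # which deployment produced the output (illustrative field)
    category: str      # e.g. "out-of-range output", "data drift" (assumed categories)
    materiality: int   # assumed scale: 1 = log only, 2 = executive review, 3 = board
    description: str

def anomalies_rising(entries: list[IncidentEntry], model_id: str, months: int = 3) -> bool:
    """Flag a model whose monthly anomaly count has risen month over month.

    This is the leading-indicator pattern described above; a production
    monitor would smooth for noise rather than require a strict increase.
    """
    by_month = Counter(
        (e.logged_on.year, e.logged_on.month)
        for e in entries
        if e.model_id == model_id
    )
    recent = [by_month[k] for k in sorted(by_month)[-months:]]
    return len(recent) == months and all(a < b for a, b in zip(recent, recent[1:]))
```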

Component 2: A documented causal model for each material AI deployment.

Before a deployment goes to board approval, the executive team documents the causal model: what inputs produce what outputs under what conditions, what the model’s validated operating range is, what categories of failure are known and expected, and what conditions would indicate the deployment is operating outside its validated range.

The causal model serves two purposes. It gives the oversight mechanism something specific to monitor against. And when a review is conducted — before or after an incident — it provides the baseline against which actual outputs can be assessed. Without a documented causal model, a review can only describe what happened. With one, it can determine whether what happened was within the expected range and, if not, what the governance structure should have caught.
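
A minimal sketch of the causal model as structured fields, assuming the deployment's outputs reduce to a scalar score with validated bounds. Real deployments will need richer range definitions, but even this much gives a review something concrete to check against.

```python
from dataclasses import dataclass, field

@dataclass
class CausalModelDoc:
    """The one-to-two-page causal model, captured as structured fields (illustrative)."""
    deployment: str
    inputs: list[str]                      # what inputs drive the outputs
    expected_behaviour: str                # what outputs arise under what conditions
    validated_range: tuple[float, float]   # assumed: output score bounds from validation
    known_failure_modes: list[str] = field(default_factory=list)
    out_of_range_signals: list[str] = field(default_factory=list)

def within_validated_range(doc: CausalModelDoc, score: float) -> bool:
    """The baseline question a review asks: was this output inside the documented range?"""
    low, high = doc.validated_range
    return low <= score <= high
```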

Component 3: A defined escalation path with board visibility.

The escalation path from an AI incident to the board should be documented before the incident occurs: not an ad hoc escalation when something significant happens, but a standing protocol that defines the threshold for board notification, who notifies the board, in what format, and on what timescale.

Boards should not be learning about AI incidents from external sources before they are informed by their own governance structure. This happens more often than it should, and it happens because the escalation path was not defined before the incident.
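
The protocol can be written down as a small rule table. The thresholds, roles, and timescales below are assumptions for illustration; what matters is that they are defined before the incident, not during it.

```python
from dataclasses import dataclass

@dataclass
class EscalationRule:
    min_materiality: int   # threshold at which this rule applies
    who_notifies: str      # who notifies whom
    report_format: str     # in what format
    timescale: str         # on what timescale

# Illustrative standing protocol, ordered highest threshold first.
ESCALATION_PATH = [
    EscalationRule(3, "executive AI governance owner notifies the board chair",
                   "written incident brief", "within 48 hours"),
    EscalationRule(2, "operations lead notifies the executive AI governance owner",
                   "register entry plus one-page summary", "within one business day"),
    EscalationRule(1, "operations team logs to the incident register",
                   "register entry", "same day"),
]

def route(materiality: int) -> EscalationRule:
    """Return the highest rule whose threshold the incident meets."""
    for rule in ESCALATION_PATH:
        if materiality >= rule.min_materiality:
            return rule
    raise ValueError("below the logging threshold; nothing to escalate")
```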

Component 4: A periodic causal governance review.

Every twelve months, the board should receive a causal governance review: a systematic assessment of the AI governance structure against the incidents that have occurred in the period, the near-misses that were caught by the oversight mechanism, and the governance gaps that either produced incidents or are likely to produce incidents if not addressed.

The question the review answers is not “what happened to our AI systems.” It is “what does the pattern of outputs, anomalies, and escalations tell us about the structural gaps in our AI governance.”
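
Mechanically, the review's input is an aggregation over the year's register. A sketch, reusing the illustrative IncidentEntry fields from Component 1: recurring deployment-and-category pairs are the signal that a gap is structural rather than a one-off.

```python
from collections import Counter

def review_patterns(entries) -> list[tuple[str, str, int]]:
    """Count incidents per (deployment, category) pair across the review period.

    Pairs that recur point at structural governance gaps; singletons are
    more likely one-off failures. Field names follow IncidentEntry above.
    """
    counts = Counter((e.model_id, e.category) for e in entries)
    return [(m, c, n) for (m, c), n in counts.most_common() if n > 1]
```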


What this looks like in practice for a mid-sized company

For a mid-sized company with AI deployments in one or two production contexts — a customer-facing recommendation system, an internal process automation tool, an HR or finance decision support system — the pre-incident causal review process does not require dedicated infrastructure.

It requires:

  • A one-page incident register template maintained by the responsible operations team
  • A causal model document for each material deployment, completed before the deployment goes to board approval — one to two pages per deployment
  • A quarterly summary to the executive AI governance owner, with any threshold-exceeding items escalated to the board in the next meeting cycle
  • An annual causal governance review as a standing board agenda item, using the twelve months of incident register data as the input

The total governance overhead is approximately 20-30 hours of staff time per quarter and one hour of board time per quarter. The governance return is a standing early warning system for AI failures, a documented evidence base that would satisfy regulatory review, and an escalation mechanism that means the board is not the last to know.


The Board AI Governance Framework includes a pre-incident causal review protocol — the incident register template, the causal model documentation structure, and the escalation threshold definitions that give the board the governance infrastructure to manage AI risk before incidents occur rather than after them. The AI Readiness Assessment includes a diagnostic for whether this infrastructure exists and is functioning.

For independent advisory support on AI governance infrastructure, contact Steven directly.

Steven Vaile

Board technology advisor and QSECDEF co-founder. Writes on AI governance, quantum security, and commercial strategy for boards and deep tech founders.