The Six Failure Types in Agentic AI and How to Diagnose Them

I was at a company called RiverSoft in the early 2000s. The product was fault isolation software — network management tools that could take a stream of thousands of correlated fault alerts and identify the specific failure, or the specific chain of failures, that had caused a service outage. The principle underlying the product was that failures in complex systems are not random. They are patterned. The same failure types recur with recognisable signatures. If you know the signatures, you can route directly to the cause rather than working through all the possibilities from scratch.

RiverSoft was acquired by Micromuse. Micromuse was acquired by IBM. SMARTS, where I went next, did the same category of work for telecommunications networks. Voyence applied similar principles to military and enterprise network configuration management. Three companies, different domains, the same insight: failures can be classified, and classification accelerates diagnosis.

I have applied the same principle to AI agent failures. Running 24 agents in a production system for two years produces a substantial failure sample. From that sample, six failure types emerge with enough consistency to be worth naming. None of the six are random. Each has a diagnostic signature. Each has a different fix.



Failure Type 1 — Specification gap

What it looks like: The agent produces an output that is internally coherent and confidently presented, but does not address what was actually needed. The agent did not misunderstand the task — it completed a task that was not the task intended.

Diagnostic signature: The output is on-topic but misses a specific dimension of the requirement. Often the output is impressive-looking, which is why specification gap failures are the most likely to slip through without being caught immediately.

Root cause: The specification left an ambiguity that the agent resolved through inference. The inference was plausible but incorrect.

Fix: Revise the specification to close the specific gap. Do not add general guidance — add the specific instruction that would have produced the correct output. Then add a negative example (anti-exemplar) showing what the specification gap looks like when it recurs.
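One way to make this fix concrete is to treat the corrective instruction and its anti-exemplar as a single unit, so neither is added without the other. The sketch below assumes a simple in-memory specification record; the names (Specification, AntiExemplar, close_gap) are illustrative, not part of any published framework.

```python
# Hypothetical sketch: a specification record that pairs each corrective
# instruction with the anti-exemplar of the gap it closes.
from dataclasses import dataclass, field

@dataclass
class AntiExemplar:
    failing_output: str   # what the incorrect output looked like
    why_wrong: str        # the requirement dimension it missed

@dataclass
class Specification:
    task: str
    instructions: list = field(default_factory=list)
    anti_exemplars: list = field(default_factory=list)

    def close_gap(self, instruction: str, exemplar: AntiExemplar) -> None:
        """Add the specific instruction and the negative example together."""
        self.instructions.append(instruction)
        self.anti_exemplars.append(exemplar)

spec = Specification(task="Summarise quarterly sales data")
spec.close_gap(
    "Report figures in the customer's fiscal year, not the calendar year.",
    AntiExemplar(
        failing_output="Q1 (Jan-Mar) revenue grew 4%...",
        why_wrong="Used calendar quarters; the requirement was fiscal quarters.",
    ),
)
```

Pairing the two means the library of anti-exemplars grows in lockstep with the specification, which is what makes recurrence detectable later.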


Failure Type 2 — Boundary violation

What it looks like: The agent takes an action that is outside its defined scope — modifying something it was not supposed to modify, producing output in an area it was not assigned, or making a decision that was supposed to require human approval.

Diagnostic signature: The output is correct (or at least plausible) for the action taken, but the action itself should not have been taken. The agent believed it was acting within its mandate because the mandate was not explicit enough about exclusions.

Root cause: Scope boundaries were defined by what the agent should do, not by what it should explicitly not do. Agents without explicit exclusion boundaries fill the gap with reasonable interpretation.

Fix: Add explicit scope exclusions to the specification. “You do not modify production databases” is not implied by a specification that says “you collect and structure data.” It needs to be stated. For any action that is irreversible or consequential outside the agent’s primary scope, add an explicit approval gate.
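An approval gate can be enforced mechanically rather than left to the agent's judgement. The sketch below, with hypothetical names (require_approval, approved_actions), blocks a consequential action until a human has signed it off:

```python
# Hypothetical sketch of an approval gate: an irreversible action outside
# the agent's primary scope raises an error unless a human has approved it.
from functools import wraps

approved_actions = set()   # populated by a human reviewer, not by the agent

class ApprovalRequired(Exception):
    pass

def require_approval(action_id: str):
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            if action_id not in approved_actions:
                raise ApprovalRequired(f"{action_id} needs human sign-off")
            return fn(*args, **kwargs)
        return wrapper
    return decorator

@require_approval("modify_production_db")
def apply_migration(sql: str) -> str:
    # only reachable after explicit approval
    return f"applied: {sql}"
```

The point of the pattern is that the exclusion lives in code, not in the agent's interpretation of its mandate.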


Failure Type 3 — Interface mismatch

What it looks like: The upstream agent produces a correct output in its own terms, but the downstream agent cannot use it. The pipeline stalls, produces an error, or the downstream agent makes assumptions about the upstream output that produce incorrect results.

Diagnostic signature: The failure is at the handoff point. Both agents, inspected individually, appear to be functioning correctly.

Root cause: The interface between the two agents was underspecified. The upstream agent decided on an output format that made sense from its perspective. The downstream agent expected a different format.

Fix: This is the only failure type that requires fixing two specifications simultaneously. Define the interface explicitly — exact fields, exact format, exact required values — and update both specifications to reference it. Do not fix one end and leave the other to adapt.
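A shared interface definition that both specifications reference can be checked at the handoff point itself. A minimal sketch, with hypothetical field names:

```python
# Hypothetical sketch: one interface definition, validated at the handoff.
# Both agent specifications reference this single source of truth.
INTERFACE = {
    "company_name": str,
    "revenue_gbp": float,
    "status": str,
}
ALLOWED_STATUS = {"verified", "unverified"}

def validate_handoff(payload: dict) -> list:
    """Return a list of violations; an empty list means the handoff conforms."""
    errors = []
    for field_name, field_type in INTERFACE.items():
        if field_name not in payload:
            errors.append(f"missing field: {field_name}")
        elif not isinstance(payload[field_name], field_type):
            errors.append(f"wrong type for {field_name}")
    if payload.get("status") not in ALLOWED_STATUS:
        errors.append("status not in allowed values")
    return errors
```

Because both ends validate against the same definition, fixing the interface fixes both specifications at once rather than leaving one end to adapt.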


Failure Type 4 — Hallucination in context

What it looks like: The agent introduces a specific claim, number, name, or fact that was not in its provided data and does not exist in verifiable reality. The claim is plausible and consistent with the surrounding content, which is why it survives initial review.

Diagnostic signature: A specific verifiable fact in the output cannot be traced back to the agent’s input data or any documented source. The fact is not obviously wrong — it is the kind of thing that could be true. This is what makes context hallucinations particularly dangerous compared to obvious nonsense.

Root cause: The agent was asked to produce content that required specific facts, and the data provided did not supply all the facts the agent inferred were needed. The agent filled the gap from its training data rather than flagging the gap.

Fix: Two parts. First, the specification must explicitly instruct the agent to mark any claim it cannot source to provided data as [NEEDS DATA] rather than inferring. Second, the upstream data collection process must be audited to identify why the gap existed. Hallucinations in production systems are usually data gap problems masquerading as model problems.
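The first part of the fix can be backstopped with a post-run audit: every factual claim carries a source tag, and any claim whose tag cannot be traced to the provided data is rewritten as [NEEDS DATA] rather than passed through. The tag format below is an illustrative assumption:

```python
# Hypothetical sketch: audit each (claim, source_id) pair against the data
# actually provided to the agent; unsourced claims are flagged, not trusted.
from typing import Optional, List, Tuple

def audit_claims(claims: List[Tuple[str, Optional[str]]],
                 provided_sources: set) -> List[str]:
    audited = []
    for text, source_id in claims:
        if source_id in provided_sources:
            audited.append(text)
        else:
            # the gap is surfaced instead of being filled from training data
            audited.append(f"[NEEDS DATA] {text}")
    return audited

provided = {"crm_export_2024", "filings_q3"}
claims = [
    ("Revenue was £4.2m in FY2023.", "filings_q3"),
    ("Headcount doubled last year.", None),  # unsourced inference
]
```

The flagged claims then become the input to the second part of the fix: each [NEEDS DATA] marker points at a specific gap in the upstream collection process.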


Failure Type 5 — Scope creep

What it looks like: The agent produces the required output and then continues to produce additional work that was not requested. The additional work may be useful. It was not authorised. And in a system with downstream agents, unrequested additional output creates downstream interference.

Diagnostic signature: The output contains everything it should contain, plus additional content, decisions, or recommendations the agent generated autonomously. Often presented as “additionally, I have also…”

Root cause: The agent’s specification was written entirely in terms of what it should do, without explicit instruction to stop at the boundary of its defined task. Agents tend to be helpful. Helpfulness without scope limits is a failure mode in production systems.

Fix: The specification should explicitly define both what the output must contain (completeness requirement) and that it must contain nothing beyond it (scope limit). In practice: “Your output is [X]. Do not include recommendations, additional analysis, or work beyond [X]. If you identify something that seems relevant beyond your scope, flag it to the orchestrator rather than acting on it.”
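The scope limit can also be enforced at output time, so that helpful-but-unrequested content is routed to the orchestrator instead of reaching downstream agents. A minimal sketch, assuming outputs are keyed by section name:

```python
# Hypothetical sketch: deliver only the authorised sections; anything
# beyond them is stripped from the output and flagged to the orchestrator.
AUTHORISED_SECTIONS = {"summary", "structured_data"}

def enforce_scope(output: dict):
    delivered = {k: v for k, v in output.items() if k in AUTHORISED_SECTIONS}
    flagged = [k for k in output if k not in AUTHORISED_SECTIONS]
    return delivered, flagged
```

Nothing useful is lost: the flagged sections survive for human review, but they no longer create downstream interference.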


Failure Type 6 — Memory failure

What it looks like: The agent repeats a known error that was previously identified and documented. The failure pattern is in the anti-exemplar library. The agent did not read it, or read it and did not apply it.

Diagnostic signature: The failure is not new. The same failure pattern was logged in a previous run and a fix was applied to the specification, yet the agent is still producing the same failure.

Root cause: Either the specification update was insufficient (the fix did not close the actual gap), or the memory injection mechanism failed (the agent did not read its memory file or anti-exemplar library before running), or the fix was applied to the wrong specification (the failure recurred in a different context than the one that was fixed).

Fix: This is a meta-failure — a failure in the training loop rather than in the agent directly. Diagnose which of the three causes applies before changing anything. If the memory injection mechanism failed, fix the mechanism. If the specification fix was insufficient, revise it. If the fix was applied to the wrong scope, generalise it. Then verify by running the exact input that produced the original failure.
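The final verification step can be a small regression harness: replay the exact input that produced the original failure and confirm the documented failure signature no longer appears. The names below (FailureCase, verify_fix, run_agent) are hypothetical stand-ins for whatever the real training loop uses:

```python
# Hypothetical sketch: replay the original failing input and check that
# the documented failure signature is gone from the new output.
from dataclasses import dataclass
from typing import Callable

@dataclass
class FailureCase:
    case_id: str
    original_input: str
    failure_signature: str   # the substring that identified the bad output

def verify_fix(case: FailureCase, run_agent: Callable[[str], str]) -> bool:
    """True if the known failure signature no longer appears."""
    output = run_agent(case.original_input)
    return case.failure_signature not in output
```

Keeping the replayed cases permanently turns every diagnosed failure into a standing regression test, which is what catches the memory-failure type early next time.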


Using the classification in practice

The value of the six-type classification is not that it tells you what to fix. It tells you where to look.

When a failure occurs in the WAT system, the first diagnostic question is: which type is this? Not “what went wrong in detail” — that question is slower to answer and produces a longer path to the fix. Type identification routes directly to the specification component that is likely at fault.
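The routing step can be written down as a literal table. The sketch below restates the six fixes from this article as a mapping from failure type to the component to inspect first; the key names are illustrative:

```python
# Hypothetical sketch: type identification routes straight to the
# specification component that is likely at fault.
ROUTE = {
    "specification_gap": "the task specification: close the ambiguity, add an anti-exemplar",
    "boundary_violation": "scope exclusions and approval gates",
    "interface_mismatch": "the shared interface definition: fix both ends",
    "hallucination_in_context": "sourcing rules and the upstream data collection process",
    "scope_creep": "the completeness requirement and scope limit",
    "memory_failure": "the training loop: memory injection, fix scope, or fix sufficiency",
}

def diagnose(failure_type: str) -> str:
    return ROUTE.get(failure_type, "unclassified: root cause not yet found")
```

The default case is deliberate: a failure that cannot be classified is a failure whose root cause has not yet been found.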

The classification also prevents the most common debugging error: fixing the symptom instead of the type. A hallucination that appears to be a model capability failure (Type 4) often turns out, on investigation, to be a data gap in the upstream collection process. Fixing the model specification would have been the wrong fix. Fixing the data collection process is the right one. The classification tells you which process to investigate.

For boards evaluating AI systems in their organisations: when an AI failure reaches your attention, ask which failure type it is. If the team presenting the postmortem cannot classify the failure, they have likely not found the root cause yet. A remediation plan that does not know which type of failure it is remediating may fix the symptom whilst leaving the structural cause intact.


The Board AI Governance Framework includes a governance review structure for AI incidents — the questions boards should ask after an AI failure to determine whether the root cause has been correctly identified and whether the remediation addresses the structural gap rather than the presenting symptom.

For boards or organisations reviewing AI agent failures or designing post-incident review processes, contact Steven directly.

Steven Vaile


Board technology advisor and QSECDEF co-founder. Writes on AI governance, quantum security, and commercial strategy for boards and deep tech founders.