guide · updated apr 2026 · 9 min read

ai failure handling for production agents

ai failure handling is the practice of catching runtime failures in ai agents (bad llm tool calls, hallucinations, runaway loops, drift, and destructive shell commands) at the harness hook layer, before a failure becomes incorrect output or irreversible damage. failproof ai ships 39 built-in policies that cover all five categories across claude code, codex, gemini cli, github copilot, picode, and opencode.

What counts as an “ai failure”?

Most production teams talk about ai agent failure in two registers: the obvious one (the agent crashed, the model returned a 5xx) and the one that actually costs money (the agent didn't crash - it confidently did the wrong thing). The second is the interesting one. Five concrete categories cover ~95% of what we see in the wild:

  • bad llm tool calls. The model invents a file path, a function, or a tool that doesn't exist, and the harness happily attempts the call.
  • hallucination. The model produces a plausible fact, citation, or piece of code that is wrong, and the agent acts on it.
  • context drift. The agent wanders off the original goal, often after a long-running plan.
  • runaway loops. The agent gets stuck repeating itself, or retries the same failing call indefinitely.
  • destructive shell. The agent decides rm -rf, git push --force, terraform destroy, or a DROP TABLE is the right next step.

Each of these is a different failure shape with a different mitigation. Lumping them under “retry on error” is the thing that breaks agents in production.

Why retries aren't enough

Retry-on-error is the playbook software engineers brought over from the request/response era. It works when the failure is transient: the network blipped, the upstream service was slow, the next attempt succeeds. But most ai agent failures are cognitive, not transient. Retrying a bad tool call gives you the same bad tool call. Retrying a hallucinated answer gives you a confidently rephrased hallucination. The agent doesn't need another attempt - it needs a different one.
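As a minimal sketch of that distinction, the wrapper below retries only failures that might succeed unchanged and surfaces everything else right away. TransientError, CognitiveError, and run_with_retry are hypothetical names used for illustration; they are not part of failproof ai or any harness.

```python
class TransientError(Exception):
    """Hypothetical: a network blip or slow upstream; the same call may succeed later."""


class CognitiveError(Exception):
    """Hypothetical: the agent asked for something that cannot succeed as written."""


def run_with_retry(call, max_attempts=3):
    """Retry only transient failures; surface cognitive ones immediately."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error = None
    for _ in range(max_attempts):
        try:
            return call()
        except TransientError as exc:
            last_error = exc  # an identical attempt may still succeed
        except CognitiveError:
            raise  # an identical attempt would fail identically
    raise last_error
```

The point of the split is that a cognitive failure leaves the loop after one attempt, so whatever handles it can change the request instead of repeating it.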

The hook is the missing primitive

Every major coding-agent harness - Claude Code, Codex, Gemini CLI, Copilot, picode, opencode - already exposes hooks for pre-tool and post-tool calls, prompts, and stop conditions. Hooks are how the harness lets you observe and modify the agent's next move without touching the agent.

Most teams ignore them. They write longer system prompts instead, hoping the model will behave. That's praying in prompts. Hooks are enforcement. The same way git pre-commit hooks let you block bad commits without asking the author to please be careful, harness hooks let you block bad tool calls without asking the model to please not hallucinate.
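To make the enforcement idea concrete, here is a rough sketch of a pre-tool check. The event shape (a dict with "tool" and "command" keys) and the returned decision dict are assumptions made for this example; each harness defines its own hook payload, and the patterns are illustrative rather than failproof ai's actual policies.

```python
import re

# Illustrative patterns based on the destructive-shell examples in this guide.
DESTRUCTIVE_PATTERNS = {
    "recursive force delete": re.compile(
        r"\brm\s+-[a-zA-Z]*r[a-zA-Z]*f|\brm\s+-[a-zA-Z]*f[a-zA-Z]*r"
    ),
    "force push": re.compile(r"\bgit\s+push\b.*--force"),
    "terraform destroy": re.compile(r"\bterraform\s+destroy\b"),
    "drop table": re.compile(r"\bdrop\s+table\b", re.IGNORECASE),
}


def pre_tool_hook(event: dict) -> dict:
    """Decide whether a pending tool call may run (hypothetical event shape)."""
    if event.get("tool") != "shell":
        return {"decision": "allow"}
    command = event.get("command", "")
    for name, pattern in DESTRUCTIVE_PATTERNS.items():
        if pattern.search(command):
            # The reason is what the agent would see instead of the command's output.
            return {"decision": "block", "reason": f"blocked: {name}"}
    return {"decision": "allow"}
```

Under these assumptions, pre_tool_hook({"tool": "shell", "command": "rm -rf build/"}) returns a block decision, while a call such as ls -la passes through. The check runs outside the model, so it holds regardless of what the system prompt says.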

How failproof ai does it

failproof ai is a local process that subscribes to the harness hooks your agent already calls. When a hook fires, failproof runs the trace through 39 built-in policies. Each policy detects one of the failure categories above and applies the right mitigation inline:

  • bad tool call → block the invocation, return a short structured response telling the agent what does exist nearby.
  • hallucination → flag the trace, surface the specific fact that's wrong, ask the agent to recheck.
  • drift → inject a short reminder of the original goal back into the context window.
  • loop → detect the repetition, summarize what the agent has tried, restart from a clean state.
  • destructive shell → block, page a human, log everything.
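Two of the mappings above can be sketched as small policy functions. This is not failproof ai's implementation; the function names, the (tool, args) history format, and the window and threshold values are all assumptions chosen for illustration.

```python
import difflib
from collections import Counter
from typing import Optional


def check_bad_tool_call(tool: str, known_tools: set[str]) -> Optional[dict]:
    """Block calls to tools that don't exist and suggest close matches."""
    if tool in known_tools:
        return None
    nearby = difflib.get_close_matches(tool, sorted(known_tools), n=3)
    return {"category": "bad tool call", "action": "block", "nearby": nearby}


def check_loop(
    history: list[tuple[str, str]], window: int = 6, threshold: int = 3
) -> Optional[dict]:
    """Flag a (tool, args) pair repeated `threshold` times in the last `window` calls."""
    recent = history[-window:]
    if not recent:
        return None
    (tool, args), count = Counter(recent).most_common(1)[0]
    if count < threshold:
        return None
    summary = f"{tool} called {count} times with the same arguments"
    return {"category": "loop", "action": "restart", "summary": summary}


def run_policies(
    tool: str, args: str, history: list[tuple[str, str]], known_tools: set[str]
) -> dict:
    """Return the first policy finding for the pending call, or allow it."""
    finding = check_bad_tool_call(tool, known_tools) or check_loop(
        history + [(tool, args)]
    )
    return finding or {"action": "allow"}
```

Each policy returns either nothing or a structured finding, so the response handed back to the agent can say what went wrong and what to try instead, rather than simply failing.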

failproof never sits in the request path. It piggybacks on the hooks the harness already calls, runs in a separate local process, and adds zero latency to the agent loop. Your agent's reasoning never leaves your machine.

Get started

failproof ai is free and open-source. Install the cli and enable the built-in policies; the dashboard then runs at localhost:8020.
