What counts as an “ai failure”?
Most production teams talk about ai agent failure in two registers: the obvious one (the agent crashed, the model returned a 5xx) and the one that actually costs money (the agent didn't crash - it confidently did the wrong thing). The second is the interesting one. Five concrete categories cover ~95% of what we see in the wild:
- bad llm tool calls. The model invents a file path, a function, or a tool that doesn't exist, and the harness happily attempts the call.
- hallucination. The model produces a plausible fact, citation, or piece of code that is wrong, and the agent acts on it.
- context drift. The agent wanders off the original goal, often after a long-running plan.
- runaway loops. The agent gets stuck repeating itself, or retries the same failing call indefinitely.
- destructive shell. The agent decides rm -rf, git push --force, terraform destroy, or a DROP TABLE is the right next step.
Each of these is a different failure shape with a different mitigation. Lumping them under “retry on error” is the thing that breaks agents in production.
Why retries aren't enough
Retry-on-error is the playbook software engineers brought over from the request/response era. It works when the failure is transient: the network blipped, the upstream service was slow, the next attempt succeeds. But most ai agent failures are cognitive, not transient. Retrying a bad tool call gives you the same bad tool call. Retrying a hallucinated answer gives you a confidently rephrased hallucination. The agent doesn't need another attempt - it needs a different one.
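The difference between "another attempt" and "a different attempt" can be made concrete with a small sketch. Every name here (TransientError, CognitiveError, run_step, revise, read_file) is hypothetical and not part of any harness or of failproof ai. The only point is that a transient failure earns a retry of the same call, while a cognitive failure should change the request before the next attempt.

```python
class TransientError(Exception):
    """Failure likely to clear on its own (timeout, 5xx)."""


class CognitiveError(Exception):
    """The agent asked for the wrong thing; repeating it cannot help."""


def run_step(call, args, revise, max_attempts=3):
    """Run call(args), retrying transient errors and revising on cognitive ones.

    `revise(args, error)` is a hypothetical callback that returns new args,
    for example by feeding the error message back to the model.
    """
    for _ in range(max_attempts):
        try:
            return call(args)
        except TransientError:
            continue  # same args: the environment may have recovered
        except CognitiveError as err:
            args = revise(args, err)  # different args: the request must change
    raise RuntimeError(f"gave up after {max_attempts} attempts")


def read_file(path):
    """Toy tool: only one file exists, so an invented path always fails."""
    files = {"src/app.py": "print('hi')"}
    if path not in files:
        raise CognitiveError(f"no such file: {path}")
    return files[path]


# A revise step that corrects the path succeeds on the second attempt.
result = run_step(read_file, "src/main.py", revise=lambda args, err: "src/app.py")
```

If revise returns the args unchanged, which is what a blind retry amounts to, run_step fails identically on every attempt and then gives up.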
The hook is the missing primitive
Every major coding-agent harness - Claude Code, Codex, Gemini CLI, Copilot, picode, opencode - already exposes hooks for pre-tool and post-tool calls, prompts, and stop conditions. Hooks are how the harness lets you observe and modify the agent's next move without touching the agent.
Most teams ignore them. They write longer system prompts instead, hoping the model will behave. That's praying in prompts. Hooks are enforcement. The same way git pre-commit hooks let you block bad commits without asking the author to please be careful, harness hooks let you block bad tool calls without asking the model to please not hallucinate.
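As a rough illustration of enforcement rather than prompting, here is a minimal sketch of a pre-tool hook that refuses destructive shell commands. It assumes the harness passes the pending call as JSON on stdin with tool_name and tool_input.command fields, and treats exit code 2 as "block the call and return stderr to the agent". That shape is modeled on Claude Code's hook convention as I understand it; other harnesses differ, so check your harness documentation before relying on it. The patterns are illustrative, not a complete policy.

```python
import json
import re
import sys

# Illustrative patterns only; a real policy would parse the shell command properly.
DESTRUCTIVE = [
    re.compile(r"\brm\s+-(?=\w*r)(?=\w*f)\w+", re.IGNORECASE),  # rm -rf, rm -fr
    re.compile(r"\bgit\s+push\b.*--force"),
    re.compile(r"\bterraform\s+destroy\b"),
    re.compile(r"\bdrop\s+table\b", re.IGNORECASE),
]


def decide(event: dict) -> tuple[int, str]:
    """Return (exit_code, message) for a pre-tool hook event.

    Assumed event shape: {"tool_name": "Bash", "tool_input": {"command": "..."}}.
    """
    if event.get("tool_name") != "Bash":
        return 0, ""
    command = event.get("tool_input", {}).get("command", "")
    for pattern in DESTRUCTIVE:
        if pattern.search(command):
            return 2, f"Blocked destructive command: {command!r}"
    return 0, ""


def main() -> int:
    """Read one hook event from stdin and report the decision."""
    code, message = decide(json.load(sys.stdin))
    if message:
        print(message, file=sys.stderr)
    return code

# As a hook script, finish with: sys.exit(main())
```

The decision lives in a pure function so it can be tested without a harness. Nothing in the system prompt has to change for the block to take effect.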
How failproof ai does it
failproof ai is a local process that subscribes to the harness hooks your agent already calls. When a hook fires, failproof runs the trace through 39 built-in policies. Each policy detects one of the failure categories above and applies the right mitigation inline:
- bad tool call → block the invocation, return a short structured response telling the agent what does exist nearby.
- hallucination → flag the trace, surface the specific fact that's wrong, ask the agent to recheck.
- drift → inject a short reminder of the original goal back into the context window.
- loop → detect the repetition, summarize what the agent has tried, restart from a clean state.
- destructive shell → block, page a human, log everything.
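To give a feel for what two of these mitigations might look like, here is a hedged sketch of a bad-tool-call check and a loop detector. This is not failproof ai's actual policy code or API; check_tool_call, LoopDetector, and the response fields are hypothetical names chosen for illustration, and the thresholds are arbitrary.

```python
import difflib
from collections import Counter, deque


def check_tool_call(name: str, known_tools: list[str]) -> dict:
    """Block calls to tools that do not exist and suggest near matches."""
    if name in known_tools:
        return {"allow": True}
    nearby = difflib.get_close_matches(name, known_tools, n=3, cutoff=0.6)
    return {"allow": False, "reason": f"unknown tool {name!r}", "did_you_mean": nearby}


class LoopDetector:
    """Flag an identical (tool, args) call repeated too often in a recent window."""

    def __init__(self, window: int = 10, max_repeats: int = 3):
        self.recent = deque(maxlen=window)  # only the last `window` calls count
        self.max_repeats = max_repeats

    def observe(self, tool: str, args: str) -> bool:
        """Record a call; return True once it has repeated max_repeats times."""
        self.recent.append((tool, args))
        return Counter(self.recent)[(tool, args)] >= self.max_repeats


# A misspelled tool name is blocked, with the closest real tool offered back.
verdict = check_tool_call("read_fle", ["read_file", "write_file", "run_tests"])

# The third identical call inside the window trips the detector.
detector = LoopDetector()
flags = [detector.observe("run_tests", "--all") for _ in range(3)]
```

Returning the near matches matters more than the block itself: the agent gets something true to act on next, instead of an error it can only retry.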
failproof never sits in the request path. It piggybacks on the hooks the harness already calls, runs in a separate local process, and adds zero latency to the agent loop. Your agent's reasoning never leaves your machine.
Get started
failproof ai is free and open-source. Install the cli, enable the built-in policies, and the dashboard runs at localhost:8020.
Related reading
- llm agent reliability - closing the production gap
- agent error recovery - detect, mitigate, continue
- retry is not enough: the 3u framework
- stop praying in prompts. start enforcing with hooks.
- the agent didn't fail - it was told too much, too soon
- claude code safety hooks setup
- stop claude code from running dangerous commands
- sandbox claude code terminal access