━━ failproof ai journal · vol. 01
blog
field writing from the team behind failproof - on agent failure, durability, and the new shape of production AI.
the agent didn't fail. it was just told too much, too soon.
Why progressive disclosure is the most underrated concept in agent reliability, and why hooks are the primitive that finally makes it possible.
━━ archive · all posts
- №12
the competitor vertical ai founders refuse to name
A wake-up call for vertical AI companies. The reliability gap is the unnamed competitor - and it's eating market share quietly.
- №11
stop praying in prompts. start enforcing with hooks.
Git figured this out decades ago and agents are about to go further. The enforcement gap is where reliability dies.
- №10
the future of reliable orchestration
What does infrastructure designed from the ground up for AI agents actually look like? Pre-warmed sessions, direct context streaming, stateful routing, sandboxed isolation, and a unified control plane.
- №09
retry is not enough: the 3u framework for agentic reliability
In traditional software, retries fix transient errors. In AI agents, failures are cognitive. Observability after the fact doesn't help when every failure is unique. The 3U framework introduces a new model for reliability: Uncover, Understand, Utilize.
- №08
why temporal breaks when you put an ai agent inside it
Temporal is the industry standard for durable execution. We tried using it as the durability layer for Claude Agents. Here are the 12 specific ways it falls apart.
- №07
what does durability mean for agentic software?
Durability frameworks were built for deterministic workflows. Agents are probabilistic, stateful, and long-running. Here's why the old playbook breaks and what should replace it.
- №06
observability was built for servers. agents need oversight.
The industry has spent 18 months building increasingly sophisticated observability tooling for agents: LLM-native traces, session replays, eval pipelines. It's still not enough.
- №05
two env vars and done: claude code for every developer, zero api keys
An open-source tool that lets every developer use every available Anthropic model and credit across clouds - without juggling API keys.
- №04
stop shipping code. start building factories.
The next generation of software companies will not be defined by the code they write, but by the automated assembly lines they build.
- №03
missing sauce for your agents
A thought piece on engineering culture, agentic systems, and what separates great products from everything else.
- №02
the future is saaas (subagent as a service)
A thought piece on how the most important companies of the next decade won't build software applications - they will become subagents.
- №01
launching the aire principles: industry standards for ai agent reliability
The first open framework translating SRE practices into actionable standards for production AI systems. Five principles, measurable metrics, and a proven path to agent reliability.
━━ guides
- g·1
ai failure handling for production agents
the five failure categories, why retries don't cut it, and how harness hooks become the enforcement primitive.
- g·2
llm agent reliability - closing the production gap
the reliability gap between demo and production, the 3u framework, and hook-level enforcement across every supported harness.
- g·3
agent error recovery - detect, mitigate, continue
the retry / repair / block taxonomy and the recovery strategy mapped to every failure mode failproof catches.
- g·4
how to stop claude code from running dangerous commands
install failproof, register the PreToolUse hooks, and the destructive command classes - rm -rf, sudo, curl | sh, force push, terraform destroy - stop at the hook layer.
- g·5
claude code safety hooks - complete setup guide
PreToolUse, PostToolUse, Stop. the 39 built-in policies grouped by category. three configuration scopes. custom JS policies.
- g·6
how to block rm -rf in claude code
the block-rm-rf policy in detail - every recursive flag shape it catches, the safe-path allowlist, and what the agent sees when a delete is denied.
- g·7
how to prevent claude code from accessing .env files
two PreToolUse blockers plus five PostToolUse sanitizers - keep secrets out of the agent context window, even when the agent legitimately reads a related file.
- g·8
how to prevent ai agent force push
block-force-push, block-push-master, block-work-on-main. one engine, every harness - claude code, codex, cursor, gemini cli, copilot, picode, opencode.
- g·9
how to sandbox claude code terminal access
policy sandbox without a container or vm - PreToolUse confinement, PostToolUse sanitization, Stop gates. stack a container on top only when you need a kernel boundary too.