━━ failproof ai journal · vol. 01

blog

field writing from the team behind failproof - on agent failure, durability, and the new shape of production AI.

● 13 posts ● updated apr 2026

latest · №13

Apr 25, 2026 · 10 min read · Nivedit Jain

the agent didn't fail. it was just told too much, too soon.

Why progressive disclosure is the most underrated concept in agent reliability, and why hooks are the primitive that finally makes it possible.


━━ archive · all posts

№12 · Apr 23, 2026 · 8 min read
the competitor vertical ai founders refuse to name
A wake-up call for vertical AI companies. The reliability gap is the unnamed competitor - and it's eating market share quietly.
№11 · Apr 13, 2026 · 9 min read
stop praying in prompts. start enforcing with hooks.
Git figured this out decades ago and agents are about to go further. The enforcement gap is where reliability dies.
№10 · Apr 06, 2026 · 7 min read
the future of reliable orchestration
What does infrastructure designed from the ground up for AI agents actually look like? Pre-warmed sessions, direct context streaming, stateful routing, sandboxed isolation, and a unified control plane.
№09 · Apr 05, 2026 · 8 min read
retry is not enough: the 3u framework for agentic reliability
In traditional software, retries fix transient errors. In AI agents, failures are cognitive. Observability after the fact doesn't help when every failure is unique. The 3U framework introduces a new model for reliability: Uncover, Understand, Utilize.
№08 · Apr 03, 2026 · 14 min read
why temporal breaks when you put an ai agent inside it
Temporal is the industry standard for durable execution. We tried using it as the durability layer for Claude Agents. Here are the 12 specific ways it falls apart.
№07 · Apr 01, 2026 · 6 min read
what does durability mean for agentic software?
Durability frameworks were built for deterministic workflows. Agents are probabilistic, stateful, and long-running. Here's why the old playbook breaks and what should replace it.
№06 · Mar 27, 2026 · 11 min read
observability was built for servers. agents need oversight.
The industry has spent 18 months building increasingly sophisticated observability tooling for agents: LLM-native traces, session replays, eval pipelines. It's still not enough.
№05 · Mar 23, 2026 · 8 min read
two env vars and done: claude code for every developer, zero api keys
Open-source tooling that lets every developer use any available Anthropic model and credits across clouds, without juggling API keys.
№04 · Mar 13, 2026 · 25 min read
stop shipping code. start building factories.
The next generation of software companies will not be defined by the code they write, but by the automated assembly lines they build.
№03 · Mar 11, 2026 · 12 min read
missing sauce for your agents
A thought piece on engineering culture, agentic systems, and what separates great products from everything else.
№02 · Mar 06, 2026 · 22 min read
the future is saaas (subagent as a service)
A thought piece on how the most important companies of the next decade won't build software applications - they will become subagents.
№01 · Feb 15, 2026 · 2 min read
launching the aire principles: industry standards for ai agent reliability
The first open framework translating SRE practices into actionable standards for production AI systems. Five principles, measurable metrics, and a proven path to agent reliability.

━━ guides

g·1 · guide · 9 min
ai failure handling for production agents
the five failure categories, why retries don't cut it, and how harness hooks become the enforcement primitive.
g·2 · guide · 10 min
llm agent reliability - closing the production gap
the reliability gap between demo and production, the 3u framework, and hook-level enforcement across every supported harness.
g·3 · guide · 8 min
agent error recovery - detect, mitigate, continue
the retry / repair / block taxonomy and the recovery strategy mapped to every failure mode failproof catches.
g·4 · how-to · 8 min
how to stop claude code from running dangerous commands
install failproof and register the PreToolUse hooks, and the destructive command classes (rm -rf, sudo, curl | sh, force push, terraform destroy) stop at the hook layer.
g·5 · how-to · 10 min
claude code safety hooks - complete setup guide
PreToolUse, PostToolUse, Stop. the 39 built-in policies grouped by category. three configuration scopes. custom JS policies.
g·6 · how-to · 6 min
how to block rm -rf in claude code
the block-rm-rf policy in detail - every recursive flag shape it catches, the safe-path allowlist, and what the agent sees when a delete is denied.
g·7 · how-to · 7 min
how to prevent claude code from accessing .env files
two PreToolUse blockers plus five PostToolUse sanitizers - keep secrets out of the agent context window, even when the agent legitimately reads a related file.
g·8 · how-to · 7 min
how to prevent ai agent force push
block-force-push, block-push-master, block-work-on-main. one engine, every harness - claude code, codex, cursor, gemini cli, copilot, picode, opencode.
g·9 · how-to · 9 min
how to sandbox claude code terminal access
policy sandbox without a container or vm - PreToolUse confinement, PostToolUse sanitization, Stop gates. stack a container on top only when you need a kernel boundary too.