Why does my AI agent keep going after the task is done?

Because nothing told it to stop. The model has no reliable sense of "done" - it will keep finding plausible next steps as long as it has a turn, polishing, refactoring, and second-guessing work that was already correct. Completion is a judgment about intent, and the agent optimizing from inside the run is the worst-placed thing to make it.

How do I make an agent stop when the task is complete?

Define done outside the agent and enforce it. failproof checks, at the harness Stop and PreToolUse hooks, whether the original intent has been satisfied - the test passes, the file was written, the change matches the request. Once it is met, the policy ends the run and surfaces the result instead of letting the agent take another action that can only add risk.

What is the risk of an agent that overruns its task?

Every action after done is pure downside. The work was already correct, so more edits can only regress it, more tokens only cost money, and more tool calls only widen the blast radius - a needless refactor, an unwanted commit, a destructive command it talked itself into. Stopping at completion caps the risk at exactly the point where value stopped accruing.

Agent Won't Stop After Finishing - How to End the Run

The failure that only costs you after you've won

The agent did the job. The bug is fixed, the test passes, the function is written. And then it keeps working - tidying imports it was not asked to touch, refactoring a neighbouring file, second-guessing the fix it already landed, committing things you didn't want committed. The task was complete three turns ago; everything since has been the agent finding reasons to keep having a turn.

The root cause is that a model has no reliable internal notion of “done.” Done is a judgment about whether the intent behind the request has been satisfied, and the agent, reasoning from inside the run and rewarded for taking actions, is the worst-placed thing to make that call. So unless something outside the run declares completion, it keeps going.

Why overrun is pure downside

Before the task is done, an action might help. After it's done, an action can only hurt:

Correctness. The change was right; every further edit is a chance to regress it.
Cost. Every extra turn is tokens and wall-clock spent producing nothing you asked for.
Blast radius. More actions past done means more chances for an unwanted commit, a needless refactor, or a destructive command the agent talked itself into.

Stopping at completion caps every one of these at the exact point where value stopped accruing.

How to end the run at the hook layer

The intervention points are the harness Stop and PreToolUse hooks - where a policy can judge the run before it continues. failproof holds the original intent and, at each of those points, asks a simple question: has what the user actually asked for been satisfied? It reads the run to check the finish condition - the test is green, the file exists, the change matches the request. Once intent is met, the policy:

Stops the run before the next needless action executes;
Surfaces the result - what was done, cleanly summarized - so you can see the finished work; and
Prevents the damage that the extra turns would have risked.

It changes nothing about the agent - the completion check lives in a local policy at the hook, one of the thirty-nine built in, and behaves identically across every supported harness. It is the same hook-level enforcement used for the other failure modes, pointed at the end of the run instead of the middle.

Concretely, the policy checks the finish condition at the PreToolUse hook and denies the next edit once the task is verifiably done:

// stop-when-done.js — end the run once the tests pass
import { customPolicies, allow, deny } from "failproofai";
import { execSync } from "child_process";

customPolicies.add({
  name: "stop-when-done",
  description: "Block more edits once the task is verified complete",
  match: { events: ["PreToolUse"] },
  fn: async (ctx) => {
    if (!["Edit", "Write"].includes(ctx.toolName)) return allow();
    try {
      execSync("npm test", { cwd: ctx.session?.cwd, stdio: "ignore" });
      return deny("Tests already pass — the task is done. Stop and summarize.");
    } catch {
      return allow(); // not done yet, keep going
    }
  },
});

Install it with failproofai policies --install --custom ./stop-when-done.js, or run failproofai policies --install to enable the built-in intent-completion gate.

Learn where done actually is

Knowing when to stop is a judgment you can sharpen from the traces. agenteye records where runs actually completed versus where they kept going, so you can query for overrun - the sessions that took twenty turns past their green test - and see how much time and money the tail was costing. That tells you exactly how to define the finish condition, so the stop policy fires at the right moment. Find the overrun in the trace, fix the definition of done in the policy.

Get started

failproof is free, open-source, and fully local. Install the CLI, enable the built-in policies, and runs stop when the work is actually done.

book a demo →

The failure that only costs you after you've won

Why overrun is pure downside

How to end the run at the hook layer

Learn where done actually is

Get started

Related failure modes