Decision guide

How much should your agent decide on its own?

Four questions about the stakes, and a clear answer about how long a leash your agent should get.

The exciting move is to let an agent run free; the safe one is to keep it on a short leash. The right length isn't set by how clever the model is. It's set by the stakes of a single mistake: how recoverable it is, and how bad it gets. The honest default is the least autonomy the task can tolerate, with more granted deliberately as the agent proves itself. Never let an irreversible, serious action run without a human in the loop, and bound the blast radius before you grant the capability. One thing never changes with the leash: a named human owns the outcome, because the agent can't.

If it does the wrong thing, can the action be undone?
How bad is the worst single mistake?
How often does it act?
Is the domain regulated, or does a mistake create liability?
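
As a rough illustration, the four questions can be read as inputs to a small decision function. The sketch below is one possible reading of the options described on this page, not the page's actual recommendation logic: the names `Stakes`, `Leash`, and `recommend` are hypothetical, each question is simplified to a yes/no answer, and the branch order is an assumption (in particular, routing irreversible or regulated work at high volume to hard limits rather than per-action approval).

```python
from dataclasses import dataclass
from enum import Enum


class Leash(Enum):
    APPROVE_EVERY_ACTION = "Approve every action"
    RUN_WATCH_INTERVENE = "Run it, watch it, intervene"
    AUTONOMOUS_INSIDE_LIMITS = "Autonomous inside hard limits"
    LET_IT_RUN = "Let it run"


@dataclass(frozen=True)
class Stakes:
    reversible: bool   # can a wrong action be undone?
    serious: bool      # is the worst single mistake serious?
    high_volume: bool  # does it act too often for per-action review?
    regulated: bool    # regulated domain, or mistakes create liability?


def recommend(stakes: Stakes) -> Leash:
    """Map the four answers to a leash length (an assumed mapping)."""
    # Irreversible and serious work, or regulated work, needs a human floor.
    needs_human_floor = (not stakes.reversible and stakes.serious) or stakes.regulated
    if needs_human_floor:
        if stakes.high_volume:
            # Too many actions to approve each one: cap the worst case and
            # escalate anything out of policy instead.
            return Leash.AUTONOMOUS_INSIDE_LIMITS
        return Leash.APPROVE_EVERY_ACTION
    if stakes.serious:
        # Serious but reversible: watching beats gating.
        return Leash.RUN_WATCH_INTERVENE
    if not stakes.reversible:
        # Minor but irreversible: bound the blast radius first.
        return Leash.AUTONOMOUS_INSIDE_LIMITS
    # Reversible, minor, and unregulated.
    return Leash.LET_IT_RUN
```

For example, `recommend(Stakes(reversible=False, serious=True, high_volume=False, regulated=False))` returns `Leash.APPROVE_EVERY_ACTION` under these assumptions. A real assessment would likely use finer-grained answers than booleans.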

All options at a glance

Approve every action

The agent proposes; a human says yes before anything consequential happens. The tightest leash, and the right one when a single mistake is both unrecoverable and serious and the agent acts rarely enough that you can afford the review. Slow by design, and that's the point.

Choose it when

  • A mistake can't be undone and the damage is serious
  • It acts rarely enough that a human can review each consequential step
  • You're in a regulated or high-liability domain and volume is low

Trade-offs

  • Slow and human-bound — it won't scale past what a person can review
  • Tempting to rubber-stamp once it's usually right; the review has to stay real
  • Overkill for reversible or low-stakes work — don't gate what you can undo
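
The "agent proposes, human approves" pattern above might be sketched as a gate in front of execution. This is a minimal illustration under assumptions: `ProposedAction`, `run_with_approval`, and the `approve` and `execute` callbacks are hypothetical names, and how a human is actually asked (a prompt, a ticket, a chat message) is left to the `approve` callback.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ProposedAction:
    description: str
    consequential: bool  # assumed flag: does this step need a human yes?


def run_with_approval(
    action: ProposedAction,
    execute: Callable[[ProposedAction], str],
    approve: Callable[[ProposedAction], bool],
) -> str:
    """Execute the action only if it is harmless or a human approved it."""
    if action.consequential and not approve(action):
        # Nothing runs without an explicit yes.
        return "rejected"
    return execute(action)
```

Only consequential steps are gated here, which matches the advice not to gate what you can undo. Keeping the review real is a human problem the code cannot solve; it can only guarantee that nothing consequential runs without a yes.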

Run it, watch it, intervene

The agent runs on its own; a human monitors and steps in when something looks wrong, instead of approving each step. The right call when a mistake is serious but recoverable — because you can undo it, watching beats gating. You move from approving the process to owning the outcome.

Choose it when

  • Mistakes are serious but reversible — you can step in and roll back
  • Volume is too high to approve each action, but you can supervise
  • You want speed without flying fully blind

Trade-offs

  • Only works if you can actually see what it's doing — instrument it first
  • "Monitoring" quietly decays into not looking; flag uncertainty for real
  • You need a fast, reliable way to pause or roll back when you spot trouble
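
The three requirements above (visibility, a pause switch, and rollback) could be sketched as a thin wrapper around the agent's actions. This is an assumption-laden illustration: `SupervisedRun` is a hypothetical class, and it presumes every action comes paired with an `undo` callable, which real systems may not be able to offer.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class SupervisedRun:
    paused: bool = False
    log: List[str] = field(default_factory=list)  # what the human watches
    _undo_stack: List[Tuple[str, Callable[[], None]]] = field(
        default_factory=list, repr=False
    )

    def act(self, name: str, do: Callable[[], None], undo: Callable[[], None]) -> bool:
        """Run one action unless paused; remember how to undo it."""
        if self.paused:
            self.log.append(f"skipped {name} (paused)")
            return False
        do()
        self._undo_stack.append((name, undo))
        self.log.append(f"did {name}")
        return True

    def pause(self) -> None:
        """The human's stop switch: later calls to act() are skipped."""
        self.paused = True

    def roll_back(self) -> None:
        """Undo completed actions in reverse order."""
        while self._undo_stack:
            name, undo = self._undo_stack.pop()
            undo()
            self.log.append(f"undid {name}")
```

The log is what makes supervision possible at all; without it, `pause` and `roll_back` would never be triggered at the right moment.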

Autonomous inside hard limits

The agent acts on its own, but only within limits you set in advance — spend caps, allowlists, scoped permissions — and anything big, unusual, or out-of-policy stops and escalates to a human. The workhorse setting: fast and independent on the routine, walled off from the catastrophic. Bound the blast radius, then let it run.

Choose it when

  • High volume, where approving each action is impossible
  • An irreversible action whose worst case you can cap to something survivable
  • A regulated domain that needs a floor of control, audit, and human override

Trade-offs

  • Only as safe as the limits — set them too loose and it's just full autonomy
  • You have to define "out of policy" well enough that the right things escalate
  • Needs an escalation path a human actually watches, or the escalations pile up unseen
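
One way to picture "limits you set in advance" is a guard that checks every action against an allowlist and spend caps, and escalates anything outside them. The sketch below is illustrative only: `Limits`, `Guard`, `Verdict`, and the specific cap fields are hypothetical, and a real policy would likely cover more than action names and costs.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet


class Verdict(Enum):
    ALLOW = "allow"
    ESCALATE = "escalate"  # stop and hand the decision to a human


@dataclass(frozen=True)
class Limits:
    allowed_actions: FrozenSet[str]  # allowlist of action names
    per_action_cap: float            # largest single spend allowed
    total_cap: float                 # largest cumulative spend allowed


@dataclass
class Guard:
    limits: Limits
    spent: float = 0.0

    def check(self, action: str, cost: float) -> Verdict:
        """Allow routine actions inside the limits; escalate everything else."""
        if action not in self.limits.allowed_actions:
            return Verdict.ESCALATE
        if cost > self.limits.per_action_cap:
            return Verdict.ESCALATE
        if self.spent + cost > self.limits.total_cap:
            return Verdict.ESCALATE
        # Only allowed actions count against the running total.
        self.spent += cost
        return Verdict.ALLOW
```

With `Guard(Limits(frozenset({"refund"}), per_action_cap=50, total_cap=100))`, a 40-unit refund is allowed, while a 60-unit refund or an unlisted action escalates. The guard only decides; someone still has to watch the queue of escalations it produces.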

Let it run

The agent acts freely: no per-action approval, no constant watching. Reserved for work where a mistake is both reversible and small, meaning cheap to be wrong and easy to undo. The reward for genuinely low stakes is genuine speed. Don't confuse it with letting an agent loose on anything that matters.

Choose it when

  • A mistake is both reversible and minor
  • Volume is high and per-action oversight would just be friction
  • Nothing here is regulated or creates liability

Trade-offs

  • Wrong for anything irreversible, serious, or regulated — re-check those answers
  • Still needs logging so you can see what it did after the fact
  • Stakes creep: a low-stakes task can grow into a high-stakes one — revisit it
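
Even at full autonomy, the logging requirement above can be met cheaply. The following is a minimal sketch, assuming an append-only log of one JSON object per line; `record_action` and `read_actions` are hypothetical helper names, and the entry fields are illustrative.

```python
import json
from datetime import datetime, timezone
from typing import Any, Dict, List, TextIO


def record_action(log: TextIO, action: str, detail: Dict[str, Any]) -> None:
    """Append one JSON line describing what the agent just did."""
    entry = {
        "at": datetime.now(timezone.utc).isoformat(),
        "action": action,
        "detail": detail,
    }
    log.write(json.dumps(entry, sort_keys=True) + "\n")


def read_actions(log_text: str) -> List[Dict[str, Any]]:
    """Parse the log back into entries for after-the-fact review."""
    return [json.loads(line) for line in log_text.splitlines() if line.strip()]
```

Reviewing such a log periodically is also one way to notice stakes creep, since the recorded actions show when a once-trivial task has started touching things that matter.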