Kill switches and circuit breakers: the off-button you hope you never use

The move. Design the off-button before the first incident, not during it.

Every team that ships an autonomous agent eventually has the moment.

The agent is doing something it should not be doing. Maybe it is sending emails it was not supposed to send. Maybe it is generating content in a loop and the cost is climbing. Maybe a user's session is stuck and retrying the same broken tool call every thirty seconds.

What does the team do?

If the kill switch was designed before launch, someone opens the control panel, scopes the intervention to the affected workflow or user, confirms the effect, and triggers it.

The intervention takes ninety seconds. The in-flight work is checkpointed; the agent pauses cleanly; nothing that was not broken gets broken.

If there is no kill switch, the options are: deploy a code change (ten minutes if you are fast, longer if you are not, and you are pushing a code change under incident pressure, which is where bad deploys happen), revoke an API key (blunt, takes down everything, and you have to re-provision it afterward), or kill the process on the server (blunt, loses in-flight work, and requires terminal access that your support staff probably does not have).

The gap between these two situations is entirely explained by whether the kill switch was designed before the incident or during it.

The picture

A control panel with four rows, each representing a tier of intervention. Row one: Global — a toggle that pauses all agent activity in the system. Effect label reads: "All agent sessions pause at next turn boundary. In-flight work is checkpointed. No new sessions start. Unpauses when toggled off." Row two: Feature — a dropdown to select the affected workflow (e.g., "resume-rewrite", "cover-letter-draft", "batch-scoring") plus a toggle. Effect label reads: "This workflow pauses. Other workflows continue." Row three: User — a user ID field plus a pause button. Effect label reads: "This user's sessions pause. Other users continue." Row four: Instance — a session ID field plus a kill button. Effect label reads: "This session terminates immediately. Work is checkpointed to the last turn boundary."

Each row shows the authorized user list (who can trigger this tier), the effect statement, and a confirmation dialog before activation. The confirmation dialog repeats the effect statement in plain language.
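
One way to keep the panel honest is to store each row as data: the tier, who may trigger it, and the effect statement that the confirmation dialog repeats. The sketch below is a minimal illustration, not a prescribed implementation. The `SwitchTier` record, the `confirmation_text` helper, and the role names are all hypothetical; only two of the four rows are shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SwitchTier:
    name: str                    # "global", "feature", "user", or "instance"
    authorized_roles: frozenset  # hypothetical role names
    effect: str                  # plain-language effect statement shown in the panel


# Two of the four rows; the effect strings are the ones from the panel description.
TIERS = {
    "global": SwitchTier(
        "global",
        frozenset({"on_call_engineer", "engineering_lead"}),
        "All agent sessions pause at next turn boundary. In-flight work is "
        "checkpointed. No new sessions start. Unpauses when toggled off.",
    ),
    "user": SwitchTier(
        "user",
        frozenset({"support_staff"}),
        "This user's sessions pause. Other users continue.",
    ),
}


def confirmation_text(tier_name: str, target: str, role: str) -> str:
    """Return the confirmation dialog text, or raise if the role is not authorized."""
    tier = TIERS[tier_name]
    if role not in tier.authorized_roles:
        raise PermissionError(f"{role} may not trigger the {tier.name} switch")
    # The dialog repeats the effect statement verbatim so the operator sees
    # exactly what will happen before confirming.
    return f"You are about to trigger the {tier.name} switch for {target}. {tier.effect}"
```

Because the dialog text is built from the same record the panel displays, the effect statement cannot drift between the two places.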

Why it matters now

In 2026 every team that ships an autonomous agent long enough eventually has the moment. The kill switch is not a theoretical resilience feature; it is a practical operational requirement.

The question is whether you designed it before the incident or are improvising during one.

The second reason kill switches matter now is user trust. In an era when autonomous agents are new enough that users are still calibrating their trust, the ability to immediately and precisely stop an agent's activity is a trust signal.

"We can pause just your session while we investigate" is a different statement than "we have to take down the whole service to fix this."

The circuit-breaker pattern from microservices literature applies directly. A circuit breaker monitors for failure conditions and opens automatically when thresholds are exceeded — no human required.

For agent systems, circuit breakers can automatically pause a session when cost exceeds a per-run cap, when a tool call fails more than N times in succession, or when the agent's turn count exceeds a safety limit.

Human-triggered kill switches and automated circuit breakers are complementary; design both.

A source you should trust

Feature-flag systems — LaunchDarkly, GrowthBook, Unleash — are the mature pattern for runtime-controllable behavior in production services. Kill switches are feature flags with specific semantics. If you already run a feature-flag system, your kill switches probably belong there. If you do not, a kill switch is a good reason to adopt one.
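
In practice, "a feature flag with specific semantics" means the agent loop reads the flag at every turn boundary and stops cleanly when it is set. The sketch below assumes a hypothetical `is_paused` callable standing in for a lookup in whatever flag system you run (it is not a real SDK call) and a plain dict standing in for a checkpoint store.

```python
from typing import Callable, Dict, List


def run_session(
    session_id: str,
    turns: List[Callable[[], str]],
    is_paused: Callable[[str], bool],
    checkpoints: Dict[str, List[str]],
) -> str:
    """Run turns in order, checking the kill-switch flag at every turn boundary."""
    done = checkpoints.setdefault(session_id, [])
    for turn in turns[len(done):]:   # resume after the last completed turn
        if is_paused(session_id):    # flag is read at runtime, no deploy needed
            return "paused"
        done.append(turn())          # checkpoint the completed turn's result
    return "completed"
```

Calling `run_session` again with the same `checkpoints` after the flag is cleared picks up at the first turn that did not complete, so a pause does not discard earlier turns.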

The circuit-breaker pattern from microservices literature explains the automatic-response version. Michael Nygard popularized it in Release It! (2007), and Martin Fowler's CircuitBreaker article is the usual short reference. The concept predates AI agents by more than a decade; the application to agent systems is direct.

A recipe

A four-tier kill-switch design for any production agent system:

Global switch. Pauses all agent activity. Requires explicit unpause — it should not restore itself after a deploy. Authorized users: on-call engineers, engineering leadership. Effect documented in plain language and displayed in the interface.
Feature switch. Pauses a specific named workflow. One flag per workflow. Authorized users: on-call engineers, support leads with appropriate access. The flag name should match the workflow name in the trace tags so you can correlate "what was running" with "what got paused."
User switch. Pauses agent activity for one customer. Useful when a user contacts support about unexpected behavior. Authorized users: support staff. Effect is scoped to that user; other users are unaffected.
Instance switch. Terminates one session. Useful for runaway loops or stuck sessions. Authorized users: on-call engineers. Effect is immediate; in-flight work is checkpointed to the last completed turn boundary, not lost entirely.
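
The four tiers can be resolved with a single check that runs before each turn. This is a sketch under assumptions: the `Session` and `SwitchState` records and the `blocking_tier` function are hypothetical names, and the state would normally live in a flag system or database, not in memory.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Session:
    session_id: str
    user_id: str
    workflow: str


@dataclass
class SwitchState:
    global_paused: bool = False
    paused_workflows: Set[str] = field(default_factory=set)
    paused_users: Set[str] = field(default_factory=set)
    killed_sessions: Set[str] = field(default_factory=set)


def blocking_tier(session: Session, state: SwitchState) -> Optional[str]:
    """Return the broadest tier that stops this session, or None if it may run."""
    if state.global_paused:
        return "global"
    if session.workflow in state.paused_workflows:
        return "feature"
    if session.user_id in state.paused_users:
        return "user"
    if session.session_id in state.killed_sessions:
        return "instance"
    return None
```

Returning the tier name, not just a boolean, lets the trace record which switch stopped the session, which helps when correlating "what was running" with "what got paused."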

For each tier: document the authorized user list, write the effect statement ("when triggered: workflow A stops, workflow B continues, in-flight work is checkpointed"), build the non-engineer interface, and test it before launch. An untested kill switch is a kill switch you will not trust during an incident.

Add automated circuit breakers on top: per-run cost cap (automatically pauses the session when cost exceeds the threshold), consecutive tool-call failures (automatically pauses when a tool fails more than N times in a row), and turn-count limit (automatically pauses when a session exceeds M turns without completion).
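
The three automated breakers can be sketched as one small per-session counter object. Everything here is an assumption made for illustration: the `SessionBreaker` class, its field names, and the reason strings are hypothetical, and the caller is expected to pause the session whenever `record_turn` returns a reason.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SessionBreaker:
    max_cost_usd: float            # per-run cost cap
    max_consecutive_failures: int  # N
    max_turns: int                 # M
    cost_usd: float = 0.0
    consecutive_failures: int = 0
    turns: int = 0
    tripped_reason: Optional[str] = None

    def record_turn(self, cost_usd: float, tool_failed: bool) -> Optional[str]:
        """Update counters after one turn; return a reason if the breaker trips."""
        if self.tripped_reason:
            return self.tripped_reason  # stays open until a human resets it
        self.turns += 1
        self.cost_usd += cost_usd
        # A success resets the streak, so only consecutive failures count.
        self.consecutive_failures = self.consecutive_failures + 1 if tool_failed else 0
        if self.cost_usd > self.max_cost_usd:
            self.tripped_reason = "cost_cap"
        elif self.consecutive_failures > self.max_consecutive_failures:
            self.tripped_reason = "tool_failures"
        elif self.turns > self.max_turns:
            self.tripped_reason = "turn_limit"
        return self.tripped_reason
```

Once tripped, the breaker stays open rather than resetting itself, which mirrors the rule that a global switch requires an explicit unpause.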

The smell of it going wrong

The only off-button is "deploy a code change." This means every incident that requires stopping an agent requires a deployment. Deployments under incident pressure are where bugs get introduced. If your kill switch requires a deployment, your kill switch is not a kill switch.

Kill switches exist but are not accessible to support staff during an incident. The switch lives in a Kubernetes dashboard that only engineers can access. An incident at 3am involves a support person who cannot trigger the switch and an engineer who is being woken up for a task that should not require engineering access.

Triggering a kill switch loses in-flight work because there is no checkpoint integration. The switch kills the process; the agent was in the middle of turn 7 of a 10-turn task; the user lost all progress. They experience this as the product deleting their work.

The kill switch's effect is not written down. The on-call engineer triggers the global switch and is not sure whether it affects background batch jobs or only interactive sessions. They guess wrong and take down a batch job that was not related to the incident.

A judgment call from real work

PL runs feature-flag-style controls for some content-pipeline workflows. The course-scoring pipeline has a feature-level kill switch, a flag that stops new scoring runs from starting, which was added after an incident where a misconfigured scoring run kept reprocessing the same lessons in a loop for forty minutes before someone noticed the cost meter moving.

But not all workflows are covered. The lesson-enrichment workflow, which uses a more complex multi-step agent, does not yet have a tiered kill switch. Stopping it requires a deploy or direct database intervention to clear the queue. That gap has been documented as a launch-blocker for the next feature that depends on that workflow.

The prioritization rule that emerged from this situation: any workflow that can run without direct user interaction — batch jobs, background enrichment, scheduled scoring — gets a kill switch before launch. Interactive user-facing workflows get a kill switch before launch as well, but the urgency is slightly lower because user complaints are the natural circuit breaker. A batch job can run overnight unnoticed; an interactive workflow tends to surface problems faster.

Kill switches stop damage in progress. The next lesson — recovery and checkpoints — covers what happens after the switch is thrown: how to resume the work that was paused without losing the progress made before the incident. The two disciplines are designed together; a kill switch without checkpoint integration is a switch that loses user work.

Rules from this lesson

Kill switches are designed before the first incident; a kill switch designed during an incident is a deployment, not a switch.
Tiered scope — global, feature, user, instance — covers the common cases without requiring overly blunt interventions.
Non-engineers must be able to trigger the switches they are authorized for; terminal access is not a support tool.
Test every kill switch before launch; an untested switch is one you will not trust when you need it.
Automated circuit breakers complement manual kill switches; design both: circuit breakers for cost caps, consecutive tool failures, and turn limits; manual switches for the cases the circuit breakers do not anticipate.