u/morphAB

AI agent governance still defaults to a kill switch, and the gap is on the authorization side

Hey everyone! An observation from working in authorization. Identity programs have been putting serious work into agent authentication over the last couple of years: service accounts done properly, OAuth scopes tightened, secret rotation, short-lived tokens. The authN side isn't fully solved (it never is), but it's where most of the effort has been going.

The part getting less air-time is what happens after the agent is authenticated, when it's acting on a workflow and something starts looking off. The default plan there is still "if it misbehaves, kill the agent."

That stops working the moment the agent is wired into something real. Pulling the switch creates a secondary incident: halted workflows, paused queues, downstream teams scrambling. So the agent keeps running at full access while the team figures out what's wrong, because the standard toolkit doesn't have a middle setting.

A colleague of mine was talking to a CISO about this and the framing that CISO used was dimmer switch, not kill switch. The dimmer lives in the authZ layer at runtime, which is the part identity stacks haven't extended into yet for non-human principals.

In practice the dimmer looks like read-only on certain data first. Sensitive tools dropped next. Higher approval thresholds for anything above a certain size. Each adjustment is reversible and logged. If the agent turns out to be fine, restrictions fade back. If not, you keep tightening until access is at zero, but you got there deliberately and with a record.

The mechanism isn't new: per-action policy enforcement at runtime has been around for years for human users. What's newer for AI agents specifically is wiring it to the agent's identity, current task, and intent at runtime, so you can narrow scope without redeploying or stopping the agent mid-task.
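To make the per-action check concrete, here's a minimal sketch in plain Python. It is not how Cerbos or any particular policy engine works. Every name in it (`Dim`, `Action`, `decide`, the tool and data names, the numeric limits) is made up for illustration. I'm also assuming that "higher approval thresholds" means human approval kicks in at a smaller action size.

```python
from dataclasses import dataclass
from enum import IntEnum


class Dim(IntEnum):
    """Hypothetical restriction levels, ordered from loosest to tightest."""
    FULL = 0
    READ_ONLY = 1            # writes to protected data are denied
    SENSITIVE_TOOLS_OFF = 2  # sensitive tools are also denied
    STRICT_APPROVAL = 3      # approval kicks in at a much smaller size
    ZERO = 4                 # nothing is allowed


@dataclass(frozen=True)
class Action:
    tool: str
    resource: str
    kind: str            # "read" or "write"
    amount: float = 0.0  # size of the action, e.g. a payout value


# Illustrative values only; in a real setup these would come from policy.
PROTECTED_DATA = {"customers"}
SENSITIVE_TOOLS = {"payments"}
NORMAL_APPROVAL_OVER = 10_000
STRICT_APPROVAL_OVER = 100


def decide(level: Dim, action: Action) -> str:
    """Return "allow", "needs_approval", or "deny" for a single action."""
    if level >= Dim.ZERO:
        return "deny"
    if level >= Dim.SENSITIVE_TOOLS_OFF and action.tool in SENSITIVE_TOOLS:
        return "deny"
    if (level >= Dim.READ_ONLY and action.kind == "write"
            and action.resource in PROTECTED_DATA):
        return "deny"
    if level >= Dim.STRICT_APPROVAL:
        limit = STRICT_APPROVAL_OVER
    else:
        limit = NORMAL_APPROVAL_OVER
    if action.amount > limit:
        return "needs_approval"
    return "allow"
```

The agent process never restarts in this sketch. The only thing that changes between calls is the `level` passed into `decide`, which is the "narrow scope without redeploying" part.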

My team and I (work at Cerbos) wrote up the full framing here: https://www.cerbos.dev/blog/dimmer-switch-not-a-kill-switch-rethinking-ai-agent-governance

Now I'm curious how the identity programs you're seeing, or are part of, are organizing this. Is agent authorization landing inside the IAM team, security ops, the application teams, or sitting in no man's land between them? If you're open to sharing, please do!

Usual caveat: none of this replaces human review of policy. Tooling makes the revocation mechanical. Humans still own the call on where the boundaries should sit :)

u/morphAB — 3 days ago

Most AI agent governance playbooks still assume you can turn the agent off... Once it's wired into production that stops being true [Rethinking AI security through a dimmer switch lens]

Hey everyone! An observation from working in authorization: the default plan I have been seeing for "what if the AI agent misbehaves" is some version of "kill the agent."

That's fine for sandboxes. But for anything integrated into real workflows (claims, support, data writes, etc.), pulling the switch creates a secondary incident, sometimes worse than the original: queues halt, compliance windows slip, and the team relying on the agent's output is scrambling.

A colleague of mine was talking to a CISO recently and the framing that CISO used was dimmer switch, not kill switch.

What that looks like in practice is narrowing what the agent can do, not switching it off. Read-only on certain data first. Sensitive tools dropped next. Higher approval thresholds for anything above a certain size. Each adjustment is reversible and logged. If the agent turns out to be fine, the restrictions fade back. If it doesn't, you keep tightening until access is at zero, but you got there deliberately and with a record.
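For the "reversible and logged" part, here's a rough sketch of what stepping the dimmer up and down could look like. It's plain Python with made-up names (`Dimmer`, `LEVELS`, the log fields), not any product's API. The point is that every change in either direction leaves a record of who did it and why.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical restriction ladder, loosest first.
LEVELS = ["full", "read_only", "sensitive_tools_off", "strict_approval", "zero"]


@dataclass
class Dimmer:
    agent_id: str
    index: int = 0
    audit_log: list = field(default_factory=list)

    @property
    def level(self) -> str:
        return LEVELS[self.index]

    def _move(self, step: int, actor: str, reason: str) -> str:
        # Clamp to the ends of the ladder; log only real changes.
        new_index = min(max(self.index + step, 0), len(LEVELS) - 1)
        if new_index != self.index:
            self.audit_log.append({
                "at": datetime.now(timezone.utc).isoformat(),
                "agent": self.agent_id,
                "actor": actor,
                "reason": reason,
                "from": LEVELS[self.index],
                "to": LEVELS[new_index],
            })
            self.index = new_index
        return self.level

    def tighten(self, actor: str, reason: str) -> str:
        return self._move(+1, actor, reason)

    def relax(self, actor: str, reason: str) -> str:
        return self._move(-1, actor, reason)


dimmer = Dimmer("claims-agent")
dimmer.tighten("alice", "unusual write volume")
dimmer.tighten("alice", "still unexplained")
dimmer.relax("bob", "root cause found, partially cleared")
print(dimmer.level, len(dimmer.audit_log))
```

If access does end up at zero, the log shows each step that led there, which is the "deliberately and with a record" bit.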

The mechanics aren't new: per-action policy enforcement has been around for years in policy-as-code stacks. The part that's newer is tying it to the agent's identity and intent at runtime, so when something looks off you can narrow scope without redeploying or stopping the agent in the middle of work.

Plenty of teams already have circuit breakers, rate limits, tool allowlists. Those help, but they tend to be blunt: full access or off, no middle. The dimmer is what sits between those two states, and it's the part most agent governance plans I've seen don't actually include, unfortunately.

I'm vendor-side (work at Cerbos) so not dropping a link :) Happy to share the writeup in DMs or comments if useful. Wanted to put the framing out there because most IR playbooks I've seen still default to the kill switch, and the gap is going to start mattering as agents move past copilot work.

Would be really interesting to hear how the community here is handling revocation without creating a worse incident.

u/morphAB — 4 days ago