u/Ok_Top_5458

Open-sourcing a shell-level security layer for AI agents

After working with AI agents for a while, I kept running into the same issue: eventually the agent ignores boundaries, reads .env files, touches production resources, or uses secrets it was never supposed to access.

Even with MCP read-only setups and carefully written prompts, the shell itself is still trusted too much.

So I started building a shell-level control layer for AI agents:

  • block or sanitize dangerous commands
  • expose virtual/fake secrets instead of real ones
  • separate DEV / PROD access policies
  • restrict network/domain access
  • enforce runtime policies instead of relying only on prompts
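To make the first and third bullets concrete, here is a minimal sketch of a pre-execution command check with separate DEV and PROD policies. It is not the project's actual code: the `Policy` class, the `DEV_POLICY` / `PROD_POLICY` values, and `check_command` are hypothetical names chosen for illustration, and the blocked programs and path fragments are assumptions.

```python
import os
import shlex
from dataclasses import dataclass, field


@dataclass
class Policy:
    """Hypothetical per-environment policy; not the real tool's schema."""
    blocked_programs: set = field(default_factory=set)
    blocked_path_fragments: tuple = ()


# Assumed example policies: PROD is stricter than DEV.
DEV_POLICY = Policy(
    blocked_programs={"sudo"},
    blocked_path_fragments=(".env",),
)
PROD_POLICY = Policy(
    blocked_programs={"sudo", "rm", "kubectl", "terraform"},
    blocked_path_fragments=(".env", "id_rsa"),
)


def check_command(command: str, policy: Policy) -> tuple[bool, str]:
    """Return (allowed, reason) for a single shell command string."""
    try:
        tokens = shlex.split(command)
    except ValueError as exc:
        # Unbalanced quotes and similar input: refuse rather than guess.
        return False, f"unparseable command: {exc}"
    if not tokens:
        return False, "empty command"

    # Compare on the program's base name so /usr/bin/sudo matches "sudo".
    program = os.path.basename(tokens[0])
    if program in policy.blocked_programs:
        return False, f"program '{program}' is blocked"

    # Substring match on arguments; deliberately coarse for this sketch.
    for arg in tokens[1:]:
        for fragment in policy.blocked_path_fragments:
            if fragment in arg:
                return False, f"argument '{arg}' matches '{fragment}'"
    return True, "allowed"


if __name__ == "__main__":
    for cmd in ("ls -la", "cat .env", "rm -rf build"):
        print(cmd, "->", check_command(cmd, PROD_POLICY))
```

A check this simple only inspects the first program and its arguments. Pipes, subshells, `bash -c "..."`, and indirect reads through an interpreter would bypass it, which is exactly the kind of attack idea worth testing against a real implementation.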

The goal is to make agents safer and more deterministic inside real developer environments.

I’m now open-sourcing it and looking for people who use Claude Code, Codex, Cursor, etc. to try breaking it on real workflows.

Feedback, criticism, and attack ideas are very welcome.

Link to PyPI in the comments.

u/Ok_Top_5458 — 14 hours ago

How do you stop terminal AI agents from reading .env or touching prod?

I think there’s a real problem with AI coding agents getting too much trust once they run from the terminal.

Even if you give them clear instructions, MCP tools, or read-only access, they can still sometimes reach things you never meant to expose, such as .env files, production keys, internal URLs, or commands that are technically available but shouldn’t be used.

My current thinking is that the solution shouldn’t only be “better prompting”.
There needs to be some hard boundary at the shell/environment level:

  • hide or replace sensitive env values
  • separate dev keys from production keys
  • block risky commands before they run
  • control which domains/tools the agent can access
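As a rough illustration of the first and last bullets, the sketch below replaces sensitive-looking environment values with placeholders and checks URLs against a domain allowlist. Everything here is an assumption made for the example: the `SENSITIVE_NAME` pattern, the `virtual-` placeholder format, the `ALLOWED_DOMAINS` set, and the function names `sanitize_env` and `is_domain_allowed` are hypothetical, not taken from any existing tool.

```python
import re
from urllib.parse import urlsplit

# Assumed heuristic: variable names containing these words hold secrets.
SENSITIVE_NAME = re.compile(r"(KEY|SECRET|TOKEN|PASSWORD)", re.IGNORECASE)

# Assumed allowlist for illustration only.
ALLOWED_DOMAINS = frozenset({"pypi.org", "github.com"})


def sanitize_env(env: dict) -> dict:
    """Return a copy of env with sensitive values replaced by placeholders."""
    safe = {}
    for name, value in env.items():
        if SENSITIVE_NAME.search(name):
            # The agent sees a fake value; the real one never leaves the host.
            safe[name] = f"virtual-{name.lower()}"
        else:
            safe[name] = value
    return safe


def is_domain_allowed(url: str, allowed=ALLOWED_DOMAINS) -> bool:
    """Allow a URL only if its host is an allowed domain or a subdomain of one."""
    host = urlsplit(url).hostname
    if host is None:
        return False
    # Require an exact match or a dot-separated suffix, so that
    # "evil-github.com" does not pass as "github.com".
    return any(host == d or host.endswith("." + d) for d in allowed)


if __name__ == "__main__":
    demo_env = {"PATH": "/usr/bin", "API_TOKEN": "real-value"}
    print(sanitize_env(demo_env))
    print(is_domain_allowed("https://api.github.com/repos"))
    print(is_domain_allowed("https://evil-github.com/"))
```

The sanitized mapping could then be handed to the agent's child processes, for example via `subprocess.run(cmd, env=sanitize_env(dict(os.environ)))`. Name-based matching will miss secrets stored under innocuous variable names, so it is a starting point and not a complete boundary.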

Curious whether other people here have run into this problem too.

u/Ok_Top_5458 — 1 day ago