u/Ok-Pepper-2354 — reddlx

We built Agyn, an open-source Kubernetes-native runtime for AI agents

Hello folks,

I've been working on Agyn, an open-source Kubernetes-native runtime for deploying AI agents on your own infrastructure. Self-hosted, model-agnostic.

When you'd want this: you've built different agents for different departments, and the question becomes how to deploy them, grant access to specific teams, and govern them at enterprise level.

What it does:
- Define agents in Terraform, deploy to your existing K8s cluster
- Each agent and each MCP server runs in its own container with separate secrets
- Serverless runtime: agents spin up on demand, scale to zero when idle
- Per-agent / per-team token usage tracking
- OpenZiti overlay so agents reach internal databases without VPNs or public exposure
- Ships with pre-built agents: Claude Code, Codex, and our own

Built on Go + Kubernetes, with OpenFGA for ReBAC and OpenZiti for networking. AGPL-3.0.

GitHub: https://github.com/agynio/platform

Would love feedback, especially on deploying AI agents to K8s.

u/Ok-Pepper-2354 — 1 day ago

▲ 14 r/LLMDevs

I’m begging you, don’t give an agent the same access rights you have

If you're building an agentic system inside your company, please read this.

I've spent the last two weeks interviewing companies doing exactly that, and I keep seeing the same pattern:

> The agent works for the user, so it gets the user's permissions.

I get it. It looks obvious. Reuse the identity you already have, inherit the scope from the human, ship the demo. Path of least resistance. But it's a time bomb, and it's also how you ship a privilege escalation feature dressed up as an AI assistant.

This isn't just my personal opinion: the Australian Cyber Security Centre puts privilege problems at the top of its risk list. But most teams still give agents the same access rights as employees.

Here's what breaks the moment you nest your rights into your agent:

You can do things you don't want an agent doing on your behalf.
You can merge to main. You can `terraform apply`. You can drop tables. The whole point of having those rights is that you decide when to use them. Cloning them into an agent means a prompt injection in some random README is one tool call away from production. The agent doesn't need your full keyring. It needs a small, scoped one.
The audit log lies.
Once the agent acts as you, your logs say "Tom ran this query at 3am." Did Tom run it? Did his agent? You can't tell. SOC 2, SOX, anything that cares about attribution is broken by default.
Sub-agents inherit and the chain explodes.
Planner spawns coder spawns reviewer. If each one runs with the parent's rights, you've built an unbounded delegation chain with no permission boundary. If each one runs as the original human, it's even worse: one agent can ask another to approve its actions in some system.
Some agent jobs need rights no human on the team should have.
Finance wants an agent that can query the warehouse to answer revenue questions. The right answer is "the agent has read access; the team does not." Nested permissions force the opposite: you grant a human the access first so the agent can inherit it.
Least privilege only works if the agent has its own identity.
You want a research agent that reads but doesn't write. A deploy agent that hits staging but not prod. Both might "belong to" the same engineer.
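
To make the points above concrete, here's a minimal sketch of what "the agent has its own identity" can look like. Everything in it (`AgentIdentity`, `AuditLog`, `authorize`, the scope strings) is a hypothetical name I made up for illustration, not the API of any real product. The idea: scopes are granted to the agent explicitly, a sub-agent can only receive a subset of its parent's scopes, and the audit log records the agent as the actor while still noting which human it worked for.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class AgentIdentity:
    """Hypothetical agent identity, separate from the human it works for."""
    agent_id: str
    on_behalf_of: str        # recorded for attribution, never impersonated
    scopes: frozenset[str]   # explicit grants, not copied from the human

    def delegate(self, child_id: str, scopes: set[str]) -> AgentIdentity:
        """Create a sub-agent; it may only get a subset of this agent's scopes."""
        extra = set(scopes) - self.scopes
        if extra:
            raise PermissionError(
                f"{child_id} requested scopes outside its parent: {sorted(extra)}"
            )
        return AgentIdentity(child_id, self.on_behalf_of, frozenset(scopes))


@dataclass
class AuditLog:
    entries: list[dict] = field(default_factory=list)

    def record(self, identity: AgentIdentity, action: str, allowed: bool) -> None:
        # The actor is the agent, so "Tom ran this" and "Tom's agent ran this"
        # stay distinguishable in the log.
        self.entries.append({
            "actor": identity.agent_id,
            "on_behalf_of": identity.on_behalf_of,
            "action": action,
            "allowed": allowed,
        })


def authorize(identity: AgentIdentity, action: str, log: AuditLog) -> bool:
    """Allow the action only if it was explicitly granted to this agent."""
    allowed = action in identity.scopes
    log.record(identity, action, allowed)
    return allowed


if __name__ == "__main__":
    log = AuditLog()
    planner = AgentIdentity("planner", "tom", frozenset({"repo:read", "repo:write"}))
    reviewer = planner.delegate("reviewer", {"repo:read"})  # narrower than parent
    print(authorize(reviewer, "repo:read", log))
    print(authorize(reviewer, "repo:write", log))
    print(log.entries)
```

In a real system the check would live in a policy engine rather than in-process, but the shape is the same: no scope reaches an agent unless someone granted it to that agent.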

This is also what ACSC, NIST AI RMF, and basic least-privilege design have been saying for a while. Please don't let your engineers give agents the same access as employees on the assumption that an agent is just a tool for an employee.

Would love to hear your story. Maybe some of you have already faced this.


u/Ok-Pepper-2354 — 7 days ago

▲ 2 r/LLMDevs

20% reasoning drop when incorrect drafts are in your context. Experienced that?

Self-refinement loops always felt slightly suspect to me. Putting failed attempts back in context and asking the model to do better never quite added up. Princeton just measured what actually happens.

What the authors wanted to test

Most agent design and post-training pipelines rest on one assumption: that models can reflect on past mistakes and produce better answers. Self-refinement, reflection loops, retry-on-failure patterns all sit on top of this idea. The paper checks whether it actually holds.

Main results

11 models tested (GPT-5, Gemini 3 Pro, Qwen3-8B/32B, GPT-OSS-20B/120B, DeepSeek-R1-distilled, others) on 8 reasoning benchmarks (including AIME, HMMT, GPQA, MMLU-Redux, CRUXEval-I, Game of 24). Setup: insert 1 or 2 incorrect drafts in context, compare to clean-slate.

Accuracy drops 10 to 20% when wrong drafts sit in context. Smaller models are hit harder: GPT-OSS-20B loses ~31% on AIME24.
Telling the model "this draft is wrong, don't copy it" doesn't help. Performance still drops.
Even when the model itself correctly identifies the draft as wrong, the bias persists.

What I took from it

The failure is architectural. Attention reuses reasoning structures it sees in context, so bad reasoning transfers even when the model "knows" it's wrong. You can't prompt your way out. The prompt is what's getting dragged in the first place.

Practical takeaway: many agent stacks retry by showing the model its failed attempt and asking it to fix it. The paper shows this often hurts more than it helps. The alternative is just running the task from scratch.
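
Here's a rough sketch of the "run it from scratch" retry pattern, under a few assumptions of mine: `call_model` is a hypothetical function that sends a prompt to whatever model you use and returns its answer, and `is_correct` is some external check you already have (unit tests, a verifier, exact match). Neither comes from the paper. The only point is that each attempt sees the original task and nothing else, so failed drafts never enter the context.

```python
from typing import Callable, Optional


def solve_with_fresh_retries(
    task: str,
    call_model: Callable[[str], str],    # hypothetical: prompt in, answer out
    is_correct: Callable[[str], bool],   # hypothetical: external verifier
    max_attempts: int = 3,
) -> Optional[str]:
    """Retry from a clean slate instead of feeding failed drafts back in."""
    for _ in range(max_attempts):
        # The prompt is always just the original task. Compare with the
        # common pattern: prompt = task + failed_draft + "please fix this".
        answer = call_model(task)
        if is_correct(answer):
            return answer
    return None  # no attempt passed the check
```

Note that this only helps if the model samples with some randomness; a fully deterministic call would return the same wrong answer every time.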

PS: the paper is Contextual Drag (ICLR 2026 RSI workshop)


u/Ok-Pepper-2354 — 10 days ago