u/Input-X

▲ 44 r/simpleAIFinds+1 crossposts

I gave my AI agents passports instead of better memory. That fixed the actual problem.

Most multi-agent setups I've seen are basically a room full of people wearing headphones. Agents running in parallel, no shared awareness, no idea who's doing what. That's not collaboration. That's coexistence.

I've been building this in public for almost 12 weeks. 12 agents, 6,500+ tests, 95 stars. Here's what I actually learned.

The problem wasn't memory. It was identity. An agent would be technically correct but completely off base. Not hallucinating. Drifting. Like a competent person who walked into the wrong meeting and started contributing without realizing they're in the wrong room. I spent weeks on better memory - longer context, better embeddings, persistent state. None of it fixed the drift. The problem wasn't what the agent remembered - it didn't know who it was.

What fixed it was three files. Every agent gets a passport.json - who am I, what I do, what I dont do. Maybe 30 lines. Rarely changes. Then local.json - rolling session log, key learnings, caps at 20 entries and auto-archives to vector search when full. And observations.json - collaboration patterns, how I work with other agents. Identity loads first every session via hooks. Agent never starts cold.

I have 12 agents now and each one is a domain specialist. The mail system has 696 tests it built through its own bugs. Routing system is 80+ sessions deep - all it thinks about is routing. They dont do each others jobs. When something breaks in another domain they email each other. The orchestrator dispatches work to them and trusts them because they know their own code better than it does.

Every time I post about this someone asks what happens when two agents write the same file. Fair question. They cant. Not as in "we tell them not to" - there's a hook called pre_edit_gate that fires before every write. If an agent in branch A tries to edit a file in branch B's directory, the write gets rejected. Hard block. The agent sees "cross-branch write blocked" and has to either ask a trusted branch to make the change or send a mail request through drone. Only 3 branches in the whole system (the orchestrator, the auditor, and the factory that creates new agents) are allowed to cross-write. Everyone else is physically confined to their own directory. We also lock inboxes - agents cant forge messages by writing directly to another agent's mailbox file. They have to use the mail system. This isnt a convention. Its enforcement.

This week I stopped building features and started testing. Took an old MacBook, wiped it, installed Ubuntu from scratch. Cloned on a machine with nothing pre-configured. Found every setup blocker - git config missing, venv broken on fresh Ubuntu, hooks not wired. All fixed now. Install went from ~2GB down to ~100MB. Built a concierge agent that walks new users through onboarding - 12-stage flow, 243 tests on it. First impressions matter and ours was rough ngl.

95 stars. Small project. I'm a solo dev tbh and the agents help build and maintain themselves - every PR is human-AI collaboration. The hardest part hasn't been the code. It's explaining what this actually is. People hear "agents" and expect a task runner. This isnt that. Its infrastructure for building systems that remember and coordinate. What u put on top is up to u.

Has anyone else hit the identity drift problem? Genuinely curious how others solved it - or if most just threw more context at it and moved on.

reddit.com
u/Input-X — 12 days ago

Most multi-agent setups I've seen are basically a room full of people wearing headphones. Agents running in parallel, no shared awareness, no idea who's doing what. That's not collaboration. That's coexistence.

I've been building this in public for almost 12 weeks. 12 agents, 6,500+ tests, 95 stars. Here's what I actually learned.

The problem wasn't memory. It was identity. An agent would be technically correct but completely off base. Not hallucinating. Drifting. Like a competent person who walked into the wrong meeting and started contributing without realizing they're in the wrong room. I spent weeks on better memory - longer context, better embeddings, persistent state. None of it fixed the drift. The problem wasn't what the agent remembered - it didn't know who it was.

What fixed it was three files. Every agent gets a passport.json - who am I, what I do, what I dont do. Maybe 30 lines. Rarely changes. Then local.json - rolling session log, key learnings, caps at 20 entries and auto-archives to vector search when full. And observations.json - collaboration patterns, how I work with other agents. Identity loads first every session via hooks. Agent never starts cold.

I have 12 agents now and each one is a domain specialist. The mail system has 696 tests it built through its own bugs. Routing system is 80+ sessions deep - all it thinks about is routing. They dont do each others jobs. When something breaks in another domain they email each other. The orchestrator dispatches work to them and trusts them because they know their own code better than it does.

Every time I post about this someone asks what happens when two agents write the same file. Fair question. They cant. Not as in "we tell them not to" - there's a hook called pre_edit_gate that fires before every write. If an agent in branch A tries to edit a file in branch B's directory, the write gets rejected. Hard block. The agent sees "cross-branch write blocked" and has to either ask a trusted branch to make the change or send a mail request through drone. Only 3 branches in the whole system (the orchestrator, the auditor, and the factory that creates new agents) are allowed to cross-write. Everyone else is physically confined to their own directory. We also lock inboxes - agents cant forge messages by writing directly to another agent's mailbox file. They have to use the mail system. This isnt a convention. Its enforcement.

This week I stopped building features and started testing. Took an old MacBook, wiped it, installed Ubuntu from scratch. Cloned on a machine with nothing pre-configured. Found every setup blocker - git config missing, venv broken on fresh Ubuntu, hooks not wired. All fixed now. Install went from ~2GB down to ~100MB. Built a concierge agent that walks new users through onboarding - 12-stage flow, 243 tests on it. First impressions matter and ours was rough ngl.

95 stars. Small project. I'm a solo dev tbh and the agents help build and maintain themselves - every PR is human-AI collaboration. The hardest part hasn't been the code. It's explaining what this actually is. People hear "agents" and expect a task runner. This isnt that. Its infrastructure for building systems that remember and coordinate. What u put on top is up to u.

Has anyone else hit the identity drift problem? Genuinely curious how others solved it - or if most just threw more context at it and moved on.

reddit.com
u/Input-X — 14 days ago

Most multi-agent setups are a room full of people wearing headphones. Here's what I changed.

Most multi-agent setups I've seen are basically a room full of people wearing headphones. Agents running in parallel, no shared awareness, no idea who's doing what. That's not collaboration. That's coexistence.

I've been building this in public for almost 12 weeks. 12 agents, 6,500+ tests, 95 stars. Here's what I actually learned.

The problem wasn't memory. It was identity. An agent would be technically correct but completely off base. Not hallucinating. Drifting. Like a competent person who walked into the wrong meeting and started contributing without realizing they're in the wrong room. I spent weeks on better memory - longer context, better embeddings, persistent state. None of it fixed the drift. The problem wasn't what the agent remembered - it didn't know who it was.

What fixed it was three files. Every agent gets a passport.json - who am I, what I do, what I dont do. Maybe 30 lines. Rarely changes. Then local.json - rolling session log, key learnings, caps at 20 entries and auto-archives to vector search when full. And observations.json - collaboration patterns, how I work with other agents. Identity loads first every session via hooks. Agent never starts cold.

I have 12 agents now and each one is a domain specialist. The mail system has 696 tests it built through its own bugs. Routing system is 80+ sessions deep - all it thinks about is routing. They dont do each others jobs. When something breaks in another domain they email each other. The orchestrator dispatches work to them and trusts them because they know their own code better than it does.

Every time I post about this someone asks what happens when two agents write the same file. Fair question. They cant. Not as in "we tell them not to" - there's a hook called pre_edit_gate that fires before every write. If an agent in branch A tries to edit a file in branch B's directory, the write gets rejected. Hard block. The agent sees "cross-branch write blocked" and has to either ask a trusted branch to make the change or send a mail request through drone. Only 3 branches in the whole system (the orchestrator, the auditor, and the factory that creates new agents) are allowed to cross-write. Everyone else is physically confined to their own directory. We also lock inboxes - agents cant forge messages by writing directly to another agent's mailbox file. They have to use the mail system. This isnt a convention. Its enforcement.

This week I stopped building features and started testing. Took an old MacBook, wiped it, installed Ubuntu from scratch. Cloned on a machine with nothing pre-configured. Found every setup blocker - git config missing, venv broken on fresh Ubuntu, hooks not wired. All fixed now. Install went from ~2GB down to ~100MB. Built a concierge agent that walks new users through onboarding - 12-stage flow, 243 tests on it. First impressions matter and ours was rough ngl.

95 stars. Small project. I'm a solo dev tbh and the agents help build and maintain themselves - every PR is human-AI collaboration. The hardest part hasn't been the code. It's explaining what this actually is. People hear "agents" and expect a task runner. This isnt that. Its infrastructure for building systems that remember and coordinate. What u put on top is up to u.

Has anyone else hit the identity drift problem? Genuinely curious how others solved it - or if most just threw more context at it and moved on.

reddit.com
u/Input-X — 14 days ago

Everyone's building memory layers right now. Longer context, better embeddings, persistent state across sessions. I spent weeks on the same thing.

But the failure mode that actually cost me the most debugging time had nothing to do with memory.

Here's what it looked like: an agent would be technically correct - good reasoning, clean output - but operating from the wrong context entirely. Answering questions nobody asked. Taking actions outside its scope. Not hallucinating. Drifting. Like a competent person who walked into the wrong meeting and started contributing without realizing they're in the wrong room.

I run 11 persistent agents locally. Each one is a domain specialist - its entire life is one thing. The mail agent's every session, every test, every bug fix is about routing messages. The standards auditor's whole existence is quality checks. They're not generic workers configured for a task. They've each accumulated dozens of sessions of operational history in their domain, and that history is what makes them good at their job.

When they started drifting, my first instinct was what everyone's instinct is: better memory. More context. None of it helped. An agent with perfect recall of its last 50 sessions would still lose track of who it was in session 51.

What actually fixed it

I separated identity from memory entirely. Three files per agent:

passport.json - who you are. Role, purpose, principles. Rarely changes. This is the anchor.

local.json - what happened. Rolling session history, key learnings. Capped and trimmed when it fills up.

observations.json - what you've noticed about the humans and agents you work with. Concrete stuff like "the git agent needs 2 retries on large diffs" or "quality audits overcorrect on technical claims." The agent writes these itself based on what actually happens.

Identity loads first, then memory, then observations. That ordering matters. When the identity file loads first, the agent has a stable reference point before any history lands.

The mail routing agent learned the sharpest version of this. When identity was ambiguous, it would route messages from the wrong sender. The fix wasn't better routing logic - it was: fail loud when identity is unclear. Wrong identity is worse than silence.

The files alone weren't enough

Three JSON files helped, but didn't scale past a few agents. What actually made 11 work is that none of them need to understand the full system. Hooks inject context automatically every session - project rules, branch instructions, current plan. One command reaches any agent. Memory auto-archives when it fills up. Plans keep work focused so agents don't carry their entire history in context.

The system learned from failing. The agents communicate through a local email system - they send each other tasks, status updates, bug reports. One agent monitors all logs for errors. When it spots something, it emails the agent who owns that domain and wakes them up to investigate. The agents fix each other. The memory agent iterated three sessions to fix a single rollover boundary condition - each time it shipped, observed a new edge case, and improved. These aren't cold modules. They break, they help each other fix it, they get better. That's how the system got to where it is.

You don't need 11 agents

The 11 agents in my setup maintain the framework itself. That's the reference implementation. But u could start with one agent on a side project - just identity and memory, pick up where u left off tomorrow. Need a team? Add a backend agent, a frontend agent, a design researcher. Three agents, same pattern, same commands. Or scale to 30 for a bigger system. Each new agent is one command and the same structure.

What this doesn't solve

This all runs locally on one machine. I don't know whether identity drift looks the same in hosted environments. If u run stateless agents behind an API, the problem might not exist for you.

Small project, small community, growing. The pattern itself is small enough to steal - three JSON files and a convention. But the system that keeps agents coherent at scale is where the real work went.

pip install aipass and two commands to get a working agent. The .trinity/ directory is the identity layer.

Has anyone else tried separating identity from memory in their agent setups? Curious whether the ordering matters in other architectures, or if it's just an artifact of how this system evolved.

reddit.com
u/Input-X — 25 days ago

I've been building this repo public since day one, roughly 7 weeks now with Claude Code. Here's where it's at. Feels good to be so close.

The short version: AIPass is a local CLI framework where AI agents have persistent identity, memory, and communication. They share the same filesystem, same project, same files - no sandboxes, no isolation. pip install aipass, run two commands, and your agent picks up where it left off tomorrow.

You don't need 11 agents to get value. One agent on one project with persistent memory is already a different experience. Come back the next day, say hi, and it knows what you were working on, what broke, what the plan was. No re-explaining. That alone is worth the install.

What I was actually trying to solve: AI already remembers things now - some setups are good, some are trash. That part's handled. What wasn't handled was me being the coordinator between multiple agents - copying context between tools, keeping track of who's doing what, manually dispatching work. I was the glue holding the workflow together. Most multi-agent frameworks run agents in parallel, but they isolate every agent in its own sandbox. One agent can't see what another just built. That's not a team.

That's a room full of people wearing headphones.

So the core idea: agents get identity files, session history, and collaboration patterns - three JSON files in a .trinity/ directory. Plain text, git diff-able, no database. But the real thing is they share the workspace. One agent sees what another just committed. They message each other through local mailboxes. Work as a team, or alone. Have just one agent helping you on a project, party plan, journal, hobby, school work, dev work - literally anything you can think of. Or go big, 50 agents building a rocketship to Mars lol. Sup Elon.

There's a command router (drone) so one command reaches any agent.

pip install aipass

aipass init

aipass init agent my-agent

cd my-agent

claude # codex or gemini too, mostly claude code tested rn

Where it's at now: 11 agents, 4,000+ tests, 400+ PRs (I know), automated quality checks across every branch. Works with Claude Code, Codex, and Gemini CLI. It's on PyPI. Tonight I created a fresh test project, spun up 3 agents, and had them test every service from a real user's perspective - email between agents, plan creation, memory writes, vector search, git commits. Most things just worked. The bugs I found were about the framework not monitoring external projects the same way it monitors itself. Exactly the kind of stuff you only catch by eating your own dogfood.

Recent addition I'm pretty happy with: watchdog. When you dispatch work to an agent, you used to just... hope it finished. Now watchdog monitors the agent's process and wakes you when it's done - whether it succeeded, crashed, or silently exited without finishing. It's the difference between babysitting your agents and actually trusting them to work while you do something else. 5 handlers, 130 tests, replaced a hacky bash one-liner.

Coming soon: an onboarding agent that walks new users through setup interactively - system checks, first agent creation, guided tour. It's feature-complete, just in final testing. Also working on automated README updates so agents keep their own docs current without being told.

I'm a solo dev but every PR is human-AI collaboration - the agents help build and maintain themselves. 105 sessions in and the framework is basically its own best test case.

reddit.com
u/Input-X — 28 days ago
▲ 38 r/aitoolbase+16 crossposts

I've been building this repo public since day one, roughly 7 weeks now with Claude Code. Here's where it's at. Feels good to be so close.

The short version: AIPass is a local CLI framework where AI agents have persistent identity, memory, and communication. They share the same filesystem, same project, same files - no sandboxes, no isolation. pip install aipass, run two commands, and your agent picks up where it left off tomorrow.

You don't need 11 agents to get value. One agent on one project with persistent memory is already a different experience. Come back the next day, say hi, and it knows what you were working on, what broke, what the plan was. No re-explaining. That alone is worth the install.

What I was actually trying to solve: AI already remembers things now - some setups are good, some are trash. That part's handled. What wasn't handled was me being the coordinator between multiple agents - copying context between tools, keeping track of who's doing what, manually dispatching work. I was the glue holding the workflow together. Most multi-agent frameworks run agents in parallel, but they isolate every agent in its own sandbox. One agent can't see what another just built. That's not a team.

That's a room full of people wearing headphones.

So the core idea: agents get identity files, session history, and collaboration patterns - three JSON files in a .trinity/ directory. Plain text, git diff-able, no database. But the real thing is they share the workspace. One agent sees what another just committed. They message each other through local mailboxes. Work as a team, or alone. Have just one agent helping you on a project, party plan, journal, hobby, school work, dev work - literally anything you can think of. Or go big, 50 agents building a rocketship to Mars lol. Sup Elon.

There's a command router (drone) so one command reaches any agent.

pip install aipass

aipass init

aipass init agent my-agent

cd my-agent

claude # codex or gemini too, mostly claude code tested rn

Where it's at now: 11 agents, 4,000+ tests, 400+ PRs (I know), automated quality checks across every branch. Works with Claude Code, Codex, and Gemini CLI. It's on PyPI. Tonight I created a fresh test project, spun up 3 agents, and had them test every service from a real user's perspective - email between agents, plan creation, memory writes, vector search, git commits. Most things just worked. The bugs I found were about the framework not monitoring external projects the same way it monitors itself. Exactly the kind of stuff you only catch by eating your own dogfood.

Recent addition I'm pretty happy with: watchdog. When you dispatch work to an agent, you used to just... hope it finished. Now watchdog monitors the agent's process and wakes you when it's done - whether it succeeded, crashed, or silently exited without finishing. It's the difference between babysitting your agents and actually trusting them to work while you do something else. 5 handlers, 130 tests, replaced a hacky bash one-liner.

Coming soon: an onboarding agent that walks new users through setup interactively - system checks, first agent creation, guided tour. It's feature-complete, just in final testing. Also working on automated README updates so agents keep their own docs current without being told.

I'm a solo dev but every PR is human-AI collaboration - the agents help build and maintain themselves. 105 sessions in and the framework is basically its own best test case.

https://github.com/AIOSAI/AIPass

u/Input-X — 19 days ago