u/Comfortable_Gas_3046

AICTX - building a local MCP server for repo-level continuity between coding-agent sessions

I’ve been working on a problem that I think many people using Codex / Claude / Copilot-style coding agents will recognize:

Every new session tends to rediscover the same repo structure, files, decisions, failed commands, current task state, and validation steps.

That wastes context, tokens, and time.

I have been trying to work around that problem with different approaches: small handoffs, heavy memory systems, context engines...

I finally found it:

Operational continuity for AI coding agents.

AICTX that keeps useful execution state inside the repository, so the next coding-agent session can resume from what actually happened instead of inferring everything again from README + chat history.

Why MCP?

The CLI flow works, but relying only on repo instructions and shell commands still has a weakness:

the agent has to remember to run the right command at the right time.

So I’m now adding a local MCP server.

The goal is to expose AICTX continuity directly as MCP tools/resources/prompts, so compatible agents can discover and call the continuity layer natively.

The intended flow is:

agent starts task
  -&gt; calls aictx_resume
  -&gt; gets current repo continuity
  -&gt; works
  -&gt; calls aictx_finalize
  -&gt; continuity is persisted for the next session

The MCP server is local-first:

aictx mcp-server --repo . --profile full

No cloud memory.
No hidden sync.
No generic shell/filesystem server.
No daemon the user has to manage manually.

Compatible clients can launch it locally over stdio.

What the MCP exposes

The MCP server is meant to expose AICTX as repo-local continuity tools, not as a general machine-control layer.

Examples:

aictx_resume
aictx_finalize
aictx_view
aictx_map_query
aictx_task_update
aictx_portability_status

The idea is MCP-first, CLI fallback.

If MCP tools are available, the agent should call them.
If MCP is not available, the agent can still fall back to:

aictx resume --repo . --task "&lt;task goal&gt;" --json
aictx finalize --repo . --status success --summary "&lt;what changed&gt;" --json

What continuity means here

AICTX stores operational state inside .aictx/, such as:

active Work State;
next actions;
decisions and handoffs;
failure memory;
validation evidence;
execution summaries;
repo context relevant to the current task;
execution contracts;
continuity reports.

The point is not to dump more hidden memory into the prompt.

The point is to let the agent request the right operational context when it needs it.

Plugins

I’m also packaging Claude Code and Codex plugin artifacts around the same model:

MCP-first when available.
CLI fallback when not.

Copilot support is more best-effort for now through repo instructions and VS Code MCP config where supported.

A few parts I’m currently focusing on:

Execution Contracts

Each resume can include a compact contract for the next agent: first action, edit scope, canonical validation command, expected evidence, and finalize instruction. The goal is not only “remember context”, but guide the next execution safely.

Continuity View

One thing I’m also experimenting with a deterministic Mermaid continuity view generated from repo-local AICTX artifacts. It shows the current operational state of the repo visually: Work State, open handoffs, relevant failures, execution contracts, summaries, RepoMap hints, and portable continuity status.

Here you can see what it looks like. The link to this view can be returned after each task, so the next session has an inspectable continuity map. I’m still working on making it easier to read.

Portability

The continuity lives with the repository. The idea is that useful operational state should not be locked inside one chat, one vendor, one local machine, or one agent tool. If the repo moves, the continuity can move with it ... if you want it to!

The medium-term benefit

Agent-based development starts to feel less like a sequence of isolated chats and more like an ongoing engineering process:

less rediscovery;
fewer repeated failed commands;
clearer handoffs;
better validation discipline;
less instruction boilerplate once agents can call AICTX through MCP;
a cleaner path for Claude, Codex and Copilot integrations;
easier switching between Codex, Claude, Copilot or other agents;
and a repo that can explain its current state to the next session.

GitHub: https://github.com/oldskultxo/aictx

Docs: https://aictx.org

I would love technical feedback from people building or using MCP servers, especially around:

tool boundaries;
local stdio server design;
how much schema/detail to expose;
avoiding tool/context bloat;
whether this kind of repo-level continuity belongs in MCP;
how to make the lifecycle reliable without over-automating it.

Collaborators and critical feedback welcome.

reddit.com

u/Comfortable_Gas_3046 — 16 hours ago

▲ 1 r/mcp

AICTX - building a local MCP server for repo-level continuity between coding-agent sessions

I’ve been working on a problem that I think many people using Codex / Claude / Copilot-style coding agents will recognize:

Every new session tends to rediscover the same repo structure, files, decisions, failed commands, current task state, and validation steps.

That wastes context, tokens, and time.

I have been trying to work around that problem with different approaches: small handoffs, heavy memory systems, context engines...

I finally found it:

Operational continuity for AI coding agents.

Why MCP?

The CLI flow works, but relying only on repo instructions and shell commands still has a weakness:

the agent has to remember to run the right command at the right time.

So I’m now adding a local MCP server.

The goal is to expose AICTX continuity directly as MCP tools/resources/prompts, so compatible agents can discover and call the continuity layer natively.

The intended flow is:

agent starts task
  -&gt; calls aictx_resume
  -&gt; gets current repo continuity
  -&gt; works
  -&gt; calls aictx_finalize
  -&gt; continuity is persisted for the next session

The MCP server is local-first:

aictx mcp-server --repo . --profile full

No cloud memory.
No hidden sync.
No generic shell/filesystem server.
No daemon the user has to manage manually.

Compatible clients can launch it locally over stdio.

What the MCP exposes

The MCP server is meant to expose AICTX as repo-local continuity tools, not as a general machine-control layer.

Examples:

aictx_resume
aictx_finalize
aictx_view
aictx_map_query
aictx_task_update
aictx_portability_status

The idea is MCP-first, CLI fallback.

If MCP tools are available, the agent should call them.
If MCP is not available, the agent can still fall back to:

aictx resume --repo . --task "&lt;task goal&gt;" --json
aictx finalize --repo . --status success --summary "&lt;what changed&gt;" --json

What continuity means here

AICTX stores operational state inside .aictx/, such as:

active Work State;
next actions;
decisions and handoffs;
failure memory;
validation evidence;
execution summaries;
repo context relevant to the current task;
execution contracts;
continuity reports.

The point is not to dump more hidden memory into the prompt.

The point is to let the agent request the right operational context when it needs it.

Plugins

I’m also packaging Claude Code and Codex plugin artifacts around the same model:

MCP-first when available.
CLI fallback when not.

Copilot support is more best-effort for now through repo instructions and VS Code MCP config where supported.

A few parts I’m currently focusing on:

Execution Contracts

Continuity View

Here you can see what it looks like. The link to this view can be returned after each task, so the next session has an inspectable continuity map. I’m still working on making it easier to read.

Portability

The medium-term benefit

Agent-based development starts to feel less like a sequence of isolated chats and more like an ongoing engineering process:

less rediscovery;
fewer repeated failed commands;
clearer handoffs;
better validation discipline;
less instruction boilerplate once agents can call AICTX through MCP;
a cleaner path for Claude, Codex and Copilot integrations;
easier switching between Codex, Claude, Copilot or other agents;
and a repo that can explain its current state to the next session.

GitHub: https://github.com/oldskultxo/aictx

Docs: https://aictx.org

I would love technical feedback from people building or using MCP servers, especially around:

tool boundaries;
local stdio server design;
how much schema/detail to expose;
avoiding tool/context bloat;
whether this kind of repo-level continuity belongs in MCP;
how to make the lifecycle reliable without over-automating it.

Collaborators and critical feedback welcome.

reddit.com

u/Comfortable_Gas_3046 — 17 hours ago

▲ 2 r/coolgithubprojects

AICTX - I am building a repo-local continuity runtime for coding agents and ... I think I finally got it right! The next agent starts from what actually happened, not from zero.

The problem was simple and well known by every coder / vibecoder: every new Codex / Claude / Copilot session kept rediscovering the same repo structure, files, decisions, failed commands, current task state, and validation steps, wasting context and tokens.

I have been trying to work around that problem with different approaches: small handoffs, heavy memory systems, context engines...

I finally found it: Operational continuity for AI coding agents. I built an open-source (Python CLI) continuity runtime so agents don’t restart from zero every session, and it already made my own AI coding workflow feel much less like restarting from scratch every time.

The continuity idea is not to add more hidden memory or dump more context into the prompt. AICTX keeps operational continuity inside the repo:

active Work State;
next actions;
decisions and handoffs;
failure memory;
validation evidence;
execution summaries;
repo context relevant to the current task.

The next agent should resume from what actually happened, not infer everything again from README + chat history.

A few parts I’m currently focusing on:

Execution Contracts

Continuity View

I’m experimenting with a deterministic Mermaid continuity view generated from repo-local AICTX artifacts. It shows the current operational state of the repo visually: Work State, open handoffs, relevant failures, execution contracts, summaries, RepoMap hints, and portable continuity status.

Here you can see what it looks like. The link to this view can be returned after each task, so the next session has an inspectable continuity map. I’m still working on making it easier to read.

Portability

Easy to use

pip install aictx
aictx install
aictx init

# Then keep using your coding agent normally, they will take care of use it

MCP support

I'm also working on MCP support so compatible agents can access AICTX continuity directly as tools, resources and prompts instead of only relying on repo instructions and CLI commands.

The MCP server is local-first. It is not a cloud memory service, not a daemon you have to manage manually, and not a generic shell/filesystem server. Compatible agents launch it locally through stdio:

aictx mcp-server --repo . --profile full

I’m also packaging Claude Code and Codex plugin artifacts around the same model: MCP-first when available, CLI fallback when not. Copilot support remains best-effort through repo instructions and VS Code MCP config where supported.

The medium-term benefit

Agent-based development starts to feel less like a sequence of isolated chats and more like an ongoing engineering process:

less rediscovery;
fewer repeated failed commands;
clearer handoffs;
better validation discipline;
less instruction boilerplate once agents can call AICTX through MCP;
a cleaner path for Claude, Codex and Copilot integrations;
easier switching between Codex, Claude, Copilot or other agents;
and a repo that can explain its current state to the next session.

GitHub: https://github.com/oldskultxo/aictx

Docs: https://aictx.org

I would love technical feedback, especially from people using coding agents across multiple sessions.

Collaborators welcome! It is still evolving!

u/Comfortable_Gas_3046 — 1 day ago

▲ 6 r/machinelearningnews

Experimenting with continuity, Ifinally got it right! The next agent starts from what actually happened, not from zero.

I have been trying to work around that problem with different approaches: small handoffs, heavy memory systems, context engines...

The continuity idea is not to add more hidden memory or dump more context into the prompt. AICTX keeps operational continuity inside the repo:

active Work State;
next actions;
decisions and handoffs;
failure memory;
validation evidence;
execution summaries;
repo context relevant to the current task.

The next agent should resume from what actually happened, not infer everything again from README + chat history.

A few parts I’m currently focusing on:

Execution Contracts

Continuity View

Here you can see what it looks like. The link to this view can be returned after each task, so the next session has an inspectable continuity map. I’m still working on making it easier to read.

Portability

Easy to use

pip install aictx
aictx install
aictx init

# Then keep using your coding agent normally, they will take care of use it

MCP support

I'm also working on MCP support so compatible agents can access AICTX continuity directly as tools, resources and prompts instead of only relying on repo instructions and CLI commands.

aictx mcp-server --repo . --profile full

The medium-term benefit

Agent-based development starts to feel less like a sequence of isolated chats and more like an ongoing engineering process:

less rediscovery;
fewer repeated failed commands;
clearer handoffs;
better validation discipline;
less instruction boilerplate once agents can call AICTX through MCP;
a cleaner path for Claude, Codex and Copilot integrations;
easier switching between Codex, Claude, Copilot or other agents;
and a repo that can explain its current state to the next session.

GitHub: https://github.com/oldskultxo/aictx

Docs: https://aictx.org

I would love technical feedback, especially from people using coding agents across multiple sessions.

Collaborators welcome! It is still evolving!

reddit.com

u/Comfortable_Gas_3046 — 1 day ago

▲ 1 r/Agentic_Marketing

I finally got it right! The next agent starts from what actually happened, not from zero.

I have been trying to work around that problem with different approaches: small handoffs, heavy memory systems, context engines...

The continuity idea is not to add more hidden memory or dump more context into the prompt. AICTX keeps operational continuity inside the repo:

active Work State;
next actions;
decisions and handoffs;
failure memory;
validation evidence;
execution summaries;
repo context relevant to the current task.

The next agent should resume from what actually happened, not infer everything again from README + chat history.

A few parts I’m currently focusing on:

Execution Contracts

Continuity View

Here you can see what it looks like. The link to this view can be returned after each task, so the next session has an inspectable continuity map. I’m still working on making it easier to read.

Portability

Easy to use

pip install aictx
aictx install
aictx init

# Then keep using your coding agent normally, they will take care of use it

The medium-term benefit

Agent-based development starts to feel less like a sequence of isolated chats and more like an ongoing engineering process:

less rediscovery;
fewer repeated failed commands;
clearer handoffs;
better validation discipline;
easier switching between Codex, Claude, Copilot or other agents;
and a repo that can explain its current state to the next session.

GitHub: https://github.com/oldskultxo/aictx

Docs: https://aictx.org

I would love technical feedback, especially from people using coding agents across multiple sessions.

Collaborators welcome! It is still evolving!

reddit.com

u/Comfortable_Gas_3046 — 2 days ago

▲ 6 r/VibeCodersNest

I finally got it right! The next agent starts from what actually happened, not from zero.

I have been trying to work around that problem with different approaches: small handoffs, heavy memory systems, context engines...

The continuity idea is not to add more hidden memory or dump more context into the prompt. AICTX keeps operational continuity inside the repo:

active Work State;
next actions;
decisions and handoffs;
failure memory;
validation evidence;
execution summaries;
repo context relevant to the current task.

The next agent should resume from what actually happened, not infer everything again from README + chat history.

A few parts I’m currently focusing on:

Execution Contracts

Continuity View

Here you can see what it looks like. The link to this view can be returned after each task, so the next session has an inspectable continuity map. I’m still working on making it easier to read.

Portability

Easy to use

pip install aictx
aictx install
aictx init

# Then keep using your coding agent normally, they will take care of use it

MCP support

I'm also working on MCP support so compatible agents can access AICTX continuity directly as tools, resources and prompts instead of only relying on repo instructions and CLI commands.

aictx mcp-server --repo . --profile full

The medium-term benefit

Agent-based development starts to feel less like a sequence of isolated chats and more like an ongoing engineering process:

less rediscovery;
fewer repeated failed commands;
clearer handoffs;
better validation discipline;
less instruction boilerplate once agents can call AICTX through MCP;
a cleaner path for Claude, Codex and Copilot integrations;
easier switching between Codex, Claude, Copilot or other agents;
and a repo that can explain its current state to the next session.

GitHub: https://github.com/oldskultxo/aictx

Docs: https://aictx.org

I would love technical feedback, especially from people using coding agents across multiple sessions.

Collaborators welcome! It is still evolving!

reddit.com

u/Comfortable_Gas_3046 — 2 days ago

▲ 1 r/IMadeThis

An inspectable repo-local continuity runtime for coding agents. Stop restarting from zero each session

The problem was simple: every new Codex / Claude / Copilot session kept rediscovering the same repo structure, files, decisions, failed commands, current task state, and validation steps, wasting context and tokens.

The continuity idea is not to add more hidden memory or dump more context into the prompt. AICTX keeps operational continuity inside the repo:

active Work State;
next actions;
decisions and handoffs;
failure memory;
validation evidence;
execution summaries;
repo context relevant to the current task.

The next agent should resume from what actually happened, not infer everything again from README + chat history.

A few parts I’m currently focusing on:

Execution Contracts

Continuity View

Here you can see what it looks like. The link to this view can be returned after each task, so the next session has an inspectable continuity map. I’m still working on making it easier to read.

Portability

Easy to use

pip install aictx
aictx install
aictx init

# Then keep using your coding agent normally, they will take care of use it

The medium-term benefit

Agent-based development starts to feel less like a sequence of isolated chats and more like an ongoing engineering process:

less rediscovery;
fewer repeated failed commands;
clearer handoffs;
better validation discipline;
easier switching between Codex, Claude, Copilot or other agents;
and a repo that can explain its current state to the next session.

It is still evolving, but it already made my own AI coding workflow feel much less like restarting from scratch every time.

GitHub: https://github.com/oldskultxo/aictx

Docs: https://aictx.org

I would love technical feedback, especially from people using coding agents across multiple sessions.

Collaborators welcome!

reddit.com

u/Comfortable_Gas_3046 — 3 days ago

▲ 1 r/AIDeveloperNews

I built an open-source (Python CLI) continuity runtime for AI coding agents so they don’t restart from zero every session.

The continuity idea is not to add more hidden memory or dump more context into the prompt. AICTX keeps operational continuity inside the repo:

active Work State;
next actions;
decisions and handoffs;
failure memory;
validation evidence;
execution summaries;
repo context relevant to the current task.

The next agent should resume from what actually happened, not infer everything again from README + chat history.

A few parts I’m currently focusing on:

Execution Contracts

Continuity View

Here you can see what it looks like. The link to this view can be returned after each task, so the next session has an inspectable continuity map. I’m still working on making it easier to read.

Portability

Easy to use

pip install aictx
aictx install
aictx init

# Then keep using your coding agent normally, they will take care of use it

The medium-term benefit

Agent-based development starts to feel less like a sequence of isolated chats and more like an ongoing engineering process:

less rediscovery;
fewer repeated failed commands;
clearer handoffs;
better validation discipline;
easier switching between Codex, Claude, Copilot or other agents;
and a repo that can explain its current state to the next session.

It is still evolving, but it already made my own AI coding workflow feel much less like restarting from scratch every time.

GitHub: https://github.com/oldskultxo/aictx

Docs: https://aictx.org

I would love technical feedback, especially from people using coding agents across multiple sessions.

Collaborators welcome!

reddit.com

u/Comfortable_Gas_3046 — 3 days ago

▲ 6 r/AIAgentsInAction+2 crossposts

It is not only about memory or context, think about continuity

I’ve been experimenting with a repo-local continuity runtime for coding agents. Not another memory system, not a context engine

The problem I’m trying to solve is specifically the following:

Every new agent session still feels like onboarding a junior dev into the repo again.

It scans broad docs, rediscovers structure, repeats failed commands, loses unfinished work, and depends too much on chat history.

I want a veteran engineer used to work in my huge projects every session. Without rediscovering and understanding whole repo once and again. So that is why I started working on aictx.

aictx adds a small local runtime loop:


aictx resume --repo . --task "what I’m doing" --json

# agent works

aictx finalize --repo . --status success --summary "what happened" --json

The next session can start from repo-local facts:

active task state
previous handoff
decisions
known failures
successful strategies
optional RepoMap structural hints
contract/compliance gaps from the previous run

Latest thing I’ve been working on: git-portable continuity.

By default, .aictx stays local. But now you can opt in to a team-safe mode where a safe subset of continuity artifacts travels with the repo through Git — no cloud sync, no hosted memory, no hidden dashboard.

It keeps volatile stuff local:

metrics, logs, session identity, generated capsules, indexes.

And only exposes durable continuity:

handoffs, decisions, failure memory, strategy memory, task threads, semantic/area shards.

The goal is not to replace coding agents.

It’s to make the next session behave less like a stranger and more like someone who remembers the repo’s recent work.

Website: https://aictx.org

GitHub: https://github.com/oldskultxo/aictx

I’d love feedback from people using Codex, Claude Code, Copilot, Cursor, or similar tools across repeated sessions in the same repo.

u/Comfortable_Gas_3046 — 1 day ago

▲ 5 r/github

How long do GitHub impersonation reports usually take?

Hi!

I recently ran into a situation I had never dealt with before as an open-source maintainer, and I am wondering what to expect from GitHub Support.

I maintain a small open-source developer tool. Recently I found another project in the same niche using a very similar identity around the project name: GitHub namespace, package naming, CLI naming, domain, and general positioning. Almost same readme (copy/paste detected), same tool name. Same functionallity but using nmp instead pypi.

I am not going to name either project here because I do not want this to become a public pile-on or look like I am asking people to judge the other maintainer. I also do not want to assume bad faith publicly. But from a user perspective, the overlap is strong enough that someone searching for my project could reasonably think the other one is the official project, an affiliated package, or a new distribution channel.

I tried to contact the person behind it through LinkedIn first, but I was not able to reach them. Since I could not get a direct conversation going, I opened a GitHub impersonation / confusing namespace report and included the evidence: project name, package name, CLI name, domain, public dates, and the risk of user confusion.

In the meantime, I ended up buying the .org domain for my project, which at least fits an open-source project reasonably well. I also had to improve the website, GitHub Pages setup, sitemap, Search Console, README links, PyPI links, and SEO signals just to make the official identity clearer.

Honestly, it is frustrating (and sad). I expected to spend time improving the project, not defending whether users can even find the correct one.

For people who have dealt with this before:

How long do GitHub impersonation or confusing namespace reports usually take to get reviewed?

Is there anything else I should do while waiting, apart from keeping the public messaging neutral and strengthening the official project links?

Any help appreciated as I said, it is sad, mostly being a opensource project.

Thanks in advance

reddit.com

u/Comfortable_Gas_3046 — 6 days ago

▲ 1 r/PythonProjects2

Built a small open-source CLI to give coding agents repo-local continuity between sessions

The Story

I’ve been working with coding agents for quite a while now.

I’ve been working as a software engineer for more than 15 years, and at first it was hard for me to accept that the rules of the game had changed forever.

Now, honestly, I’m pretty much surrendered to the quality of the code and reasoning these agents can produce. Many times they are better programmers than me. I don’t have many doubts about that.

But there is still something I haven’t fully been able to feel.

I haven’t managed to feel that I’m working side by side with an engineer who knows the repository. Someone who is used to the project’s codebase, its strategies, its typical errors, the commands that should be run and the ones that shouldn’t.

I miss the feeling that the agent (I usually work with Codex and Claude, although mainly with Codex ) is a veteran teammate, not a rookie who has to review the whole repo, starting from the README and the Makefile, before writing a single line of code.

At first I thought it was all about refining prompts.

Then I focused on operational memory, skills, MCPs, rules, global instructions, AGENTS.md, CLAUDE.md, and everything I kept reading over and over again in articles and posts.

I also had a “context” phase. I became obsessed with improving the context my agent was working with.

And yet I still had the same feeling.

The more I obsessed over prompts, memory, skills, and context, the more I started to feel that what the agent was missing was continuity.

Not chat memory.
Not a vector DB full of random chunks.
Something more human. Something closer to what a teammate would ask on their first day at work:

Where were we?
What did we do yesterday?
What hypotheses did we discard?
Which file mattered?
Which test was the right one?
What should I not touch?
Where do I start?

Since I work intensively in large repositories, I saw a major limitation in Codex starting every session again from the README. It frustrated me to watch it rediscover the repo, try overly broad commands, or attempt to run huge test suites that had nothing to do with the task at hand.

So I started building a tool focused on operational continuity.

I called it AICTX.

My Approach

In one sentence: aictx is a repo-local continuity runtime for coding agents.

The idea is that each new session behaves less like an isolated prompt and more like the same repo-native engineer continuing previous work.

After many iterations, the workflow has consolidated into something like this:

user prompt
→ agent extracts a narrow task goal
→ aictx resume gives repo-local continuity
→ agent receives an execution contract
→ agent works
→ aictx finalize stores what happened
→ next session starts from continuity, not from zero
→ the user receives feedback about continuity

AICTX stores and reuses things like work state, handoffs, decisions, failure memory, strategy memory, execution summaries, RepoMap hints, execution contracts, and contract compliance signals.
All of them are auditable artifacts that are easy to inspect at repo level.

Runtime flow diagram

On the other hand, one of the things I like most about the tool is that I can enable portability and keep the most important continuity artifacts versioned, so I can continue the task on my personal laptop, my work laptop, or anywhere else.

The execution contract part feels especially interesting to me. Instead of giving the agent a vague block of context, AICTX tries to give it an operational route:

first_action
edit_scope
test_command
finalize_command
contract_strength

Testing It

I wanted to check whether this actually worked, not just rely on my own impressions while watching the agent work with AICTX. So I created a small Python demo repo and ran the same two-session task twice:

Before talking about the test itself, it’s worth stressing that I mainly work with Codex, so the test has the most validity and accuracy with Codex.

one branch using AICTX (https://github.com/oldskultxo/aictx-demo-taskflow/tree/with_aictx);
one branch without AICTX (https://github.com/oldskultxo/aictx-demo-taskflow/tree/without_aictx).

The task was intentionally simple: add support for a new BLOCKED status, and then continue in a second session to validate parser edge cases.

This is important: the demo is not designed under conditions where AICTX has the maximum possible advantage. The repository is small, the task is simple, and the continuation prompt without AICTX includes enough manual context.

Even so, in the second session a clear difference appeared.
(note: all demo metrics are available at demo repository)

Session 2

Metric	with_aictx	without_aictx	Difference


Files explored	5	10	-50.0%
Files edited	1	3	-66.7%
Commands run	8	15	-46.7%
Tests run	1	4	-75.0%
Exploration steps before first edit	6	15	-60.0%
Time to complete	72s	119s	-39.5%
Total tokens	208,470	296,157	-29.6%
API reference cost	$0.5983	$0.8789	-31.9%

The most interesting difference for me was not the tokens. It was where the agent started.

With AICTX:

first_relevant_file = tests/test_parser.py
first_edit_file     = tests/test_parser.py

Without AICTX:

first_relevant_file = README.md
first_edit_file     = src/taskflow/parser.py

That is exactly what I wanted to measure.

With AICTX, the second session behaved more like an operational continuation.
Without AICTX, it behaved more like a new agent reconstructing the state of the project.

Across both sessions, the savings were more moderate:

Metric	with_aictx	without_aictx	Difference


Files explored	13	19	-31.6%
Commands run	19	26	-26.9%
Tests run	3	6	-50.0%
Time to complete	166s	222s	-25.2%
Total tokens	455,965	492,800	-7.5%
API reference cost	$1.3129	$1.4591	-10.0%

About Results

Honest result: AICTX did not magically win at everything.

In the first session, it had overhead. There wasn’t much accumulated continuity to reuse yet, so it doesn’t make sense to sell it as a universal token saver.

There is also another important nuance: the execution without AICTX found and fixed an additional edge case related to UTF-8 BOM input. So I also wouldn’t say that AICTX produced “better code.”

The honest conclusion would be this:

For me, this fits the initial hypothesis quite well:

AICTX is not a magical token saver.
It has overhead in the first session.
Its value appears when work continues across sessions.
The real problem is not just “giving the model more context.”
The problem is making each agent session feel less like starting from zero.

And I suspect this demo actually reduces the real size of the problem. In a large repo, where the previous session left decisions, failed attempts, scope boundaries, correct test commands, and known risks, continuity should matter more.

Final Thoughts

I still don’t fully get the feeling of continuity I’m looking for, but I’m starting to get closer. To push that feeling a bit further, AICTX makes the agent give operational-continuity feedback to the user through a startup banner at the beginning of each session and a summary output at the end of each execution.

Feedback example of a demo session

The tool is still evolving, and I’m still scaling it while trying to solve my own pains. I’d love to receive feedback: positive things, criticism, possible improvements, issues people notice, or even PRs if anyone feels like contributing.

If anyone wants to try it:

pipx install aictx
aictx install
cd repo_path
aictx init

# then just work with your coding agent as usual

I’d be really happy if it ends up being useful to someone along the way.

reddit.com

u/Comfortable_Gas_3046 — 10 days ago

▲ 0 r/LLMDevs

Coding agents don’t need more context. They need continuity.

I’ve been working with coding agents for quite a while now. I’ve been a software engineer for more than 15 years, and at first it was hard for me to accept that the rules of the game had changed forever.

I’ve stopped thinking of coding agents as autocomplete. In many tasks, they can reason through codebases and produce solid implementations. But one thing still feels missing.

I haven’t managed to feel that I’m working side by side with an engineer who knows the repository. Someone familiar with the project’s codebase, its strategies, its typical errors, the commands that should be run and the ones that shouldn’t. A veteran teammate, not a rookie who has to review the whole repo, starting from the README and the Makefile, before writing a single line of code.

At first I thought it was all about refining prompts.

Then I focused on operational memory, skills, MCPs, rules, global instructions, AGENTS.md, CLAUDE.md, and everything I kept reading over and over again in articles and posts.

I also had a “context” phase. I became obsessed with improving the context my agent was working with.

And yet I still had the same feeling.

The more I obsessed over prompts, memory, skills, and context, the more I started to feel that what the agent was missing was continuity. Something more human. Something closer to what a teammate would ask on their first day at work:

Where were we?
What did we do yesterday?
What hypotheses did we discard?
Which file mattered?
Which test was the right one?
What should I not touch?
Where do I start?

Since I work intensively in large repositories, I saw a major limitation in Codex (the agent I use mainly) starting every session again from the README. It frustrated me to watch it rediscover the repo, try overly broad commands, or attempt to run huge test suites that had nothing to do with the task at hand.

So I started building a tool focused on operational continuity.

I called it AICTX.

In one sentence: aictx is a repo-local continuity runtime for coding agents.

The idea is that each new session behaves less like an isolated prompt and more like the same repo-native engineer continuing previous work.

After many iterations, the workflow has consolidated into something like this:

user prompt
→ agent extracts a narrow task goal
→ aictx resume gives repo-local continuity
→ agent receives an execution contract
→ agent works
→ aictx finalize stores what happened
→ next session starts from continuity, not from zero
→ the user receives feedback about continuity

first_action
edit_scope
test_command
finalize_command
contract_strength

Before talking about the test itself, it’s worth stressing that I mainly work with Codex, so the test has the most validity and accuracy with Codex.

The task was intentionally simple: add support for a new BLOCKED status, and then continue in a second session to validate parser edge cases.

Even so, in the second session a clear difference appeared.
(Note: all demo metrics are available here)

Session 2

Metric	with_aictx	without_aictx	Difference

Files explored	5	10	-50.0%
Files edited	1	3	-66.7%
Commands run	8	15	-46.7%
Tests run	1	4	-75.0%
Exploration steps before first edit	6	15	-60.0%
Time to complete	72s	119s	-39.5%
Total tokens	208,470	296,157	-29.6%
API reference cost	$0.5983	$0.8789	-31.9%

The most interesting difference for me was not the tokens. It was where the agent started.

With AICTX:

first_relevant_file = tests/test_parser.py first_edit_file = tests/test_parser.py

Without AICTX:

first_relevant_file = README.md first_edit_file = src/taskflow/parser.py

With AICTX, the second session behaved more like an operational continuation. Without AICTX, it behaved more like a new agent reconstructing the state of the project.

Across both sessions, the savings were more moderate:

Metric	with_aictx	without_aictx	Difference

Files explored	13	19	-31.6%
Commands run	19	26	-26.9%
Tests run	3	6	-50.0%
Time to complete	166s	222s	-25.2%
Total tokens	455,965	492,800	-7.5%
API reference cost	$1.3129	$1.4591	-10.0%

In the first session, it had overhead. There wasn’t much accumulated continuity to reuse yet, so it doesn’t make sense to sell it as a universal token saver.

There is also another important nuance: the execution without AICTX found and fixed an additional edge case related to UTF-8 BOM input. So I also wouldn’t say that AICTX produced “better code.”

The honest conclusion would be this:

AICTX produced a correct, more focused continuation with less repo rediscovery.
The execution without AICTX produced a broader solution, but it needed more exploration, more commands, more tests, and more time.

For me, this fits the initial hypothesis quite well:

AICTX is not a magical token saver.
It has overhead in the first session.
Its value appears when work continues across sessions.
The real problem is not just “giving the model more context.”
The problem is making each agent session feel less like starting from zero.

If anyone wants to try it:

Github repo

Pypi

  pipx install aictx
  aictx install
  cd repo_path
  aictx init
  # then just work with your coding agent as usual

With AICTX, I’m not trying to replace good prompts, skills, or already established memory/context-management tools. I’m simply trying to make operational continuity easier in large code repositories that I iterate on once and again.

I’d be really happy if it ends up being useful to someone along the way.

If you try it, I’d love to know whether it improves your workflow, or whether it gets in the way.

reddit.com

u/Comfortable_Gas_3046 — 12 days ago

▲ 3 r/devtools

Coding agents don’t need more context. They need continuity.

I’ve stopped thinking of coding agents as autocomplete. In many tasks, they can reason through codebases and produce solid implementations. But one thing still feels missing.

At first I thought it was all about refining prompts.

Then I focused on operational memory, skills, MCPs, rules, global instructions, AGENTS.md, CLAUDE.md, and everything I kept reading over and over again in articles and posts.

I also had a “context” phase. I became obsessed with improving the context my agent was working with.

And yet I still had the same feeling.

Where were we?
What did we do yesterday?
What hypotheses did we discard?
Which file mattered?
Which test was the right one?
What should I not touch?
Where do I start?

So I started building a tool focused on operational continuity.

I called it AICTX.

In one sentence: aictx is a repo-local continuity runtime for coding agents.

The idea is that each new session behaves less like an isolated prompt and more like the same repo-native engineer continuing previous work.

After many iterations, the workflow has consolidated into something like this:

user prompt
→ agent extracts a narrow task goal
→ aictx resume gives repo-local continuity
→ agent receives an execution contract
→ agent works
→ aictx finalize stores what happened
→ next session starts from continuity, not from zero
→ the user receives feedback about continuity

>The execution contract part feels especially interesting to me. Instead of giving the agent a vague block of context, AICTX tries to give it an operational route:

first_action
edit_scope
test_command
finalize_command
contract_strength

Before talking about the test itself, it’s worth stressing that I mainly work with Codex, so the test has the most validity and accuracy with Codex.

The task was intentionally simple: add support for a new BLOCKED status, and then continue in a second session to validate parser edge cases.

>This is important: the demo is not designed under conditions where AICTX has the maximum possible advantage. The repository is small, the task is simple, and the continuation prompt without AICTX includes enough manual context.

Even so, in the second session a clear difference appeared.
(Note: all demo metrics are available here)

Session 2

Metric	with_aictx	without_aictx	Difference
Files explored	5	10	-50.0%
Files edited	1	3	-66.7%
Commands run	8	15	-46.7%
Tests run	1	4	-75.0%
Exploration steps before first edit	6	15	-60.0%
Time to complete	72s	119s	-39.5%
Total tokens	208,470	296,157	-29.6%
API reference cost	$0.5983	$0.8789	-31.9%

The most interesting difference for me was not the tokens. It was where the agent started.

With AICTX:

first_relevant_file = tests/test_parser.py first_edit_file = tests/test_parser.py

Without AICTX:

first_relevant_file = README.md first_edit_file = src/taskflow/parser.py

With AICTX, the second session behaved more like an operational continuation. Without AICTX, it behaved more like a new agent reconstructing the state of the project.

Across both sessions, the savings were more moderate:

Metric	with_aictx	without_aictx	Difference
Files explored	13	19	-31.6%
Commands run	19	26	-26.9%
Tests run	3	6	-50.0%
Time to complete	166s	222s	-25.2%
Total tokens	455,965	492,800	-7.5%
API reference cost	$1.3129	$1.4591	-10.0%

>Honest result: AICTX did not magically win at everything.

In the first session, it had overhead. There wasn’t much accumulated continuity to reuse yet, so it doesn’t make sense to sell it as a universal token saver.

There is also another important nuance: the execution without AICTX found and fixed an additional edge case related to UTF-8 BOM input. So I also wouldn’t say that AICTX produced “better code.”

The honest conclusion would be this:

For me, this fits the initial hypothesis quite well:

AICTX is not a magical token saver.
It has overhead in the first session.
Its value appears when work continues across sessions.
The real problem is not just “giving the model more context.”
The problem is making each agent session feel less like starting from zero.

>The tool is still alive, and I’m still scaling it while trying to solve my own pains. I’d love to receive feedback: positive things, possible improvements, issues people notice, or even PRs if anyone feels like contributing.

If anyone wants to try it:

    pipx install aictx
    aictx install
    cd repo_path
    aictx init
    # then just work with your coding agent as usual

I’d be really happy if it ends up being useful to someone along the way.

If you try it, I’d love to know whether it improves your workflow, or whether it gets in the way.

reddit.com

u/Comfortable_Gas_3046 — 13 days ago

▲ 3 r/SideProject

I’ve been working with coding agents for quite a while now.

I’ve been working as a software engineer for more than 15 years, and at first it was hard for me to accept that the rules of the game had changed forever.

Now, honestly, I’m pretty much surrendered to the quality of the code and reasoning these agents can produce. Many times they are better programmers than me. I don’t have many doubts about that.

But there is still something I haven’t fully been able to feel.

At first I thought it was all about refining prompts.

Then I focused on operational memory, skills, MCPs, rules, global instructions, AGENTS.md, CLAUDE.md, and everything I kept reading over and over again in articles and posts.

I also had a “context” phase. I became obsessed with improving the context my agent was working with.

And yet I still had the same feeling.

The more I obsessed over prompts, memory, skills, and context, the more I started to feel that what the agent was missing was continuity.

Not chat memory.
Not a vector DB full of random chunks.
Something more human. Something closer to what a teammate would ask on their first day at work:

Where were we?
What did we do yesterday?
What hypotheses did we discard?
Which file mattered?
Which test was the right one?
What should I not touch?
Where do I start?

So I started building a tool focused on operational continuity.

I called it AICTX.

In one sentence: aictx is a repo-local continuity runtime for coding agents.

The idea is that each new session behaves less like an isolated prompt and more like the same repo-native engineer continuing previous work.

After many iterations, the workflow has consolidated into something like this:

user prompt
→ agent extracts a narrow task goal
→ aictx resume gives repo-local continuity
→ agent receives an execution contract
→ agent works
→ aictx finalize stores what happened
→ next session starts from continuity, not from zero
→ the user receives feedback about continuity

The execution contract part feels especially interesting to me. Instead of giving the agent a vague block of context, AICTX tries to give it an operational route:

first_action
edit_scope
test_command
finalize_command
contract_strength

Before talking about the test itself, it’s worth stressing that I mainly work with Codex, so the test has the most validity and accuracy with Codex.

one branch using AICTX (https://github.com/oldskultxo/aictx-demo-taskflow/tree/with_aictx);
one branch without AICTX (https://github.com/oldskultxo/aictx-demo-taskflow/tree/without_aictx).

The task was intentionally simple: add support for a new BLOCKED status, and then continue in a second session to validate parser edge cases.

Even so, in the second session a clear difference appeared.
(note: all demo metrics are available at https://github.com/oldskultxo/aictx-demo-taskflow/tree/main/.demo_metrics)

Session 2

Metric	with_aictx	without_aictx	Difference
Files explored	5	10	-50.0%
Files edited	1	3	-66.7%
Commands run	8	15	-46.7%
Tests run	1	4	-75.0%
Exploration steps before first edit	6	15	-60.0%
Time to complete	72s	119s	-39.5%
Total tokens	208,470	296,157	-29.6%
API reference cost	$0.5983	$0.8789	-31.9%

The most interesting difference for me was not the tokens. It was where the agent started.

With AICTX:

first_relevant_file = tests/test_parser.py
first_edit_file     = tests/test_parser.py

Without AICTX:

first_relevant_file = README.md
first_edit_file     = src/taskflow/parser.py

That is exactly what I wanted to measure.

Across both sessions, the savings were more moderate:

Metric	with_aictx	without_aictx	Difference
Files explored	13	19	-31.6%
Commands run	19	26	-26.9%
Tests run	3	6	-50.0%
Time to complete	166s	222s	-25.2%
Total tokens	455,965	492,800	-7.5%
API reference cost	$1.3129	$1.4591	-10.0%

Honest result: AICTX did not magically win at everything.

In the first session, it had overhead. There wasn’t much accumulated continuity to reuse yet, so it doesn’t make sense to sell it as a universal token saver.

There is also another important nuance: the execution without AICTX found and fixed an additional edge case related to UTF-8 BOM input. So I also wouldn’t say that AICTX produced “better code.”

The honest conclusion would be this:

For me, this fits the initial hypothesis quite well:

AICTX is not a magical token saver.
It has overhead in the first session.
Its value appears when work continues across sessions.
The real problem is not just “giving the model more context.”
The problem is making each agent session feel less like starting from zero.

Feedback example of a demo session:

codex@aictx-demo-taskflow · session #2 · awake

Resuming: execute prompt .demo_metrics/with_aictx_v5/prompts/session_1.md.

Last progress: Added BLOCKED task support to summary output, updated parser/summary/CLI tests, fixed CLI subprocess PYTHONPATH handling, and wrote session_1_metrics.json.

Entry point: src/taskflow/models.py

─────────────────────────────────────────

AICTX summary

Context: loaded handoff/preferences/repo_map.

Map: RepoMap quick ok.

Saved: updated handoff.

Entry point: src/taskflow/models.py, src/taskflow/summary.py.

Details: last_execution_summary.md

The tool is still alive, and I’m still scaling it while trying to solve my own pains. I’d love to receive feedback: positive things, possible improvements, issues people notice, or even PRs if anyone feels like contributing.

If anyone wants to try it:

Github repo: https://github.com/oldskultxo/aictx
Pypi: https://pypi.org/project/aictx/

pipx install aictx
aictx install
cd repo_path
aictx init

# then just work with your coding agent as usual

I’d be really happy if it ends up being useful to someone along the way.

reddit.com

u/Comfortable_Gas_3046 — 14 days ago

▲ 4 r/coolgithubprojects+2 crossposts

I’ve been working with coding agents for quite a while now.

I’ve been working as a software engineer for more than 15 years, and at first it was hard for me to accept that the rules of the game had changed forever.

Now, honestly, I’m pretty much surrendered to the quality of the code and reasoning these agents can produce. Many times they are better programmers than me. I don’t have many doubts about that.

But there is still something I haven’t fully been able to feel.

At first I thought it was all about refining prompts.

Then I focused on operational memory, skills, MCPs, rules, global instructions, AGENTS.md, CLAUDE.md, and everything I kept reading over and over again in articles and posts.

I also had a “context” phase. I became obsessed with improving the context my agent was working with.

And yet I still had the same feeling.

The more I obsessed over prompts, memory, skills, and context, the more I started to feel that what the agent was missing was continuity.

Not chat memory.
Not a vector DB full of random chunks.
Something more human. Something closer to what a teammate would ask on their first day at work:

Where were we?
What did we do yesterday?
What hypotheses did we discard?
Which file mattered?
Which test was the right one?
What should I not touch?
Where do I start?

So I started building a tool focused on operational continuity.

I called it AICTX.

In one sentence: aictx is a repo-local continuity runtime for coding agents.

The idea is that each new session behaves less like an isolated prompt and more like the same repo-native engineer continuing previous work.

After many iterations, the workflow has consolidated into something like this:

user prompt
→ agent extracts a narrow task goal
→ aictx resume gives repo-local continuity
→ agent receives an execution contract
→ agent works
→ aictx finalize stores what happened
→ next session starts from continuity, not from zero
→ the user receives feedback about continuity

>AICTX stores and reuses things like work state, handoffs, decisions, failure memory, strategy memory, execution summaries, RepoMap hints, execution contracts, and contract compliance signals.
All of them are auditable artifacts that are easy to inspect at repo level.

The execution contract part feels especially interesting to me. Instead of giving the agent a vague block of context, AICTX tries to give it an operational route:

first_action
edit_scope
test_command
finalize_command
contract_strength

Before talking about the test itself, it’s worth stressing that I mainly work with Codex, so the test has the most validity and accuracy with Codex.

one branch using AICTX (https://github.com/oldskultxo/aictx-demo-taskflow/tree/with_aictx);
one branch without AICTX (https://github.com/oldskultxo/aictx-demo-taskflow/tree/without_aictx).

The task was intentionally simple: add support for a new BLOCKED status, and then continue in a second session to validate parser edge cases.

Even so, in the second session a clear difference appeared.
(note: all demo metrics are available at https://github.com/oldskultxo/aictx-demo-taskflow/tree/main/.demo_metrics)

Session 2

Metric	with_aictx	without_aictx	Difference



Files explored	5	10	-50.0%
Files edited	1	3	-66.7%
Commands run	8	15	-46.7%
Tests run	1	4	-75.0%
Exploration steps before first edit	6	15	-60.0%
Time to complete	72s	119s	-39.5%
Total tokens	208,470	296,157	-29.6%
API reference cost	$0.5983	$0.8789	-31.9%

The most interesting difference for me was not the tokens. It was where the agent started.

With AICTX:

first_relevant_file = tests/test_parser.py
first_edit_file     = tests/test_parser.py

Without AICTX:

first_relevant_file = README.md
first_edit_file     = src/taskflow/parser.py

That is exactly what I wanted to measure.

>With AICTX, the second session behaved more like an operational continuation.
Without AICTX, it behaved more like a new agent reconstructing the state of the project.

Across both sessions, the savings were more moderate:

Metric	with_aictx	without_aictx	Difference



Files explored	13	19	-31.6%
Commands run	19	26	-26.9%
Tests run	3	6	-50.0%
Time to complete	166s	222s	-25.2%
Total tokens	455,965	492,800	-7.5%
API reference cost	$1.3129	$1.4591	-10.0%

Honest result: AICTX did not magically win at everything.

In the first session, it had overhead. There wasn’t much accumulated continuity to reuse yet, so it doesn’t make sense to sell it as a universal token saver.

There is also another important nuance: the execution without AICTX found and fixed an additional edge case related to UTF-8 BOM input. So I also wouldn’t say that AICTX produced “better code.”

The honest conclusion would be this:

For me, this fits the initial hypothesis quite well:

AICTX is not a magical token saver.
It has overhead in the first session.
Its value appears when work continues across sessions.
The real problem is not just “giving the model more context.”
The problem is making each agent session feel less like starting from zero.

Feedback example of a demo session:

codex@aictx-demo-taskflow · session #2 · awake

Resuming: execute prompt .demo_metrics/with_aictx_v5/prompts/session_1.md.

Last progress: Added BLOCKED task support to summary output, updated parser/summary/CLI tests, fixed CLI subprocess PYTHONPATH handling, and wrote session_1_metrics.json.

Entry point: src/taskflow/models.py

─────────────────────────────────────────

AICTX summary

Context: loaded handoff/preferences/repo_map.

Map: RepoMap quick ok.

Saved: updated handoff.

Entry point: src/taskflow/models.py, src/taskflow/summary.py.

Details: last_execution_summary.md

If anyone wants to try it:

Github repo: https://github.com/oldskultxo/aictx
Pypi: https://pypi.org/project/aictx/

pipx install aictx
aictx install
cd repo_path
aictx init

# then just work with your coding agent as usual

I’d be really happy if it ends up being useful to someone along the way.

u/Comfortable_Gas_3046 — 9 days ago

▲ 4 r/ContextEngineering+1 crossposts

I’ve been working with coding agents for quite a while now.

I’ve been working as a software engineer for more than 15 years, and at first it was hard for me to accept that the rules of the game had changed forever.

Now, honestly, I’m pretty much surrendered to the quality of the code and reasoning these agents can produce. Many times they are better programmers than me. I don’t have many doubts about that.

But there is still something I haven’t fully been able to feel.

At first I thought it was all about refining prompts.

Then I focused on operational memory, skills, MCPs, rules, global instructions, AGENTS.md, CLAUDE.md, and everything I kept reading over and over again in articles and posts.

I also had a “context” phase. I became obsessed with improving the context my agent was working with.

And yet I still had the same feeling.

The more I obsessed over prompts, memory, skills, and context, the more I started to feel that what the agent was missing was continuity.

Not chat memory.
Not a vector DB full of random chunks.
Something more human. Something closer to what a teammate would ask on their first day at work:

Where were we?
What did we do yesterday?
What hypotheses did we discard?
Which file mattered?
Which test was the right one?
What should I not touch?
Where do I start?

So I started building a tool focused on operational continuity.

I called it AICTX.

In one sentence: aictx is a repo-local continuity runtime for coding agents.

The idea is that each new session behaves less like an isolated prompt and more like the same repo-native engineer continuing previous work.

After many iterations, the workflow has consolidated into something like this:

user prompt
→ agent extracts a narrow task goal
→ aictx resume gives repo-local continuity
→ agent receives an execution contract
→ agent works
→ aictx finalize stores what happened
→ next session starts from continuity, not from zero
→ the user receives feedback about continuity

https://preview.redd.it/4zyhlep2pizg1.png?width=1672&format=png&auto=webp&s=adfaab86e79312254153fa4a0b073138648a7bd4

The execution contract part feels especially interesting to me. Instead of giving the agent a vague block of context, AICTX tries to give it an operational route:

first_action
edit_scope
test_command
finalize_command
contract_strength

Before talking about the test itself, it’s worth stressing that I mainly work with Codex, so the test has the most validity and accuracy with Codex.

one branch using AICTX (https://github.com/oldskultxo/aictx-demo-taskflow/tree/with_aictx);
one branch without AICTX (https://github.com/oldskultxo/aictx-demo-taskflow/tree/without_aictx).

The task was intentionally simple: add support for a new BLOCKED status, and then continue in a second session to validate parser edge cases.

Even so, in the second session a clear difference appeared.
(note: all demo metrics are available at https://github.com/oldskultxo/aictx-demo-taskflow/tree/main/.demo_metrics)

Session 2

Metric	with_aictx	without_aictx	Difference



Files explored	5	10	-50.0%
Files edited	1	3	-66.7%
Commands run	8	15	-46.7%
Tests run	1	4	-75.0%
Exploration steps before first edit	6	15	-60.0%
Time to complete	72s	119s	-39.5%
Total tokens	208,470	296,157	-29.6%
API reference cost	$0.5983	$0.8789	-31.9%

The most interesting difference for me was not the tokens. It was where the agent started.

With AICTX:

first_relevant_file = tests/test_parser.py
first_edit_file     = tests/test_parser.py

Without AICTX:

first_relevant_file = README.md
first_edit_file     = src/taskflow/parser.py

That is exactly what I wanted to measure.

With AICTX, the second session behaved more like an operational continuation.
Without AICTX, it behaved more like a new agent reconstructing the state of the project.

Across both sessions, the savings were more moderate:

Metric	with_aictx	without_aictx	Difference



Files explored	13	19	-31.6%
Commands run	19	26	-26.9%
Tests run	3	6	-50.0%
Time to complete	166s	222s	-25.2%
Total tokens	455,965	492,800	-7.5%
API reference cost	$1.3129	$1.4591	-10.0%

Honest result: AICTX did not magically win at everything.

In the first session, it had overhead. There wasn’t much accumulated continuity to reuse yet, so it doesn’t make sense to sell it as a universal token saver.

There is also another important nuance: the execution without AICTX found and fixed an additional edge case related to UTF-8 BOM input. So I also wouldn’t say that AICTX produced “better code.”

The honest conclusion would be this:

For me, this fits the initial hypothesis quite well:

AICTX is not a magical token saver.
It has overhead in the first session.
Its value appears when work continues across sessions.
The real problem is not just “giving the model more context.”
The problem is making each agent session feel less like starting from zero.

Feedback example of a demo session

If anyone wants to try it:

Github repo: https://github.com/oldskultxo/aictx
Pypi: https://pypi.org/project/aictx/

pipx install aictx
aictx install
cd repo_path
aictx init

# then just work with your coding agent as usual

I’d be really happy if it ends up being useful to someone along the way.

reddit.com

u/Comfortable_Gas_3046 — 15 days ago

▲ 1 r/AiBuilders

I’ve been working with coding agents for quite a while now.

I’ve been working as a software engineer for more than 15 years, and at first it was hard for me to accept that the rules of the game had changed forever.

Now, honestly, I’m pretty much surrendered to the quality of the code and reasoning these agents can produce. Many times they are better programmers than me. I don’t have many doubts about that.

But there is still something I haven’t fully been able to feel.

At first I thought it was all about refining prompts.

Then I focused on operational memory, skills, MCPs, rules, global instructions, AGENTS.md, CLAUDE.md, and everything I kept reading over and over again in articles and posts.

I also had a “context” phase. I became obsessed with improving the context my agent was working with.

And yet I still had the same feeling.

The more I obsessed over prompts, memory, skills, and context, the more I started to feel that what the agent was missing was continuity.

Not chat memory.
Not a vector DB full of random chunks.
Something more human. Something closer to what a teammate would ask on their first day at work:

Where were we?
What did we do yesterday?
What hypotheses did we discard?
Which file mattered?
Which test was the right one?
What should I not touch?
Where do I start?

So I started building a tool focused on operational continuity.

I called it AICTX.

In one sentence: aictx is a repo-local continuity runtime for coding agents.

The idea is that each new session behaves less like an isolated prompt and more like the same repo-native engineer continuing previous work.

After many iterations, the workflow has consolidated into something like this:

user prompt
→ agent extracts a narrow task goal
→ aictx resume gives repo-local continuity
→ agent receives an execution contract
→ agent works
→ aictx finalize stores what happened
→ next session starts from continuity, not from zero
→ the user receives feedback about continuity

https://preview.redd.it/oa4ppf7gnizg1.png?width=1672&format=png&auto=webp&s=e9b4aa1d9473c99c93e3c679c61dfbe5f9f101a9

The execution contract part feels especially interesting to me. Instead of giving the agent a vague block of context, AICTX tries to give it an operational route:

first_action
edit_scope
test_command
finalize_command
contract_strength

Before talking about the test itself, it’s worth stressing that I mainly work with Codex, so the test has the most validity and accuracy with Codex.

one branch using AICTX (https://github.com/oldskultxo/aictx-demo-taskflow/tree/with_aictx);
one branch without AICTX (https://github.com/oldskultxo/aictx-demo-taskflow/tree/without_aictx).

The task was intentionally simple: add support for a new BLOCKED status, and then continue in a second session to validate parser edge cases.

Even so, in the second session a clear difference appeared.
(note: all demo metrics are available at https://github.com/oldskultxo/aictx-demo-taskflow/tree/main/.demo_metrics)

Session 2

Metric	with_aictx	without_aictx	Difference



Files explored	5	10	-50.0%
Files edited	1	3	-66.7%
Commands run	8	15	-46.7%
Tests run	1	4	-75.0%
Exploration steps before first edit	6	15	-60.0%
Time to complete	72s	119s	-39.5%
Total tokens	208,470	296,157	-29.6%
API reference cost	$0.5983	$0.8789	-31.9%

The most interesting difference for me was not the tokens. It was where the agent started.

With AICTX:

first_relevant_file = tests/test_parser.py
first_edit_file     = tests/test_parser.py

Without AICTX:

first_relevant_file = README.md
first_edit_file     = src/taskflow/parser.py

That is exactly what I wanted to measure.

With AICTX, the second session behaved more like an operational continuation.
Without AICTX, it behaved more like a new agent reconstructing the state of the project.

Across both sessions, the savings were more moderate:

Metric	with_aictx	without_aictx	Difference



Files explored	13	19	-31.6%
Commands run	19	26	-26.9%
Tests run	3	6	-50.0%
Time to complete	166s	222s	-25.2%
Total tokens	455,965	492,800	-7.5%
API reference cost	$1.3129	$1.4591	-10.0%

Honest result: AICTX did not magically win at everything.

In the first session, it had overhead. There wasn’t much accumulated continuity to reuse yet, so it doesn’t make sense to sell it as a universal token saver.

There is also another important nuance: the execution without AICTX found and fixed an additional edge case related to UTF-8 BOM input. So I also wouldn’t say that AICTX produced “better code.”

The honest conclusion would be this:

For me, this fits the initial hypothesis quite well:

AICTX is not a magical token saver.
It has overhead in the first session.
Its value appears when work continues across sessions.
The real problem is not just “giving the model more context.”
The problem is making each agent session feel less like starting from zero.

Feedback example of a demo session

If anyone wants to try it:

Github repo: https://github.com/oldskultxo/aictx
Pypi: https://pypi.org/project/aictx/

pipx install aictx
aictx install
cd repo_path
aictx init

# then just work with your coding agent as usual

I’d be really happy if it ends up being useful to someone along the way.

reddit.com

u/Comfortable_Gas_3046 — 16 days ago

▲ 2 r/learnAIAgents

I’ve been working with coding agents for quite a while now.

I’ve been working as a software engineer for more than 15 years, and at first it was hard for me to accept that the rules of the game had changed forever.

Now, honestly, I’m pretty much surrendered to the quality of the code and reasoning these agents can produce. Many times they are better programmers than me. I don’t have many doubts about that.

But there is still something I haven’t fully been able to feel.

At first I thought it was all about refining prompts.

Then I focused on operational memory, skills, MCPs, rules, global instructions, AGENTS.md, CLAUDE.md, and everything I kept reading over and over again in articles and posts.

I also had a “context” phase. I became obsessed with improving the context my agent was working with.

And yet I still had the same feeling.

The more I obsessed over prompts, memory, skills, and context, the more I started to feel that what the agent was missing was continuity.

Not chat memory.
Not a vector DB full of random chunks.
Something more human. Something closer to what a teammate would ask on their first day at work:

Where were we?
What did we do yesterday?
What hypotheses did we discard?
Which file mattered?
Which test was the right one?
What should I not touch?
Where do I start?

So I started building a tool focused on operational continuity.

I called it AICTX.

In one sentence: aictx is a repo-local continuity runtime for coding agents.

The idea is that each new session behaves less like an isolated prompt and more like the same repo-native engineer continuing previous work.

After many iterations, the workflow has consolidated into something like this:

user prompt
→ agent extracts a narrow task goal
→ aictx resume gives repo-local continuity
→ agent receives an execution contract
→ agent works
→ aictx finalize stores what happened
→ next session starts from continuity, not from zero
→ the user receives feedback about continuity

https://preview.redd.it/vwd5ga1vmizg1.png?width=1672&format=png&auto=webp&s=5cf1ae402ee42b78395c3dd210e972699b6549a4

The execution contract part feels especially interesting to me. Instead of giving the agent a vague block of context, AICTX tries to give it an operational route:

first_action
edit_scope
test_command
finalize_command
contract_strength

Before talking about the test itself, it’s worth stressing that I mainly work with Codex, so the test has the most validity and accuracy with Codex.

one branch using AICTX (https://github.com/oldskultxo/aictx-demo-taskflow/tree/with_aictx);
one branch without AICTX (https://github.com/oldskultxo/aictx-demo-taskflow/tree/without_aictx).

The task was intentionally simple: add support for a new BLOCKED status, and then continue in a second session to validate parser edge cases.

Even so, in the second session a clear difference appeared.
(note: all demo metrics are available at https://github.com/oldskultxo/aictx-demo-taskflow/tree/main/.demo_metrics)

Session 2

Metric	with_aictx	without_aictx	Difference


Files explored	5	10	-50.0%
Files edited	1	3	-66.7%
Commands run	8	15	-46.7%
Tests run	1	4	-75.0%
Exploration steps before first edit	6	15	-60.0%
Time to complete	72s	119s	-39.5%
Total tokens	208,470	296,157	-29.6%
API reference cost	$0.5983	$0.8789	-31.9%

The most interesting difference for me was not the tokens. It was where the agent started.

With AICTX:

first_relevant_file = tests/test_parser.py
first_edit_file     = tests/test_parser.py

Without AICTX:

first_relevant_file = README.md
first_edit_file     = src/taskflow/parser.py

That is exactly what I wanted to measure.

With AICTX, the second session behaved more like an operational continuation.
Without AICTX, it behaved more like a new agent reconstructing the state of the project.

Across both sessions, the savings were more moderate:

Metric	with_aictx	without_aictx	Difference


Files explored	13	19	-31.6%
Commands run	19	26	-26.9%
Tests run	3	6	-50.0%
Time to complete	166s	222s	-25.2%
Total tokens	455,965	492,800	-7.5%
API reference cost	$1.3129	$1.4591	-10.0%

Honest result: AICTX did not magically win at everything.

In the first session, it had overhead. There wasn’t much accumulated continuity to reuse yet, so it doesn’t make sense to sell it as a universal token saver.

There is also another important nuance: the execution without AICTX found and fixed an additional edge case related to UTF-8 BOM input. So I also wouldn’t say that AICTX produced “better code.”

The honest conclusion would be this:

For me, this fits the initial hypothesis quite well:

AICTX is not a magical token saver.
It has overhead in the first session.
Its value appears when work continues across sessions.
The real problem is not just “giving the model more context.”
The problem is making each agent session feel less like starting from zero.

Feedback example of a demo session

If anyone wants to try it:

Github repo: https://github.com/oldskultxo/aictx
Pypi: https://pypi.org/project/aictx/

pipx install aictx
aictx install
cd repo_path
aictx init

# then just work with your coding agent as usual

I’d be really happy if it ends up being useful to someone along the way.

reddit.com

u/Comfortable_Gas_3046 — 16 days ago

▲ 1 r/AIAgentsInAction

I’ve been working with coding agents for quite a while now.

I’ve been working as a software engineer for more than 15 years, and at first it was hard for me to accept that the rules of the game had changed forever.

Now, honestly, I’m pretty much surrendered to the quality of the code and reasoning these agents can produce. Many times they are better programmers than me. I don’t have many doubts about that.

But there is still something I haven’t fully been able to feel.

At first I thought it was all about refining prompts.

Then I focused on operational memory, skills, MCPs, rules, global instructions, AGENTS.md, CLAUDE.md, and everything I kept reading over and over again in articles and posts.

I also had a “context” phase. I became obsessed with improving the context my agent was working with.

And yet I still had the same feeling.

The more I obsessed over prompts, memory, skills, and context, the more I started to feel that what the agent was missing was continuity.

Not chat memory.
Not a vector DB full of random chunks.
Something more human. Something closer to what a teammate would ask on their first day at work:

Where were we?
What did we do yesterday?
What hypotheses did we discard?
Which file mattered?
Which test was the right one?
What should I not touch?
Where do I start?

So I started building a tool focused on operational continuity.

I called it AICTX.

In one sentence: aictx is a repo-local continuity runtime for coding agents.

The idea is that each new session behaves less like an isolated prompt and more like the same repo-native engineer continuing previous work.

After many iterations, the workflow has consolidated into something like this:

user prompt
→ agent extracts a narrow task goal
→ aictx resume gives repo-local continuity
→ agent receives an execution contract
→ agent works
→ aictx finalize stores what happened
→ next session starts from continuity, not from zero
→ the user receives feedback about continuity

https://preview.redd.it/9zmuugkplizg1.png?width=1672&format=png&auto=webp&s=47a7bbb9453474a0f860fc70f6cdde7f00fef7e3

The execution contract part feels especially interesting to me. Instead of giving the agent a vague block of context, AICTX tries to give it an operational route:

first_action
edit_scope
test_command
finalize_command
contract_strength

Before talking about the test itself, it’s worth stressing that I mainly work with Codex, so the test has the most validity and accuracy with Codex.

one branch using AICTX (https://github.com/oldskultxo/aictx-demo-taskflow/tree/with_aictx);
one branch without AICTX (https://github.com/oldskultxo/aictx-demo-taskflow/tree/without_aictx).

The task was intentionally simple: add support for a new BLOCKED status, and then continue in a second session to validate parser edge cases.

Even so, in the second session a clear difference appeared.
(note: all demo metrics are available at https://github.com/oldskultxo/aictx-demo-taskflow/tree/main/.demo_metrics)

Session 2

Metric	with_aictx	without_aictx	Difference

Files explored	5	10	-50.0%
Files edited	1	3	-66.7%
Commands run	8	15	-46.7%
Tests run	1	4	-75.0%
Exploration steps before first edit	6	15	-60.0%
Time to complete	72s	119s	-39.5%
Total tokens	208,470	296,157	-29.6%
API reference cost	$0.5983	$0.8789	-31.9%

The most interesting difference for me was not the tokens. It was where the agent started.

With AICTX:

first_relevant_file = tests/test_parser.py
first_edit_file     = tests/test_parser.py

Without AICTX:

first_relevant_file = README.md
first_edit_file     = src/taskflow/parser.py

That is exactly what I wanted to measure.

With AICTX, the second session behaved more like an operational continuation.
Without AICTX, it behaved more like a new agent reconstructing the state of the project.

Across both sessions, the savings were more moderate:

Metric	with_aictx	without_aictx	Difference

Files explored	13	19	-31.6%
Commands run	19	26	-26.9%
Tests run	3	6	-50.0%
Time to complete	166s	222s	-25.2%
Total tokens	455,965	492,800	-7.5%
API reference cost	$1.3129	$1.4591	-10.0%

Honest result: AICTX did not magically win at everything.

In the first session, it had overhead. There wasn’t much accumulated continuity to reuse yet, so it doesn’t make sense to sell it as a universal token saver.

There is also another important nuance: the execution without AICTX found and fixed an additional edge case related to UTF-8 BOM input. So I also wouldn’t say that AICTX produced “better code.”

The honest conclusion would be this:

For me, this fits the initial hypothesis quite well:

AICTX is not a magical token saver.
It has overhead in the first session.
Its value appears when work continues across sessions.
The real problem is not just “giving the model more context.”
The problem is making each agent session feel less like starting from zero.

Feedback example of a demo session

If anyone wants to try it:

Github repo: https://github.com/oldskultxo/aictx
Pypi: https://pypi.org/project/aictx/

pipx install aictx
aictx install
cd repo_path
aictx init

# then just work with your coding agent as usual

I’d be really happy if it ends up being useful to someone along the way.

reddit.com

u/Comfortable_Gas_3046 — 16 days ago

▲ 1 r/Agentic_Marketing

I built a repo-local continuity layer for coding agents. It helps each new session behave like the same repo-native engineer continuing prior work

I’ve been working with coding agents for quite a while now.

I’ve been working as a software engineer for more than 15 years, and at first it was hard for me to accept that the rules of the game had changed forever.

Now, honestly, I’m pretty much surrendered to the quality of the code and reasoning these agents can produce. Many times they are better programmers than me. I don’t have many doubts about that.

But there is still something I haven’t fully been able to feel.

At first I thought it was all about refining prompts.

Then I focused on operational memory, skills, MCPs, rules, global instructions, AGENTS.md, CLAUDE.md, and everything I kept reading over and over again in articles and posts.

I also had a “context” phase. I became obsessed with improving the context my agent was working with.

And yet I still had the same feeling.

The more I obsessed over prompts, memory, skills, and context, the more I started to feel that what the agent was missing was continuity.

Not chat memory.
Not a vector DB full of random chunks.
Something more human. Something closer to what a teammate would ask on their first day at work:

Where were we?
What did we do yesterday?
What hypotheses did we discard?
Which file mattered?
Which test was the right one?
What should I not touch?
Where do I start?

So I started building a tool focused on operational continuity.

I called it AICTX.

In one sentence: aictx is a repo-local continuity runtime for coding agents.

The idea is that each new session behaves less like an isolated prompt and more like the same repo-native engineer continuing previous work.

After many iterations, the workflow has consolidated into something like this:

user prompt
→ agent extracts a narrow task goal
→ aictx resume gives repo-local continuity
→ agent receives an execution contract
→ agent works
→ aictx finalize stores what happened
→ next session starts from continuity, not from zero
→ the user receives feedback about continuity

https://preview.redd.it/b4k2w4zqldzg1.png?width=1672&format=png&auto=webp&s=5b73e87cef90464db35672e92afa65cef71543a8

The execution contract part feels especially interesting to me. Instead of giving the agent a vague block of context, AICTX tries to give it an operational route:

first_action
edit_scope
test_command
finalize_command
contract_strength

Before talking about the test itself, it’s worth stressing that I mainly work with Codex, so the test has the most validity and accuracy with Codex.

one branch using AICTX (https://github.com/oldskultxo/aictx-demo-taskflow/tree/with_aictx);
one branch without AICTX (https://github.com/oldskultxo/aictx-demo-taskflow/tree/without_aictx).

The task was intentionally simple: add support for a new BLOCKED status, and then continue in a second session to validate parser edge cases.

Even so, in the second session a clear difference appeared.
(note: all demo metrics are available at https://github.com/oldskultxo/aictx-demo-taskflow/tree/main/.demo_metrics)

Session 2

Metric	with_aictx	without_aictx	Difference
Files explored	5	10	-50.0%
Files edited	1	3	-66.7%
Commands run	8	15	-46.7%
Tests run	1	4	-75.0%
Exploration steps before first edit	6	15	-60.0%
Time to complete	72s	119s	-39.5%
Total tokens	208,470	296,157	-29.6%
API reference cost	$0.5983	$0.8789	-31.9%

The most interesting difference for me was not the tokens. It was where the agent started.

With AICTX:

first_relevant_file = tests/test_parser.py
first_edit_file     = tests/test_parser.py

Without AICTX:

first_relevant_file = README.md
first_edit_file     = src/taskflow/parser.py

That is exactly what I wanted to measure.

With AICTX, the second session behaved more like an operational continuation.
Without AICTX, it behaved more like a new agent reconstructing the state of the project.

Across both sessions, the savings were more moderate:

Metric	with_aictx	without_aictx	Difference
Files explored	13	19	-31.6%
Commands run	19	26	-26.9%
Tests run	3	6	-50.0%
Time to complete	166s	222s	-25.2%
Total tokens	455,965	492,800	-7.5%
API reference cost	$1.3129	$1.4591	-10.0%

Honest result: AICTX did not magically win at everything.

In the first session, it had overhead. There wasn’t much accumulated continuity to reuse yet, so it doesn’t make sense to sell it as a universal token saver.

There is also another important nuance: the execution without AICTX found and fixed an additional edge case related to UTF-8 BOM input. So I also wouldn’t say that AICTX produced “better code.”

The honest conclusion would be this:

For me, this fits the initial hypothesis quite well:

AICTX is not a magical token saver.
It has overhead in the first session.
Its value appears when work continues across sessions.
The real problem is not just “giving the model more context.”
The problem is making each agent session feel less like starting from zero.

Feedback example of a demo session

If anyone wants to try it:

Github repo: https://github.com/oldskultxo/aictx
Pypi: https://pypi.org/project/aictx/

pipx install aictx
aictx install
cd repo_path
aictx init

# then just work with your coding agent as usual

I’d be really happy if it ends up being useful to someone along the way.

reddit.com

u/Comfortable_Gas_3046 — 16 days ago