u/Just_Vugg_PolyMCP — reddlx

agentmw — Lightweight middleware for reliable, context-efficient AI agents (open source)

Hi everyone,
I’ve open-sourced agentmw, a framework-agnostic middleware that sits between your LLM client and agent logic to make agents more reliable on long runs.
Key features:
• Real-time failure detection (loops, redundant calls, contradictions, hallucinations)
• Smart context compression (keeps recent tool results, drops stale stuff)
• Persistent reasoning library (SQLite + embeddings) that learns reusable patterns across sessions
• Time-travel debugging CLI
• Works with any provider (OpenAI, Anthropic, Ollama, etc.) and any agent framework
• Async, circuit breaker, MCP server support, TOML config
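
To give a feel for the "redundant calls" part of failure detection, here is a minimal sketch of one way such a check could work. It is not agentmw's actual code or API: the `RedundantCallDetector` class, its `window` and `max_repeats` parameters, and the sample tool names are all made up for illustration.

```python
import json
from collections import deque


class RedundantCallDetector:
    """Flags a tool call when the same (name, arguments) pair was already seen
    at least `max_repeats` times inside the last `window` calls.
    Hypothetical helper, not part of agentmw."""

    def __init__(self, window: int = 10, max_repeats: int = 2):
        self.recent = deque(maxlen=window)
        self.max_repeats = max_repeats

    def check(self, tool_name: str, arguments: dict) -> bool:
        """Record the call and return True if it looks like a loop."""
        # Canonical JSON so that argument order does not matter.
        key = (tool_name, json.dumps(arguments, sort_keys=True))
        repeats = self.recent.count(key)
        self.recent.append(key)
        return repeats >= self.max_repeats


detector = RedundantCallDetector(window=5, max_repeats=2)
calls = [
    ("search", {"q": "weather parma"}),
    ("search", {"q": "weather parma"}),
    ("search", {"q": "weather parma"}),
    ("read_file", {"path": "notes.md"}),
]
flags = [detector.check(name, args) for name, args in calls]
print(flags)  # [False, False, True, False]
```

The third identical `search` call is flagged, and the unrelated `read_file` call passes. A real middleware would probably also compare results and near-duplicate arguments, which this sketch ignores.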

Demo: pip install -e '.[all]' && agentmw demo
It’s still early but already helping me keep agents from spiraling and wasting tokens. Would love honest feedback, bug reports, or ideas for additional middleware features the community would find useful.
Thanks!

reddit.com

u/Just_Vugg_PolyMCP — 3 days ago

▲ 6 r/modelcontextprotocol+5 crossposts

Agentmw: Open-source middleware for AI agents that catches mid-run failures, compresses stale context, and grows a reasoning library across runs. Any model, any framework.

Hi everyone, I built this project to cover my own needs when running agents. Agents often fall into loops for no obvious reason, so I wrote my own way of catching that, and it has been interesting to watch in action. Any feedback is welcome!

github.com

u/Just_Vugg_PolyMCP — 3 days ago

▲ 0 r/golang

I built Gutenberg CLI: generate verified agent tools from OpenAPI, HAR, GraphQL or curl

I’ve been working on **Gutenberg CLI**, an open-source tool for turning API surfaces into usable local tools for AI agents.

The idea is simple: most agent tooling still relies on hand-written glue code. Gutenberg takes an API spec or capture and generates a complete tool surface that agents can actually use.

It can generate:

* a Go CLI
* an MCP server
* agent skills
* SQLite/FTS cache support
* safety policies
* verification/proof artifacts

Inputs include OpenAPI, HAR captures, GraphQL, Postman/Insomnia exports, JSON endpoints, and curl-style workflows.
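
To make the "spec in, tool surface out" idea concrete, here is a rough sketch of how OpenAPI operations could be flattened into tool descriptors. This is not Gutenberg's generator (which emits Go). The `tools_from_openapi` function and the descriptor fields are assumptions made up for this example, written in Python for brevity.

```python
def tools_from_openapi(spec: dict) -> list[dict]:
    """Turn each OpenAPI operation into a flat tool descriptor.
    Hypothetical helper for illustration only."""
    http_methods = {"get", "post", "put", "patch", "delete"}
    tools = []
    for path, item in spec.get("paths", {}).items():
        for method, op in item.items():
            if method not in http_methods:
                continue  # skip path-level keys such as "parameters"
            params = op.get("parameters", [])
            fallback = f"{method}_{path.strip('/').replace('/', '_')}"
            tools.append({
                "name": op.get("operationId") or fallback,
                "description": op.get("summary", ""),
                "method": method.upper(),
                "path": path,
                "required": [p["name"] for p in params if p.get("required")],
                "optional": [p["name"] for p in params if not p.get("required")],
            })
    return tools


# Tiny made-up spec with a single operation.
spec = {
    "openapi": "3.0.0",
    "paths": {
        "/orders/{order_id}": {
            "get": {
                "operationId": "getOrder",
                "summary": "Fetch one order",
                "parameters": [
                    {"name": "order_id", "in": "path", "required": True},
                    {"name": "expand", "in": "query"},
                ],
            }
        }
    },
}
tools = tools_from_openapi(spec)
print(tools[0]["name"], tools[0]["required"], tools[0]["optional"])
```

Each descriptor carries enough information to drive a CLI subcommand or an MCP tool definition. Request bodies, auth, and `$ref` resolution are left out of this sketch.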

The part I care about most is verification. Generated tools are checked with build, CLI smoke tests, MCP handshake, and Go tests before being treated as usable. The repo also includes a catalog of verified examples and generated tools.
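
The gating idea above (run checks in order, and only treat the tool as usable if all of them pass) can be sketched roughly like this. The `verify` function, the check names, and the report format are assumptions for illustration, not Gutenberg's real pipeline or proof format.

```python
import subprocess
import sys


def verify(checks: list[tuple[str, list[str]]]) -> dict:
    """Run checks in order; stop at the first failure and report it.
    Hypothetical helper, not Gutenberg's implementation."""
    passed = []
    for name, cmd in checks:
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            return {"usable": False, "passed": passed, "failed": name}
        passed.append(name)
    return {"usable": True, "passed": passed, "failed": None}


# Stand-in commands so the sketch runs anywhere; a real pipeline would call
# something like `go build ./...` and `go test ./...` here instead.
ok = [sys.executable, "-c", "raise SystemExit(0)"]
bad = [sys.executable, "-c", "raise SystemExit(1)"]

report = verify([
    ("build", ok),
    ("cli_smoke", ok),
    ("mcp_handshake", bad),
    ("go_test", ok),
])
print(report)
```

Here the simulated MCP handshake fails, so the report marks the tool as not usable and the later check never runs.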

This is not meant to be “just generate some code”. The goal is to generate tools that are safe enough and predictable enough for agents to call locally.

I’d love feedback from people building MCP servers, agent frameworks, internal tools, or API automation systems. Especially curious whether verification proofs and dry-run/default safety policies feel useful, or overkill.

reddit.com

u/Just_Vugg_PolyMCP — 4 days ago

▲ 11 r/modelcontextprotocol+4 crossposts

I built Gutenberg CLI: generate verified agent tools from OpenAPI, HAR, GraphQL or curl

I’ve been working on Gutenberg CLI, an open-source tool for turning API surfaces into usable local tools for AI agents.

The idea is simple: most agent tooling still relies on hand-written glue code. Gutenberg takes an API spec or capture and generates a complete tool surface that agents can actually use.

It can generate:

* a Go CLI
* an MCP server
* agent skills
* SQLite/FTS cache support
* safety policies
* verification/proof artifacts

Inputs include OpenAPI, HAR captures, GraphQL, Postman/Insomnia exports, JSON endpoints, and curl-style workflows.

This is not meant to be “just generate some code”. The goal is to generate tools that are safe enough and predictable enough for agents to call locally.

github.com

u/Just_Vugg_PolyMCP — 4 days ago

▲ 3 r/foss+3 crossposts

When AI agents keep repeating the same mistakes

If you’re building or running AI agents that handle real tasks — whether customer support, personal automation, internal company workflows, research, or operational work — you’ve likely run into the same issue.
The agent gets corrected by a human on a bad decision, an inaccurate statement, a wrong process, or an overcommitment. The next time, it makes the same error again. Vector memory and long context help with recall, but they rarely turn actual episodes (actions, outcomes, feedback, corrections) into structured, enforceable knowledge that prevents future mistakes.
Praxos addresses this directly. It acts as a lightweight experience layer for agents: a flight recorder that captures what happened, why it mattered, and what should be learned. It turns those episodes into:
• Reusable lessons
• Policies that can warn or block risky actions
• Evidence-backed records with sources and confidence
• Relevant context for future decisions
Example:
An agent is about to promise a specific delivery date or outcome in a response. Praxos matches it against a past case where a human had to correct a similar overcommitment. It triggers a block with the previous evidence before the output goes anywhere.
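
A minimal sketch of that kind of check, assuming a SQLite table of lessons and plain regex matching as a stand-in for the hybrid matching mentioned below. The table layout, the `check` function, and the sample lesson are invented for illustration and are not Praxos's actual schema or SDK.

```python
import re
import sqlite3

# In-memory ledger with one made-up lesson (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE lessons ("
    "id INTEGER PRIMARY KEY, pattern TEXT, action TEXT, evidence TEXT)"
)
conn.execute(
    "INSERT INTO lessons (pattern, action, evidence) VALUES (?, ?, ?)",
    (
        r"\b(will|guaranteed to) (arrive|be delivered) (by|on)\b",
        "block",
        "Past episode: a reviewer had to retract a promised delivery date.",
    ),
)


def check(draft: str) -> dict:
    """Return the first matching lesson's action, or allow the draft."""
    rows = conn.execute(
        "SELECT pattern, action, evidence FROM lessons ORDER BY id"
    )
    for pattern, action, evidence in rows:
        if re.search(pattern, draft, flags=re.IGNORECASE):
            return {"action": action, "evidence": evidence}
    return {"action": "allow", "evidence": None}


risky = check("Your order will arrive by Friday, no problem!")
safe = check("I have asked the warehouse for an updated estimate.")
print(risky["action"], safe["action"])  # block allow
```

The draft that promises a delivery date is blocked along with the stored evidence, while the non-committal draft goes through.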
This isn’t limited to support. The same mechanism can help agents in personal task management, internal operations, research assistance, content workflows, or any scenario where repeated errors are costly or frustrating.
Technically it’s designed to be practical:
• Lightweight SQLite ledger
• Straightforward CLI (praxos record, praxos check, praxos policy add, etc.)
• Simple Python SDK
• Native MCP server for integration with tools like Claude and Cursor
• Human review queue for automatically generated lessons
• Hybrid matching that doesn’t require heavy dependencies

It’s early but already functional, and focused on a real gap: helping agents learn from experience instead of looping through the same failures.
If you’re working with AI agents in any context and dealing with this “same mistake again” problem, how are you handling operational memory and learning today? Have you tried memory graphs, persistent workflows, manual reviews, or other approaches? What still feels missing?
Interested in your experiences.

github.com

u/Just_Vugg_PolyMCP — 4 days ago

▲ 12 r/modelcontextprotocol+8 crossposts

I’ve been experimenting with making MCP tools feel more Unix-native

There are already some interesting projects around MCP tooling and conversion layers like mcporter and similar libraries.
While trying them, I realized what I personally missed wasn’t just “wrapping” MCP servers, but having an environment where:
• MCP tools become normal CLIs
• they work naturally with pipes, scripts, and CI
• agents can use them without loading huge schemas every session
• you can also create your own CLI tools directly from Python code
So I started building cli-use.

Example:

cli-use add fs /tmp
cli-use fs list_directory --path /tmp

After that the MCP server behaves like a regular Unix command:

cli-use fs search_files --path /tmp --pattern "*.md" | head
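
For anyone wondering how flags like `--path` and `--pattern` could fall out of an MCP tool definition, here is a rough sketch that builds an `argparse` parser from a tool's JSON Schema input description. It is not how cli-use is implemented: `parser_for_tool`, `TYPE_MAP`, and the sample tool dictionary are assumptions for illustration.

```python
import argparse
import json

# Minimal JSON Schema type mapping; anything else falls back to str.
TYPE_MAP = {"string": str, "integer": int, "number": float}


def parser_for_tool(server: str, tool: dict) -> argparse.ArgumentParser:
    """Build an argparse parser from a tool's JSON Schema input description.
    Hypothetical helper, not part of cli-use."""
    parser = argparse.ArgumentParser(prog=f"{server} {tool['name']}")
    schema = tool.get("inputSchema", {})
    required = set(schema.get("required", []))
    for name, prop in schema.get("properties", {}).items():
        parser.add_argument(
            f"--{name}",
            type=TYPE_MAP.get(prop.get("type"), str),
            required=name in required,
            help=prop.get("description", ""),
        )
    return parser


# Made-up tool definition shaped like an MCP tool listing entry.
tool = {
    "name": "search_files",
    "inputSchema": {
        "type": "object",
        "properties": {
            "path": {"type": "string", "description": "Directory to search"},
            "pattern": {"type": "string", "description": "Glob pattern"},
        },
        "required": ["path", "pattern"],
    },
}

args = parser_for_tool("fs", tool).parse_args(
    ["--path", "/tmp", "--pattern", "*.md"]
)
payload = json.dumps(vars(args), sort_keys=True)
print(payload)  # {"path": "/tmp", "pattern": "*.md"}
```

The parsed arguments become the JSON payload that would be sent to the server as the tool call arguments.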

I also added things like:
• daemon mode for fast repeated calls
• caching
• shell completions
• automatic SKILL.md generation for agents

One thing I found interesting is that reducing all the MCP protocol overhead ended up saving a pretty large amount of tokens during agent workflows.
Still experimenting with the idea, but I’m curious whether other people working with MCP also want a more shell-native / Unix-style approach to tools.

github.com

u/Just_Vugg_PolyMCP — 9 days ago

▲ 7 r/parma

Hi 👋

I built this project: https://www.cercoaparma.it/

The idea started from something mundane: every time I look for a local service (a plumber, an electrician, a shop, and so on) I end up digging through Google results that are unclear, full of ads, or out of date.

So I tried to make something simpler: a site that collects businesses and professionals in the Parma area, organized plainly and without too many distractions.

It’s still at an early stage, so it is far from perfect and lists only a few businesses, but I wanted to find out whether it could be useful to others too.

If you feel like it, give me your honest opinion:
- would you actually use it?
- what would make you choose it over Google?
- what do you think is missing?

Honest feedback (negative too) is hugely appreciated 🙏

u/Just_Vugg_PolyMCP — 22 days ago

▲ 2 r/modelcontextprotocol+2 crossposts

The Model Context Protocol (MCP) is quickly becoming the standard way for AI agents (Claude, Cursor, ChatGPT, etc.) to connect to real tools and systems.

The problem? Making your own software available to these agents has usually been quite annoying — lots of custom wrappers and boilerplate.

PolyMCP makes it dead simple.

It’s a universal toolkit that lets you turn your existing Python or TypeScript software into proper MCP servers with almost zero effort.

Quick example (Python):

    from polymcp import expose_tools_http

    def create_support_ticket(user_email: str, description: str, priority: str):
        """Create a ticket in our internal support system"""
        ...

    def get_order_status(order_id: str):
        """Check real-time order status from our database"""
        ...

    def generate_sales_report(region: str, period: str):
        """Pull sales data and generate report"""
        ...

    app = expose_tools_http(
        tools=[create_support_ticket, get_order_status, generate_sales_report],
        title="Acme Internal Tools",
        description="Core business systems for AI agents"
    )

The TypeScript version is equally straightforward.

Once you run it, any MCP-compatible agent can automatically discover and use your tools.
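
Discovery works because each tool is advertised with a name, a description, and an input schema. As a rough sketch of how that information could be derived from a plain Python function, here is a standalone example using only the standard library. It is not PolyMCP's internals: `describe_tool`, `JSON_TYPES`, and the extra `verbose` parameter are assumptions added for illustration.

```python
import inspect

# Minimal mapping from Python annotations to JSON Schema type names.
JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def describe_tool(func) -> dict:
    """Derive a JSON-Schema-style tool description from a function signature.
    Hypothetical helper, not PolyMCP's implementation."""
    properties, required = {}, []
    for name, param in inspect.signature(func).parameters.items():
        # Unannotated or unknown types fall back to "string".
        properties[name] = {"type": JSON_TYPES.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "inputSchema": {
            "type": "object",
            "properties": properties,
            "required": required,
        },
    }


def get_order_status(order_id: str, verbose: bool = False):
    """Check real-time order status from our database"""
    ...


schema = describe_tool(get_order_status)
print(schema["name"], schema["inputSchema"]["required"])
```

Type hints and the docstring carry most of what an agent needs, which is why keeping them accurate matters when functions are exposed as tools.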

PolyMCP also includes:

• UnifiedPolyAgent → orchestrate multiple MCP servers with any LLM (OpenAI, Anthropic, Ollama…)
• PolyClaw → safe Docker-based autonomous agent for real workflows
• Nice Inspector UI for testing tools and running agents
• Skills system + CLI tools

In short: it lets you bring your actual software into the MCP ecosystem so AI agents can work with your real systems, not just demo functions.

Repo: https://github.com/poly-mcp/PolyMCP

What internal tools or software are you planning to make available to AI agents?

Would love to hear your thoughts or use cases!

reddit.com

u/Just_Vugg_PolyMCP — 23 days ago

▲ 7 r/foss+3 crossposts

Hey everyone,

I just open-sourced TuneForge.

The goal is simple: let your coding agent manage the full LLM improvement loop without ever leaving the chat window.

You can now tell your agent something like:

“Build me a customer support bot from this FAQ”

…and it can:

• Generate a clean synthetic instruction dataset (with LLM judging for quality)

• Run LoRA supervised fine-tuning on any Hugging Face causal LM

• Do a quick policy-gradient RL step using Ollama as the reward judge

• Merge the adapter, evaluate on a test set, and iterate

Everything runs locally, uses 4-bit quantization so it fits on modest hardware, and uses background jobs (with job_id polling) so long training tasks don’t freeze the MCP connection.
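
The background-job pattern is roughly this: a "start" call returns a job_id right away, and a separate cheap "status" call is polled until the work finishes. Below is a minimal sketch of that pattern, assuming an in-memory dict instead of the SQLite job state mentioned below. `start_job`, `get_job`, and `fake_training` are invented names, not TuneForge's API.

```python
import threading
import time
import uuid

# In-memory job table; a persistent store would survive restarts.
jobs: dict[str, dict] = {}
jobs_lock = threading.Lock()


def start_job(func, *args) -> str:
    """Run func in a background thread and return a job_id immediately."""
    job_id = uuid.uuid4().hex
    with jobs_lock:
        jobs[job_id] = {"status": "running", "result": None, "error": None}

    def runner():
        try:
            update = {"status": "done", "result": func(*args)}
        except Exception as exc:
            update = {"status": "failed", "error": str(exc)}
        with jobs_lock:
            jobs[job_id].update(update)

    threading.Thread(target=runner, daemon=True).start()
    return job_id


def get_job(job_id: str) -> dict:
    """Cheap status lookup that a tool call can answer right away."""
    with jobs_lock:
        return dict(jobs[job_id])


def fake_training(steps: int) -> str:
    time.sleep(0.05 * steps)  # stand-in for a long fine-tuning run
    return f"trained for {steps} steps"


job_id = start_job(fake_training, 3)
while get_job(job_id)["status"] == "running":
    time.sleep(0.01)  # the agent would poll between other work
final = get_job(job_id)
print(final["status"], final["result"])
```

Because every call returns quickly, the connection never has to stay blocked for the duration of a training run.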

It’s built around the Model Context Protocol (MCP) for seamless integration with Claude Desktop, Cursor, Zed, Continue.dev, etc.

Tech: Python + Transformers + PEFT + bitsandbytes + Ollama + SQLite for job state.

Super early stage (just released), MIT licensed.

Would love feedback or ideas on what to add next. If you’re into agentic fine-tuning workflows, give it a try and let me know how it goes!

u/Just_Vugg_PolyMCP — 25 days ago