r/Temporal

I built an open-source, self-hosted AI gateway: 237 providers (90+ free), auto-fallback combos, and a 10-engine token-compression pipeline (MIT)

Builders-welcome post with the substance up front (disclosure: I'm the maintainer). OmniRoute is a free, MIT, self-hosted AI gateway — one OpenAI-compatible endpoint over 237 providers — built around two problems: runs dying on a provider 429, and tokens bleeding on tool/log output.

One endpoint, 237 providers — 90+ of them free. You point any tool or agent at a single OpenAI-compatible endpoint (localhost:20128/v1) and it can reach 237 LLM providers without you rewriting anything. 90+ have free tiers and 11 are free forever (no card), which aggregates to ~1.6B documented free tokens/month — and that's honest, pool-deduped math (we count each shared pool once instead of inflating it; the methodology is public in the repo). There's a one-command setup-* for 13+ coding tools (Claude Code, Codex, Cursor, Cline, Roo, Kilo, Gemini CLI…), so switching your existing setup over takes seconds.
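
To make "one OpenAI-compatible endpoint" concrete, here is a minimal sketch of what a client request against the local gateway might look like. The base URL is the one quoted above (with `http://` assumed), and the `/chat/completions` path and body shape follow the OpenAI API convention. The `buildChatRequest` helper, the `local-key` value, and the `my-combo` model id are hypothetical placeholders, not documented OmniRoute names, and whether the gateway expects a bearer token at all is an assumption.

```typescript
// Sketch only: builds (does not send) an OpenAI-style chat request.
interface ChatRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

function buildChatRequest(
  baseUrl: string,
  apiKey: string, // assumption: the gateway accepts a bearer token
  model: string,  // hypothetical: a model or combo id configured in the gateway
  prompt: string,
): ChatRequest {
  return {
    // Strip trailing slashes so the path is joined exactly once.
    url: `${baseUrl.replace(/\/+$/, "")}/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({
        model,
        messages: [{ role: "user", content: prompt }],
      }),
    },
  };
}

const req = buildChatRequest("http://localhost:20128/v1", "local-key", "my-combo", "Hello");
// To actually send it: await fetch(req.url, req.init)
```

Any tool that lets you override the OpenAI base URL should be able to produce the same request without code changes.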

Fallback combos — so it never stops mid-task. A "combo" is a ladder of models the router walks automatically: your subscription first, then API keys, then cheap models, then free ones. When a provider returns a 500 or you hit a rate limit, it slides to the next target in milliseconds, mid-request, and your tool never even sees the error. There are 17 routing strategies (priority, weighted, round-robin, cost-optimized, auto/coding:fast…) plus three resilience layers — a per-provider circuit breaker, a per-key cooldown, and a per-model lockout — so one dead key can't take down a whole provider.
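
The ladder idea can be sketched in a few lines. This is an illustration of the general technique, not OmniRoute's implementation: the `Target` and `runCombo` names are made up, the call is synchronous for brevity (a real router would be async), and treating only 429 and 5xx as "try the next rung" is an assumption.

```typescript
// Sketch of a fallback ladder: try each target in order until one succeeds.
type Attempt =
  | { kind: "ok"; text: string }
  | { kind: "error"; status: number };

interface Target {
  id: string;
  send: (prompt: string) => Attempt;
}

// Assumption: rate limits (429) and server errors (5xx) are worth retrying elsewhere.
function isRetryable(status: number): boolean {
  return status === 429 || (status >= 500 && status <= 599);
}

function runCombo(ladder: Target[], prompt: string): { servedBy: string; text: string } {
  const failures: string[] = [];
  for (const target of ladder) {
    const attempt = target.send(prompt);
    if (attempt.kind === "ok") {
      return { servedBy: target.id, text: attempt.text };
    }
    failures.push(`${target.id}:${attempt.status}`);
    // A non-retryable error (e.g. 400) would fail on every rung, so stop early.
    if (!isRetryable(attempt.status)) break;
  }
  throw new Error(`all targets failed (${failures.join(", ")})`);
}

// Stubbed ladder: subscription is rate-limited, the API key errors, the free tier answers.
const ladder: Target[] = [
  { id: "subscription", send: () => ({ kind: "error", status: 429 }) },
  { id: "api-key", send: () => ({ kind: "error", status: 500 }) },
  { id: "free-tier", send: (p) => ({ kind: "ok", text: `echo: ${p}` }) },
];
const outcome = runCombo(ladder, "hi");
```

The caller only sees `outcome`; the two failed attempts stay inside the router. The circuit breaker, cooldown, and lockout layers would sit in front of `send` and are not modeled here.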

Fusion — an ensemble mode for the hard steps. Beyond simple routing, there's a fusion strategy that fans a single prompt out to a panel of different models in parallel and then has a judge model synthesize one best answer (mixture-of-agents, built in). It's cost-aware, so easy turns stay on one fast model and it only fuses when the step is worth it.
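
For readers who haven't met the fan-out-plus-judge pattern, here is a rough sketch of it. All names (`fuse`, `answerStep`, `isHard`) are hypothetical, the models are stubs, and the boolean `isHard` flag is a stand-in for whatever cost-aware signal the router actually uses, which the post does not describe.

```typescript
type Model = (prompt: string) => Promise<string>;
type Judge = (prompt: string, candidates: string[]) => Promise<string>;

// Fan the prompt out to every panel model in parallel, then let a judge merge the answers.
async function fuse(prompt: string, panel: Model[], judge: Judge): Promise<string> {
  const results = await Promise.all(
    // A failing panelist yields null instead of rejecting the whole batch.
    panel.map((model) => model(prompt).then((text) => text, () => null)),
  );
  const candidates = results.filter((r): r is string => r !== null);
  if (candidates.length === 0) throw new Error("every panel model failed");
  return judge(prompt, candidates);
}

// Easy turns stay on one fast model; only hard steps pay for the panel.
async function answerStep(
  prompt: string,
  isHard: boolean, // hypothetical gate, supplied by the caller in this sketch
  fast: Model,
  panel: Model[],
  judge: Judge,
): Promise<string> {
  return isHard ? fuse(prompt, panel, judge) : fast(prompt);
}

// Stubs standing in for real model calls.
const fastModel: Model = async (p) => `fast:${p}`;
const panel: Model[] = [
  async (p) => `A:${p}`,
  async () => { throw new Error("provider down"); },
  async (p) => `B:${p}`,
];
// A real judge would be another model call; this one just concatenates.
const judge: Judge = async (_prompt, candidates) => candidates.join("|");
```

The trade-off is latency and cost: a fused step pays for every panel model plus the judge, which is why gating it to hard steps matters.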

A 10-engine compression pipeline, the part most routers don't have. Every request flows through a transparent compression pass you can toggle and stack per combo. Instead of one trick, it stacks the best of the open-source ecosystem: RTK filters command/tool output (git diffs, test logs, builds) and trims it by 60–90%, Microsoft's LLMLingua-2 does ML semantic pruning, Caveman handles prose, and session-dedup strips repeats across turns. Critically, code, URLs and JSON are preserved byte-perfect, and a default-on inflation guard throws the compressed version away and sends the original if compressing would actually grow the prompt, so compression never makes a prompt larger. On tool-heavy sessions that's ~89% average input-token reduction (an 8k-token git diff becomes a few hundred tokens). Full credit to every upstream project (RTK, Caveman, LLMLingua-2, Troglodita) is in the README.
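
The inflation guard is the easiest piece to illustrate. The sketch below is an assumption-laden toy, not the project's code: `estimateTokens` counts whitespace-separated words instead of using a real tokenizer, and the two "engines" are trivial stand-ins for the real ones.

```typescript
type Compressor = (text: string) => string;

// Stand-in token estimate: whitespace-separated words. A real gateway would use a tokenizer.
function estimateTokens(text: string): number {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

// Run the stacked engines in order, then apply the inflation guard.
function compressWithGuard(original: string, engines: Compressor[]): string {
  const compressed = engines.reduce((text, engine) => engine(text), original);
  // Guard: if compression grew the prompt, discard it and send the original.
  return estimateTokens(compressed) > estimateTokens(original) ? original : compressed;
}

// Toy engine: drop consecutive duplicate lines (think repeated log output).
const dedupeLines: Compressor = (text) =>
  text
    .split("\n")
    .filter((line, i, all) => i === 0 || line !== all[i - 1])
    .join("\n");

// Toy engine that makes things worse, to exercise the guard.
const verboseWrapper: Compressor = (text) => `summary of: ${text}`;

const shrunk = compressWithGuard("a\na\na\nb", [dedupeLines]);
const guarded = compressWithGuard("ok", [verboseWrapper]);
```

Here `shrunk` keeps the deduplicated text because it is shorter, while `guarded` falls back to the untouched input because the wrapper inflated it.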

Agent-native — the agent can drive the router itself. There's a built-in MCP server (95 tools across 30 audited scopes, over stdio / SSE / streamable-HTTP), plus A2A (v0.3, JSON-RPC 2.0) support. That means an agent can query providers, switch combos, read its own remaining quota and manage memory through the gateway — not just consume tokens through it.
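
As a rough picture of what "driving the router" looks like on the wire: MCP tool invocations are JSON-RPC 2.0 requests with the method `tools/call` and a `name`/`arguments` payload. The envelope below follows that convention; the tool name `switch_combo` and its `combo` argument are invented for illustration and are not taken from OmniRoute's tool list.

```typescript
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
}

// Build an MCP tool invocation as a JSON-RPC 2.0 request.
function mcpToolCall(id: number, tool: string, args: Record<string, unknown>): JsonRpcRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: tool, arguments: args },
  };
}

// Hypothetical tool name and argument, for illustration only.
const switchCombo = mcpToolCall(1, "switch_combo", { combo: "coding-fast" });
```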

It's 100% local (zero telemetry, AES-256-GCM at rest), MIT-licensed, has a prompt-injection guard on every LLM route, opt-in memory, and runs on npm, Docker, desktop or your phone via Termux.

For context on whether it's worth your time: it's grown to ~9.8K GitHub stars, 1,490+ forks and 280+ contributors in ~4.5 months, with 21,000+ automated tests and 1,830+ issues closed — so it's a battle-tested project, not a brand-new experiment.

npm install -g omniroute

GitHub: https://github.com/diegosouzapw/OmniRoute · Site: https://omniroute.online

Would value a critique of the routing/compression architecture from this crowd.

u/ZombieGold5145 — 2 days ago

Workflow Builder - open-source, embeddable React SDK for visual workflow editors (runs on Temporal)

Hello everyone!  

I wanted to share something we’ve been building: Workflow Builder. It’s an open-source, embeddable React SDK that adds a visual workflow editor to your app: a canvas where people drag nodes, connect them, and build a workflow. The diagram they end up with is just data.

The reason I’m posting it here is that the data runs on Temporal. We ship a reference backend that takes a diagram and executes it as a Temporal workflow, so the thing your users draw in the browser is the thing that actually runs. The graph runner is deterministic and replay-safe inside the Temporal sandbox.

So if you’re on Temporal and have ever wanted to give people a visual way to configure or inspect the workflows you execute, that’s the gap this fills. The SDK is engine-agnostic underneath, but Temporal is the default integration we built and the one we run ourselves.

The fastest way to understand it is to open the demo and play with it for a couple of minutes (links at the bottom). But here’s the short tour.

You embed it with one npm package and one component:

npm install @workflowbuilder/sdk

Nodes and their settings forms are configured with JSON. A node definition looks roughly like this, and the JSON Schema renders the properties sidebar automatically (via JSON Forms):

{
  "type": "send-email",
  "icon": "Mail",
  "label": "Send Email",
  "schema": {
    "type": "object",
    "properties": {
      "to":      { "type": "string" },
      "subject": { "type": "string" },
      "retries": { "type": "number" }
    }
  }
}
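
Because the definition is plain data, it is easy to type and check on your own side. The sketch below is not the SDK's API: `NodeDefinition` and `invalidFields` are hypothetical names, and the checker only understands the two field types used in the example above (real JSON Schema validation covers far more).

```typescript
type FieldType = "string" | "number";

// Hypothetical typing of the node definition shown above.
interface NodeDefinition {
  type: string;
  icon: string;
  label: string;
  schema: {
    type: "object";
    properties: Record<string, { type: FieldType }>;
  };
}

const sendEmail: NodeDefinition = {
  type: "send-email",
  icon: "Mail",
  label: "Send Email",
  schema: {
    type: "object",
    properties: {
      to: { type: "string" },
      subject: { type: "string" },
      retries: { type: "number" },
    },
  },
};

// Return the names of supplied fields whose values do not match the declared type.
function invalidFields(def: NodeDefinition, values: Record<string, unknown>): string[] {
  const props = def.schema.properties;
  return Object.keys(props).filter((key) => {
    const field = props[key];
    return field !== undefined && key in values && typeof values[key] !== field.type;
  });
}

const problems = invalidFields(sendEmail, { to: "a@example.com", retries: "3" });
```

Here `problems` flags `retries`, since the form supplied a string where the schema declares a number.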

How the pieces fit together. The SDK runs in the browser, the backend persists the diagram and streams status, and on Temporal the graph runner is the workflow: it walks the graph and schedules each node as an activity.
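
To give a feel for "walks the graph", here is a small sketch of a topological walk (Kahn's algorithm) over a diagram. The `Diagram` shape with `nodes`, `edges`, `source`, and `target` is an assumption about what the stored data might look like, and the function only computes an order; in the real backend each visited node would be scheduled as a Temporal activity.

```typescript
// Hypothetical diagram shape; the SDK's actual format may differ.
interface Diagram {
  nodes: { id: string; type: string }[];
  edges: { source: string; target: string }[];
}

// Kahn's algorithm: repeatedly emit nodes whose upstream dependencies are all done.
function topologicalOrder(diagram: Diagram): string[] {
  const indegree = new Map<string, number>();
  const downstream = new Map<string, string[]>();
  for (const node of diagram.nodes) {
    indegree.set(node.id, 0);
    downstream.set(node.id, []);
  }
  for (const edge of diagram.edges) {
    indegree.set(edge.target, (indegree.get(edge.target) ?? 0) + 1);
    downstream.get(edge.source)?.push(edge.target);
  }

  // Seed in declaration order so the result is the same on every run.
  const ready = diagram.nodes.filter((n) => indegree.get(n.id) === 0).map((n) => n.id);
  const order: string[] = [];
  while (ready.length > 0) {
    const id = ready.shift() as string;
    order.push(id);
    for (const next of downstream.get(id) ?? []) {
      const remaining = (indegree.get(next) ?? 0) - 1;
      indegree.set(next, remaining);
      if (remaining === 0) ready.push(next);
    }
  }
  if (order.length !== diagram.nodes.length) {
    throw new Error("diagram contains a cycle or a dangling edge");
  }
  return order;
}

const order = topologicalOrder({
  nodes: [
    { id: "trigger", type: "webhook" },
    { id: "fetch", type: "http" },
    { id: "email", type: "send-email" },
    { id: "log", type: "log" },
  ],
  edges: [
    { source: "trigger", target: "fetch" },
    { source: "fetch", target: "email" },
    { source: "fetch", target: "log" },
  ],
});
```

A fixed iteration order matters for replay safety: the same diagram has to produce the same scheduling sequence every time the workflow is replayed.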

A few things worth calling out:

  • Data-driven palette + JSON Forms property panels — add a node type or reshape its form by editing JSON, no editor code.
  • Plugin system to extend the editor without forking. Two productivity plugins ship open source: undo/redo and copy/paste.
  • Temporal execution backend — topological runner, per-node error policies (fail / continue / error-route), live status streaming over SSE.
  • Persistence your way — localStorage, your REST API, or a save callback.
  • Theming via design tokens, light/dark, and i18n out of the box.
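
The three per-node error policies from the list above can be pictured with a toy runner. This is a guess at the semantics, not the backend's code: it assumes `fail` aborts the run, `continue` records the error and moves on, and `error-route` hands the error to a branch, modeled here by a hypothetical `onError` callback. Steps run synchronously and in list order for brevity.

```typescript
type ErrorPolicy = "fail" | "continue" | "error-route";

interface Step {
  id: string;
  policy: ErrorPolicy;
  run: () => string;
  onError?: (message: string) => string; // stands in for the node's error branch
}

function runSteps(steps: Step[]): { log: string[]; aborted: boolean } {
  const log: string[] = [];
  for (const step of steps) {
    try {
      log.push(`${step.id}: ${step.run()}`);
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      if (step.policy === "fail") {
        // Assumed semantics: the whole run stops here.
        log.push(`${step.id}: failed (${message})`);
        return { log, aborted: true };
      }
      if (step.policy === "error-route" && step.onError) {
        log.push(`${step.id}: ${step.onError(message)}`);
      } else {
        log.push(`${step.id}: skipped after error (${message})`);
      }
    }
  }
  return { log, aborted: false };
}

const boom = (): string => {
  throw new Error("smtp down");
};

const run = runSteps([
  { id: "fetch", policy: "fail", run: () => "ok" },
  { id: "email", policy: "error-route", run: boom, onError: (m) => `routed to alert (${m})` },
  { id: "audit", policy: "continue", run: boom },
  { id: "done", policy: "fail", run: () => "ok" },
]);
```

In this run the `email` failure is routed, the `audit` failure is skipped, and `done` still executes, so `run.aborted` stays false.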

It’s Apache-2.0 licensed, with a live demo and docs. Happy to answer questions, and feedback is welcome.

Cheers!

u/Gullible_Emotion3068 — 12 days ago