u/Accomplished_Bus1320

Your multi-tenant SaaS is one extra database engine away from breaking.

Most multi-tenant SaaS apps start with one database engine, usually Postgres. RLS works, tenant_id columns work, the architecture feels solid. Then six months in, a feature needs a document store, a cache, a different relational engine for a third-party integration. Suddenly you have multiple engines, each with its own tenant isolation problem, each with its own connection pool, each with its own silent-failure mode.

The standard answer is: pick one engine, force everything through it. The honest version: most real apps need more than one. AI apps need at least three. And the moment you add the second engine, your "we use RLS" answer to the compliance question stops working.

TenantsDB is an orchestration layer that gives every tenant their own physically isolated database across PostgreSQL, MySQL, MongoDB, and Redis, behind one proxy and one API.

A few things that fall out of building it polyglot from day one:

One tenant can use multiple engines. Postgres for transactions, MongoDB for documents, Redis for cache, all isolated per-tenant, all under the same tenant ID. One API call provisions all of them. Connection strings stay stable forever.

Schema-first across engines. You design a schema as a versioned blueprint in a workspace, then deploy it to every tenant database in one command. Add a column in dev, push it to 1,000 tenants. Same flow for Postgres tables, MySQL tables, Mongo collections.

OmniQL. One query language that compiles natively to all four engines. Write :GET User WHERE id = 42, the proxy compiles it to SQL for Postgres/MySQL, find() for Mongo, HGETALL for Redis. Open source. Optional, you can still use native drivers.
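
To make the "one query, several targets" idea concrete, here is a toy sketch in Python. It is not OmniQL's real grammar or implementation: the regex, the `GetQuery` type, the table and collection naming, and the `<entity>:<id>` Redis key layout are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class GetQuery:
    entity: str
    field: str
    value: int


# Hypothetical mini-grammar: only ":GET <Entity> WHERE <field> = <int>".
_PATTERN = re.compile(r"^:GET\s+(\w+)\s+WHERE\s+(\w+)\s*=\s*(\d+)$")


def parse(text: str) -> GetQuery:
    match = _PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"unsupported query: {text!r}")
    entity, field, value = match.groups()
    return GetQuery(entity, field, int(value))


def to_sql(q: GetQuery):
    # Identifiers are restricted to \w+ by the parser; the value is passed
    # as a bound parameter (%s placeholder style is an assumption).
    return f"SELECT * FROM {q.entity.lower()} WHERE {q.field} = %s", (q.value,)


def to_mongo(q: GetQuery):
    # Collection name plus the filter document you would hand to find().
    return q.entity.lower(), {q.field: q.value}


def to_redis(q: GetQuery):
    # Assumes one hash per record, so only id lookups can map to HGETALL.
    if q.field != "id":
        raise ValueError("this sketch only maps id lookups to Redis")
    return "HGETALL", f"{q.entity.lower()}:{q.value}"
```

Parsing once into a small intermediate form and emitting per-engine output from it is what keeps the three back ends in agreement about what the query means.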

Two isolation tiers, switchable per-tenant. Tenants start on shared infrastructure (L1) for instant provisioning. When a customer goes enterprise or starts impacting others, one command migrates them to dedicated infrastructure (L2) using native logical replication (PG PUBLICATION/SUBSCRIPTION, MySQL binlog, MongoDB change streams). Sub-2-second cutover. Connection strings don't change.

Proxy-enforced settings. query_timeout_ms, max_rows_per_query, max_connections, blocked commands per engine. Enforced at the proxy, not in your app code. DDL is blocked on tenant connections so schema can only change through blueprint deployments. No drift.
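
A rough sketch of what enforcing limits outside the app could look like, in Python. The setting names come from the list above; everything else (the `TenantPolicy` and `PolicyViolation` types, the keyword-based DDL check, truncating rather than rejecting oversized results) is an assumption for illustration, not how the TenantsDB proxy is actually implemented.

```python
from dataclasses import dataclass

# Assumed set of statements treated as DDL in this sketch.
DDL_KEYWORDS = frozenset({"CREATE", "ALTER", "DROP", "TRUNCATE"})


class PolicyViolation(Exception):
    """Raised when a tenant connection breaks its policy."""


@dataclass(frozen=True)
class TenantPolicy:
    query_timeout_ms: int
    max_rows_per_query: int
    max_connections: int
    blocked_commands: frozenset = DDL_KEYWORDS


def check_statement(policy: TenantPolicy, sql: str) -> None:
    # Reject a statement whose leading keyword is blocked for tenants.
    parts = sql.strip().split(None, 1)
    if not parts:
        raise PolicyViolation("empty statement")
    keyword = parts[0].upper()
    if keyword in policy.blocked_commands:
        raise PolicyViolation(f"{keyword} is blocked on tenant connections")


def admit_connection(policy: TenantPolicy, open_connections: int) -> None:
    # Refuse a new connection once the tenant is at its limit.
    if open_connections >= policy.max_connections:
        raise PolicyViolation("max_connections reached")


def cap_rows(policy: TenantPolicy, rows: list) -> list:
    # Assumption: oversized results are truncated, not rejected.
    return rows[: policy.max_rows_per_query]
```

A real proxy would parse the wire protocol rather than split on whitespace, but the shape is the same: the check runs before the statement ever reaches the tenant database, so app code cannot skip it.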

Benchmark numbers, measured:

| Engine | Direct p50 | Proxy p50 | Overhead |
|---|---|---|---|
| PostgreSQL | 0.82ms | 2.23ms | +1.41ms |
| MySQL | 1.01ms | 2.34ms | +1.33ms |
| MongoDB | 1.45ms | 3.32ms | +1.87ms |
| Redis | 0.66ms | 3.09ms | +2.43ms |

Zero errors across 2 million queries at 100 concurrent tenants.

Free for 5 tenants. tenantsdb.com

Isolated Tenant section screenshot below.

https://preview.redd.it/499mz83ehy1h1.png?width=2750&format=png&auto=webp&s=647fe0c89015523e18b66ef32f4639619c512cd8

Curious what others here are using for multi-engine multi-tenancy. Are you forcing everything through Postgres, running separate orchestration per engine, or building a custom routing layer?

u/Accomplished_Bus1320 — 3 days ago

Why your AI agent’s "memory" is a data breach waiting to happen.

We are all building AI agents with "memory" right now. It is super easy to get a single-tenant agent working locally. But the second we try to scale this into a multi-tenant SaaS, almost everyone takes the exact same shortcut.

We dump 10,000 users into one shared vector database (Pinecone, pgvector, etc.) and just slap a {"tenant_id": "123"} filter on the queries.

People call this "tenant isolation", but let's be real. It is just a WHERE clause.

Here is the terrifying part about AI. If a metadata filter drops or misfires in a normal SaaS app, the user usually just gets a blank dashboard or a 500 error. You notice it, you fix it.

But if that filter drops in an AI retrieval path? The bug is completely silent.

The vector search just pulls the nearest neighbors from the entire database. Your LLM silently ingests User A's proprietary docs or private chats, and confidently repeats those secrets straight into User B's answer. You just accidentally cross-pollinated your customers' private data.
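
Here is a minimal, dependency-free Python sketch of that failure mode. The two-document index, the vectors, and the `search` function are made up for illustration and do not model any particular vector database; the point is only that a missing filter produces a plausible result instead of an error.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# One shared index holding documents from two tenants (toy data).
SHARED_INDEX = [
    {"tenant_id": "A", "text": "A: acquisition plan", "vec": [0.9, 0.1]},
    {"tenant_id": "B", "text": "B: onboarding guide", "vec": [0.1, 0.9]},
]


def search(query_vec, top_k=1, metadata_filter=None):
    candidates = SHARED_INDEX
    # A missing or empty filter does not fail: it means "search everything".
    if metadata_filter:
        candidates = [
            doc for doc in candidates
            if all(doc.get(k) == v for k, v in metadata_filter.items())
        ]
    ranked = sorted(candidates, key=lambda d: cosine(query_vec, d["vec"]), reverse=True)
    return [doc["text"] for doc in ranked[:top_k]]


# Tenant B asks a question that happens to sit close to tenant A's document.
query_from_b = [1.0, 0.0]
with_filter = search(query_from_b, metadata_filter={"tenant_id": "B"})
without_filter = search(query_from_b)  # same call with the filter dropped
```

With the filter, tenant B gets its own (less similar) document. Without it, the call still succeeds and returns tenant A's document, which is exactly what would get pasted into the prompt.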

This is why logical isolation (namespaces, RBAC, metadata tags) is a ticking time bomb for AI. All your security controls live inside the exact same bug radius as your application code.

If you are serving actual customers, the only way to actually guarantee zero data bleed is physical isolation. Every single user needs their own physically separate database environment. If a retrieval bug happens, the AI literally cannot read another tenant's data because it is simply not in the database it connected to.
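
A small sketch of the same idea with per-tenant databases, using Python's built-in `sqlite3` as a stand-in. The `provision` and `retrieve` helpers and the `TENANT_DBS` routing table are hypothetical names, and in-memory SQLite is just the simplest way to show separate databases in one script.

```python
import sqlite3


def provision(docs):
    # Each call to connect(":memory:") creates a separate, private database.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE docs (body TEXT)")
    conn.executemany("INSERT INTO docs (body) VALUES (?)", [(d,) for d in docs])
    conn.commit()
    return conn


# Hypothetical routing table: tenant ID -> that tenant's own database.
TENANT_DBS = {
    "A": provision(["A: acquisition plan"]),
    "B": provision(["B: onboarding guide"]),
}


def retrieve(tenant_id):
    # Routing is the only tenant-aware step. The query itself has no
    # tenant filter to forget, because other tenants' rows are not here.
    conn = TENANT_DBS[tenant_id]
    return [row[0] for row in conn.execute("SELECT body FROM docs ORDER BY rowid")]
```

Note that the unfiltered `SELECT` is the worst case the retrieval code can produce, and it still only sees one tenant. The trust boundary moves to the routing step, so that lookup is the piece worth testing hardest.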

I know managing 1,000 isolated databases sounds like a DevOps nightmare (Terraform sprawl, proxy routing, etc.), but the orchestration tooling actually exists now to make it manageable.

I am curious for anyone actually building AI agents in here. Are you physically isolating your vector stores per user? Or are you just praying your metadata filters never drop a clause?

u/Accomplished_Bus1320 — 5 days ago

The real reason most SaaS apps run on shared databases isn't architecture. It's operational overhead.

Multi-tenancy comes down to one tradeoff: how much isolation risk are you willing to tolerate?

On one end you tolerate none: no data leakage and no noisy-neighbor risk. Every tenant gets their own database, their own cache, their own everything. That's the cleanest answer for security questionnaires, regulated industries, and any customer who doesn't want their CPU stolen by the loudest neighbor on the box.

On the other end you put everything in one shared database with a tenant ID column on every table. Economies of scale win. Operational cost drops. You eat the noisy-neighbor risk and you trust your code to never leak a row.

Most SaaS teams pick the second option. Not because it's architecturally correct, but because the first option is operationally maddening at any real scale.

Think about what you actually sign up for with per-tenant infrastructure:

  • Provisioning a new database every time a customer signs up
  • Running every schema migration across N tenants instead of once
  • Upgrading database versions, cache layers, and storage engines N times
  • Backing up and restoring N independent systems
  • Troubleshooting issues that hit one tenant but not the others
  • Doing all of this in a way that still scales to thousands of tenants

That's the wall. It's not a database problem, it's an operational layer problem. And it's why most teams retreat to shared schemas even when they know it's the weaker isolation model.

We built TenantsDB to be that operational layer. One proxy across PostgreSQL, MySQL, MongoDB, and Redis. The model is workspace -> blueprint -> tenant. You design the schema in a workspace (a regular database you connect to with any client). The schema is versioned as a blueprint. Each tenant is a deployed instance of that blueprint, as a real isolated database. Add a column in the workspace, deploy to every tenant with one command. Migrations stop being scripts and start being deploys.
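
To show what "migrations as deploys" means mechanically, here is a toy Python sketch using `sqlite3`. It is an assumption-laden illustration, not TenantsDB's implementation: the blueprint is modelled as an ordered list of statements, and SQLite's `PRAGMA user_version` stands in for whatever version tracking a real system uses.

```python
import sqlite3

# Hypothetical blueprint: statement N (1-indexed) takes a tenant to version N.
BLUEPRINT = [
    "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)",
    "ALTER TABLE users ADD COLUMN plan TEXT",
]


def deploy(conn, blueprint):
    # Apply only the statements this tenant has not seen yet.
    current = conn.execute("PRAGMA user_version").fetchone()[0]
    for version, statement in enumerate(blueprint[current:], start=current + 1):
        conn.execute(statement)
        conn.execute(f"PRAGMA user_version = {version}")
    conn.commit()
    return conn.execute("PRAGMA user_version").fetchone()[0]


def deploy_all(tenants, blueprint):
    # One call walks every tenant database and reports the resulting version.
    return {tenant_id: deploy(conn, blueprint) for tenant_id, conn in tenants.items()}


def columns(conn, table):
    # Column names of a table, in declaration order.
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


tenants = {"t1": sqlite3.connect(":memory:"), "t2": sqlite3.connect(":memory:")}
first = deploy_all(tenants, BLUEPRINT[:1])   # initial schema
second = deploy_all(tenants, BLUEPRINT)      # "add a column" rolled out
```

Because each tenant records its own version, rerunning `deploy_all` is a no-op for tenants that are already current, which is what makes retrying a partially failed rollout safe.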

Two isolation levels. L1 shared host (cheap, instant provisioning, still its own database, never mixed data). L2 dedicated host (no noisy neighbors, guaranteed CPU and memory). Move tenants between them with one command. Cutover is roughly 2 seconds via native replication (PG logical, MySQL binlog, Mongo change streams).

Benchmark numbers (5-run medians, 80% reads, 20% writes, stock configs):

| DB | Direct p50 | Proxy p50 | Overhead | Single-tenant QPS | 100-tenant aggregate QPS | Errors |
|---|---|---|---|---|---|---|
| PostgreSQL | 0.82ms | 2.23ms | +1.41ms | 2,039 | 3,926 | 0 |
| MySQL | 1.01ms | 2.34ms | +1.33ms | 1,776 | 2,460 | 0 |
| MongoDB | 1.45ms | 3.32ms | +1.87ms | 1,467 | 1,467 | 0 |
| Redis | 0.66ms | 3.09ms | +2.43ms | 1,260 | 1,195 | 0 |

Zero errors across 2 million queries at 100 concurrent tenants. Under 9-tenant noisy-neighbor pressure (45 concurrent writers per noisy tenant), every engine held sub-17ms latency on the shared host. The dedicated tier removes neighbors entirely.

Free tier is real, no card required. Link: tenantsdb.com

Question: if you're on shared-schema today, is that actually the architecture you'd pick on a clean slate, or is it the choice you settled on because per-tenant looked operationally impossible?

u/Accomplished_Bus1320 — 9 days ago