
pg-status 2.1.0 — HTTP discovery for PostgreSQL streaming replication, now with read-your-writes
Hi r/PostgreSQL!
I've been working on pg-status, a tiny C microservice that polls your PostgreSQL hosts and exposes their status over HTTP — answers questions like "who's the primary?", "which replica is lagging less than 100 ms?", "which replica has already replayed this specific LSN?".
I wrote about version 1.6.1 here; 2.1.0 is now out and the framing of what it's good at has become sharper, so I wanted to share an update.
TL;DR — what it is: a sidecar that lives next to your app, polls a static list of PG hosts in the background, and answers HTTP requests in sub-millisecond time. It is not a SQL proxy — your app still connects to Postgres directly, pg-status just tells it which host.
The headline feature: read-your-writes via min_lsn
This is the thing I'd ask you to look at even if you ignore the rest.
After a write to the primary, capture pg_current_wal_lsn() (returns something like 0/3000060). Pass it to pg-status as a query param:
GET /replica?min_lsn=0/3000060
pg-status returns a replica that has provably replayed up to that LSN. If none has, it falls back to the primary. You can combine this with lag_ms/lag_bytes:
GET /replica?min_lsn=0/3000060&lag_ms=100
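For the app side, a minimal Python sketch of building that request could look like the following. The base URL and the helper names (`PG_STATUS_URL`, `replica_url`, `pick_replica`) are hypothetical, and I'm deliberately not assuming anything about the response format: the helper just returns the body as text.

```python
from typing import Optional
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical sidecar address; adjust to wherever pg-status listens.
PG_STATUS_URL = "http://localhost:8000"


def replica_url(min_lsn: Optional[str] = None,
                lag_ms: Optional[int] = None,
                lag_bytes: Optional[int] = None,
                base: str = PG_STATUS_URL) -> str:
    """Build a /replica URL from whichever constraints are set."""
    params = {}
    if min_lsn is not None:
        params["min_lsn"] = min_lsn
    if lag_ms is not None:
        params["lag_ms"] = lag_ms
    if lag_bytes is not None:
        params["lag_bytes"] = lag_bytes
    # safe="/" keeps the LSN in the same 0/3000060 form used in the examples.
    query = urlencode(params, safe="/")
    return f"{base}/replica" + (f"?{query}" if query else "")


def pick_replica(**constraints) -> str:
    """Ask pg-status for a host and return the raw response body as text."""
    with urlopen(replica_url(**constraints), timeout=1.0) as resp:
        return resp.read().decode().strip()
```

The app would call something like `pick_replica(min_lsn=lsn, lag_ms=100)` right before a read and then open its Postgres connection to whatever host comes back.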
This is real read-your-writes:
On the application's side: capture the LSN immediately after the write (pg_current_wal_lsn()) and carry it to the next read, via a session, cookie, or header, or via Redis if the write and the read land on different application nodes. Every read-your-writes approach needs this step.
What pg-status does: it keeps fresh replica positions in memory from background polling. On a read, the application makes a single HTTP call instead of a round trip to each replica to run pg_last_wal_replay_lsn(), and gets back the name of a host that has already replayed that LSN. As far as I know, none of pgpool-II, HAProxy, or the Patroni REST API offers this particular lookup primitive.
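To make the lookup concrete, here is a rough Python illustration of the idea. It is not pg-status's actual C code: the function names and the dict of cached positions are hypothetical. The one hard fact it relies on is that a PostgreSQL LSN prints as two hexadecimal numbers, the high and low 32 bits of a 64-bit WAL position, so LSNs have to be compared as integers rather than as strings.

```python
from typing import Dict


def parse_lsn(lsn: str) -> int:
    """Convert a textual LSN such as '0/3000060' to a 64-bit integer."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def pick_host(replay_lsns: Dict[str, str], min_lsn: str, primary: str) -> str:
    """Return the first replica whose cached replay LSN is >= min_lsn.

    replay_lsns maps replica host -> last polled replay position.
    Falls back to the primary when no replica has caught up.
    """
    target = parse_lsn(min_lsn)
    for host, replayed in replay_lsns.items():
        if parse_lsn(replayed) >= target:
            return host
    return primary
```

Because a replica's replay position only moves forward, a cached position that is already at or past min_lsn stays valid even if the poll is slightly stale. Stale data can only cause an unnecessary fallback to the primary, never a read that misses the write.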
What's new
- min_lsn query param (above)
- New endpoint /most_sync_by_bytes: deterministic pick of the most-current replica
- Per-request lag thresholds: ?lag_ms=&lag_bytes=. Although, as before, you can set global thresholds through environment variables.
- max_fails/possible_dead: host marked dead only after N consecutive fails, but routing immediately avoids possible_dead primaries if a healthier one exists
- Concurrent non-blocking polling of all hosts through a single poll() syscall (was sequential before, so a slow host blocked the rest)
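The max_fails/possible_dead behaviour is easiest to see as a small state machine. The Python sketch below is my own illustration of the rule as described in the list, not the actual C implementation. The class and function names and the default of 3 are assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class HostHealth:
    max_fails: int = 3  # hypothetical default, not taken from pg-status
    fails: int = 0      # consecutive failed polls

    def record(self, ok: bool) -> None:
        """A successful poll resets the counter; a failed one increments it."""
        self.fails = 0 if ok else self.fails + 1

    @property
    def state(self) -> str:
        if self.fails == 0:
            return "alive"
        if self.fails < self.max_fails:
            return "possible_dead"
        return "dead"


def choose_primary(candidates: List[Tuple[str, HostHealth]]) -> Optional[str]:
    """Prefer an alive primary, then a possible_dead one, never a dead one."""
    for wanted in ("alive", "possible_dead"):
        for host, health in candidates:
            if health.state == wanted:
                return host
    return None
```

The point of the two-level check is that a single failed poll does not evict a host, but it does push that host behind any healthier candidate straight away.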
Limitations
- MAX_HOSTS = 10 is a compile-time cap. If you hit it, please open an issue, easy to bump
- Streaming replication only
- Static host list: adding hosts means restart
- No split-brain quorum. First alive master in pg_status__hosts wins.
Numbers
- 9 MiB RSS
- 1600–2000 RPS on 0.1 CPU; 8600–9000 RPS on 1 CPU
- Fast enough to call on every request
Try it
GitHub: https://github.com/krylosov-aa/pg-status
I'd be very grateful for a star. Issues and comments are welcome as well.
Thanks for reading!