u/Dios_Apolo

I’ve spent the last couple of days building a self-hosted inference governance proxy called Aegis Latent Core (https://github.com/JuanLunaIA/aegis-latent-core). The goal is to record a cryptographically signed chain of custody for every model request and response, alongside real-time token entropy forensics, without adding latency to the user.

To keep the proxy off-path, we hand the telemetry data to a background task that writes to storage. For distributed production environments, we implemented PostgreSQL (using asyncpg pools) and DynamoDB (via aioboto3).

But for small-to-medium edge deployments, I wanted a zero-dependency, zero-ops storage option. I settled on SQLite, but configured with write-ahead logging enabled (PRAGMA journal_mode=WAL). To avoid concurrent write locks and database is locked errors, I'm forcing Uvicorn to run with a single worker when SQLite is active, serializing all writes.

Here is my worry: I’m telling developers this setup is adequate for up to 10 million audit nodes. But I have this nagging feeling that under sudden bursts of high-concurrency client connections, even with WAL mode and off-path background tasks, we will hit a write bottleneck. Under heavy read loads (e.g., pulling compliance bundles while the LLM is streaming generations), will SQLite's single-writer limitation cause the background queue to back up and eventually run the system out of memory?

Is SQLite WAL with workers=1 a practical, low-overhead solution for edge workloads, or is it an architectural anti-pattern that I should replace with an embedded key-value store like RocksDB or LMDB?

The storage layer interface and SQLite implementation are here: https://github.com/JuanLunaIA/aegis-latent-core. I would love for some database engineers to tear our connection pooling and WAL checkpointing logic apart.

Is SQLite WAL with a single worker actually viable for edge MLOps audit logs, or am I setting myself up for corruption?