Hello fellow people from r/databasedevelopment! After many months of late-night experiments, I'm happy to share with you the first version of HedgeDB, a high-performance, persistent key-value store (loosely) inspired by RocksDB!

The project was born because, while working with RocksDB, I grew a bit unhappy with its code bloat and with how hard a time it has keeping up with modern NVMe devices. So I decided to try reinventing the wheel.

Here is the repo on GitHub. I also spent some time preparing a few articles on hedgedb.github.io about architecture design trade-offs, along with a performance comparison between HedgeDB and RocksDB (hopefully the bundled benchmark is "standard enough").

Features and core design

HedgeDB is an LSM-tree engine designed to saturate NVMe devices. Inspired by RocksDB, the engine targets write-heavy workloads with uniformly distributed keys (UUIDs, hashes), and is structured around:

Asynchronous execution. io_uring + C++20 coroutines via TooManyCooks, a fast work-stealing coroutine threadpool.
Partitioned LSM-tree. The key space is sharded into 2^N independent partitions (default 16). Compactions on different partitions run fully in parallel.
Size-tiered compaction. Lower write amplification than leveled compaction, with a quotient filter on the read path to skip SSTs that can't contain a key.
Per-thread WAL. Each writer thread owns its own WAL file, so inode contention is avoided.
Direct I/O. O_DIRECT everywhere on the SST path: predictable latencies and transparent memory usage, avoiding IO stalls from page-cache pressure.
MVCC. Snapshot isolation over range scans.
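
To make the partitioning idea concrete, here is a minimal sketch of how a key could be routed to one of 2^N partitions. This is not HedgeDB's actual code: the names fnv1a64 and partition_of are hypothetical, the choice of FNV-1a as the hash is an assumption, and "default 16" is read here as 16 partitions (N = 4). With keys that are already uniformly distributed (UUIDs, hashes), the top bits of the key itself could probably serve the same purpose.

```cpp
#include <cassert>
#include <cstdint>
#include <string_view>

// Hypothetical helper: 64-bit FNV-1a hash, used here only to turn a key
// into a 64-bit value. The hash HedgeDB really uses is not stated in the post.
inline std::uint64_t fnv1a64(std::string_view key) {
    std::uint64_t h = 14695981039346656037ULL;  // FNV-1a 64-bit offset basis
    for (unsigned char c : key) {
        h ^= c;
        h *= 1099511628211ULL;  // FNV-1a 64-bit prime
    }
    return h;
}

// Hypothetical routing function: the top `bits` bits of the hash select
// one of 2^bits partitions (bits = 4 gives 16 partitions).
inline std::uint32_t partition_of(std::uint64_t hash, unsigned bits) {
    assert(bits <= 32);       // result must fit in 32 bits
    if (bits == 0) return 0;  // single partition; also avoids a shift by 64
    return static_cast<std::uint32_t>(hash >> (64 - bits));
}
```

A writer would call something like partition_of(fnv1a64(key), 4) and hand the operation to that partition. Since a key always maps to the same partition, each partition can own its own set of SSTs and be compacted without coordinating with the others.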

Before you ask: this is not some auto-generated AI slop. I did use coding agents and chatbots for research, prototyping, testing support, and help with proving the correctness of some sections, but any generated code always went through a phase of heavy manual refinement and refactoring.

I hope you will find it interesting!

If you're interested in the project, want to know more, or need anything, we can keep in touch on the Discord channel!

u/IlPresidente995

The case for Direct I/O - why it matters for high performance storage

I built HedgeDB, a high-performance and persisted Key Value store
