u/River_Bass

Open source data governance compiler for PostgreSQL
r/dataengineering

I never thought of data governance as a sexy topic. My main focus has always been on performance, insights, and cost reduction.

That is, until I joined a startup as the sole data engineer. Dealing with tons of PII/PHI, I realized just how much effort it takes to write custom tooling for everything: endless GRANTs, trigger functions for versioning, cron jobs for retention. All of it needed constant attention and maintenance. The alternative was an off-the-shelf product that's a complete black box with a learning curve of its own.

Always one to prefer spending 10x longer automating a task than just doing it, I built a CLI tool that lets you define your DB/governance specs in declarative YAML files and writes all the SQL for you. It's open source, fully transparent, as secure as I could make it, and hopefully super user-friendly too.
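
To give a rough feel for the spec-to-SQL idea, here is a minimal Python sketch. It is not the tool's actual code or spec format: the `SPEC` layout, the role and table names, and the `compile_grants` helper are all made up for illustration, and the spec is a plain dict standing in for a parsed YAML file.

```python
# Hypothetical spec: stands in for a parsed YAML file.
# The real tool's schema is not shown in this post.
SPEC = {
    "tables": {
        "patients": {
            "grants": {
                "analyst": ["SELECT"],
                "app_rw": ["SELECT", "INSERT", "UPDATE"],
            },
        },
    },
}

# Privileges this sketch is willing to emit (an assumption, not the tool's list).
ALLOWED_PRIVILEGES = {"SELECT", "INSERT", "UPDATE", "DELETE"}


def quote_ident(name: str) -> str:
    """Quote a PostgreSQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def compile_grants(spec: dict) -> list[str]:
    """Turn the declarative spec into a deterministic list of GRANT statements."""
    statements = []
    for table, cfg in sorted(spec["tables"].items()):
        for role, privileges in sorted(cfg.get("grants", {}).items()):
            unknown = set(privileges) - ALLOWED_PRIVILEGES
            if unknown:
                raise ValueError(f"unsupported privileges for {role}: {sorted(unknown)}")
            privs = ", ".join(privileges)
            statements.append(
                f"GRANT {privs} ON TABLE {quote_ident(table)} TO {quote_ident(role)};"
            )
    return statements


if __name__ == "__main__":
    for stmt in compile_grants(SPEC):
        print(stmt)
```

Sorting tables and roles keeps the generated SQL stable between runs, which makes diffs of the output reviewable. Versioning triggers and retention jobs could presumably be generated from the same kind of spec, but that is beyond this sketch.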

I've linked the first release in my repo. Anyone want to try it out?

In the interest of transparency: I did code this with assistance from Claude, but I've been in data engineering for almost 20 years and manually debugged every line. I also got it to build me a suite of over 300 tests that run through GitHub Actions automatically on each commit.

github.com
u/River_Bass — 1 day ago