
dbt-colibri v0.3.4: local column-level lineage for your dbt projects.
https://reddit.com/link/1thhk5f/video/ftit6fk3a22h1/player
(Disclosure: I'm the maintainer of dbt-colibri and also building the hosted version)
Hey r/dataengineering,
Quick update on dbt-colibri, an open-source CLI tool that generates a static
HTML column-level lineage report from your dbt manifest + catalog.
Background, in case you haven't seen it: dbt Core's native lineage is
table-level. dbt-colibri could replace dbt-docs for most teams: it runs locally, parses your project with SQLGlot, and outputs a single self-contained HTML file you can open locally or host (e.g. on GitHub Pages) for your team.
It's been a while since I last posted about it, and some cool things have shipped:
- Redesigned UI and improved search across models, columns, tags, and code.
- Shortcuts for quick navigation. (I especially like shift+number / number to open children/parents.)
- The lineage graph should now feel like a whiteboard: align nodes, select multiple nodes, hide/show nodes, etc.
- Column lineage now follows columns through WHERE/JOIN clauses for more complete impact analysis.
- Ephemeral model column lineage is now supported (these are models without materialized tables/views, like a CTE but defined as a separate dbt model).
- Exposures are included in the graph.
- ~1.9x faster parsing of large projects, thanks to the SQLGlot mypyc update and optimizing how the parser walks through large manifests.
- Better warnings in the UI when the manifest/catalog are incomplete and cause issues in column lineage.
- New supported adapters. The full list is now: Snowflake, BigQuery, Redshift, Postgres, DuckDB, Databricks (SQL models), Athena, Trino, SQL Server, ClickHouse, Oracle.
- A lot of edge cases and teething issues related to column lineage got resolved with input from the community. Thank you!
Install:
pip install dbt-colibri
dbt compile && dbt docs generate # to generate catalog and dbt manifest
colibri generate
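If column lineage looks incomplete, it can help to check the artifacts first. The sketch below is a rough standalone check, not something colibri ships: it assumes the default `target/` directory, and the function names are made up. It lists models that are in `manifest.json` but have no columns in `catalog.json`.

```python
import json
from pathlib import Path


def models_missing_from_catalog(manifest: dict, catalog: dict) -> list[str]:
    """List model unique_ids that have no column metadata in the catalog."""
    catalog_nodes = catalog.get("nodes", {})
    missing = []
    for unique_id, node in manifest.get("nodes", {}).items():
        if node.get("resource_type") != "model":
            continue
        # Ephemeral models are never materialized, so they are expected
        # to be absent from the catalog.
        if node.get("config", {}).get("materialized") == "ephemeral":
            continue
        if not catalog_nodes.get(unique_id, {}).get("columns"):
            missing.append(unique_id)
    return sorted(missing)


def load_artifacts(target_dir: str = "target") -> tuple[dict, dict]:
    """Read manifest.json and catalog.json from a dbt target directory."""
    target = Path(target_dir)
    manifest = json.loads((target / "manifest.json").read_text())
    catalog = json.loads((target / "catalog.json").read_text())
    return manifest, catalog


if __name__ == "__main__":
    for unique_id in models_missing_from_catalog(*load_artifacts()):
        print(f"no catalog columns for {unique_id}")
```

A model usually shows up here when it was never built in the target database or when `dbt docs generate` ran against a different target.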
Repo: https://github.com/b-ned/dbt-colibri
Let me know if you find any bugs/edge cases where you see column lineage breaking; the goal is perfect column lineage.
Bas