u/FanFar9578

dbt-colibri v0.3.4: local column-level lineage for your dbt projects.

https://reddit.com/link/1thhk5f/video/ftit6fk3a22h1/player

(Disclosure: I'm the maintainer of dbt-colibri and also building the hosted version)

Hey r/dataengineering,

Quick update on dbt-colibri, an open-source CLI tool that generates a static
HTML column-level lineage report from your dbt manifest + catalog.

Background, in case you haven't seen it: dbt Core's native lineage is
table-level. dbt-colibri could replace dbt-docs for most teams: it runs locally, parses your project with SQLGlot, and outputs a single self-contained HTML file that you can open locally or host (e.g. on GitHub Pages) for your team.

It's been a while since I last posted about it, and some cool things have shipped:

  • Redesigned UI and improved search across models, columns, tags, and code
  • Keyboard shortcuts for quick navigation (I especially like shift+number / number to open children/parents)
  • The lineage graph should now feel like a whiteboard: align nodes, select multiple nodes, hide/show nodes, etc.
  • Column lineage now follows columns through WHERE/JOIN clauses for more complete impact analysis.
  • Ephemeral model column lineage is now supported (these are models that aren't materialized as tables/views; like a CTE, but defined as a separate dbt model)
  • Exposures are included in the graph.
  • ~1.9x faster parsing of large projects, thanks to SQLGlot's mypyc update and optimizations to how the parser walks through large manifests
  • Better warnings in the UI when the manifest/catalog is incomplete and causes issues in column lineage
  • New supported adapters; the full list is now: Snowflake, BigQuery, Redshift, Postgres, DuckDB, Databricks (SQL models), Athena, Trino, SQL Server, ClickHouse, Oracle
  • A lot of edge cases and teething issues related to column lineage got resolved with input from the community. Thank you!
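
To make the impact-analysis point concrete, here is a rough sketch of the kind of question a column-level graph can answer: "if I change this column, what is affected downstream?" This is not dbt-colibri's code or data format. The model/column names, the edge list, and the `downstream` helper are all made up for illustration, and the filter/join edges are simplified to point at a single downstream column.

```python
from collections import deque

# Hypothetical column-level edges: (upstream column, downstream column, kind).
# "select" = the value flows into the column; "filter"/"join" = the column is
# only used in a WHERE/JOIN clause but still influences the result.
EDGES = [
    ("stg_orders.order_id", "fct_orders.order_id", "select"),
    ("stg_orders.amount", "fct_orders.revenue", "select"),
    ("stg_orders.status", "fct_orders.revenue", "filter"),
    ("stg_customers.customer_id", "fct_orders.customer_name", "join"),
    ("stg_customers.name", "fct_orders.customer_name", "select"),
    ("fct_orders.revenue", "rpt_daily.total_revenue", "select"),
]


def downstream(column, edges, include_filters=True):
    """Return every column reachable from `column`, sorted by name."""
    children = {}
    for up, down, kind in edges:
        if kind != "select" and not include_filters:
            continue  # ignore WHERE/JOIN-only dependencies
        children.setdefault(up, []).append(down)

    # Breadth-first walk over the dependency graph.
    seen = set()
    queue = deque([column])
    while queue:
        current = queue.popleft()
        for nxt in children.get(current, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return sorted(seen)


# With filter edges, a change to `status` shows up as affecting revenue.
print(downstream("stg_orders.status", EDGES))
# Without them, the same column looks like it has no downstream impact.
print(downstream("stg_orders.status", EDGES, include_filters=False))
```

The second call is the gap that following WHERE/JOIN clauses is meant to close: a column that only appears in a filter would otherwise look safe to change.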

Install:

pip install dbt-colibri
dbt compile && dbt docs generate  # generates the dbt manifest and catalog
colibri generate
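
If the report looks incomplete, a quick sanity check on the two artifacts can help before filing a bug. The sketch below is not part of dbt-colibri; the function names are hypothetical. It assumes the default `target/` directory and relies only on the standard artifact layout (a top-level `nodes` mapping keyed by unique ID in both `manifest.json` and `catalog.json`). It lists models that are in the manifest but missing from the catalog, skipping ephemeral models since they are never built in the warehouse.

```python
import json
from pathlib import Path


def load_artifacts(target_dir="target"):
    """Load manifest.json and catalog.json from the dbt target directory."""
    target = Path(target_dir)
    manifest = json.loads((target / "manifest.json").read_text())
    catalog = json.loads((target / "catalog.json").read_text())
    return manifest, catalog


def models_missing_from_catalog(manifest, catalog):
    """Return unique IDs of non-ephemeral models absent from the catalog."""
    catalog_nodes = catalog.get("nodes", {})
    missing = []
    for unique_id, node in manifest.get("nodes", {}).items():
        if node.get("resource_type") != "model":
            continue  # skip tests, seeds, snapshots, etc.
        if node.get("config", {}).get("materialized") == "ephemeral":
            continue  # ephemeral models never exist in the warehouse
        if unique_id not in catalog_nodes:
            missing.append(unique_id)
    return sorted(missing)


# Usage, after running `dbt docs generate`:
#   manifest, catalog = load_artifacts()
#   print(models_missing_from_catalog(manifest, catalog))
```

A non-empty result usually means those models have not been built in the target warehouse yet, so the catalog has no column metadata for them.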

Repo: https://github.com/b-ned/dbt-colibri

Let me know if you find any bugs/edge cases where you see column lineage breaking; the goal is perfect column lineage.

Bas

u/FanFar9578 — 4 days ago