r/Clickhouse

▲ 27 r/Clickhouse+2 crossposts

Cerberus: A drop-in Prometheus, Loki & Tempo gateway for ClickHouse

Translate PromQL, LogQL, and TraceQL into optimized CH SQL — keep Grafana, swap the backend.

▲ 11 r/Clickhouse+1 crossposts

RFC: building a ClickHouse DevExperience platform

I’m developing a ClickHouse developer experience platform. In the same way Postgres underpins much of software development, ClickHouse is becoming the de facto choice for OLAP analytics, offering high‑performance queries out of the box.

Currently, working with ClickHouse is cumbersome: there are no built‑in APIs. My goal is to create a “Supabase for ClickHouse”: a layer that abstracts away these low‑level details, the way Supabase does for Postgres.

The primary pain point I want to address is database transformation. Tools such as dbt and SQLMesh are powerful but require technical expertise. I aim to build a layer that lets users focus on their use cases rather than on implementation details. For example, users should not need to decide whether to create materialised views or tables; they should simply specify:

append‑only data
replace semantics
aggregation requirements
schema evolution
changes to the ORDER BY clause
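
To make the idea concrete, here is a minimal sketch of how a declared intent could be mapped to a ClickHouse table engine. The intent names, `TableSpec` and `render_create_table` are hypothetical and not taken from the pragmata repo; only the engine names (MergeTree, ReplacingMergeTree, AggregatingMergeTree) are real ClickHouse engines.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical intent names mapped to real ClickHouse table engines.
ENGINE_FOR_INTENT = {
    "append_only": "MergeTree",
    "replace": "ReplacingMergeTree",
    "aggregate": "AggregatingMergeTree",
}


@dataclass
class TableSpec:
    name: str
    columns: Dict[str, str]          # column name -> ClickHouse type
    order_by: List[str]
    intent: str = "append_only"
    version_column: Optional[str] = None  # only meaningful for "replace"


def render_create_table(spec: TableSpec) -> str:
    """Render CREATE TABLE DDL from a declared intent (illustrative only)."""
    if spec.intent not in ENGINE_FOR_INTENT:
        raise ValueError(f"unknown intent: {spec.intent}")
    engine = ENGINE_FOR_INTENT[spec.intent]
    # ReplacingMergeTree optionally takes a version column as its argument.
    if spec.intent == "replace" and spec.version_column:
        engine = f"{engine}({spec.version_column})"
    cols = ",\n".join(f"    {name} {typ}" for name, typ in spec.columns.items())
    return (
        f"CREATE TABLE {spec.name}\n(\n{cols}\n)\n"
        f"ENGINE = {engine}\n"
        f"ORDER BY ({', '.join(spec.order_by)})"
    )


spec = TableSpec(
    name="events",
    columns={"id": "UInt64", "ts": "DateTime"},
    order_by=["id"],
    intent="replace",
    version_column="ts",
)
print(render_create_table(spec))
```

The user states "replace semantics" and the layer picks the engine; whether a real platform would resolve intents this way is an assumption.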

Other challenges include:

Type ambiguity in queries: whether a key or field is an integer or a string. When users interact directly with ClickHouse, they must handle both cases or decide which columns and types to use.
RLS (row-level security) is not available.
API authorization and authentication are not provided and would need to be built on top.

These are some of the areas where I believe I can create an experience platform on top of ClickHouse.

I have been working on this for three weeks and expect another three weeks to complete a prototype. The idea was inspired by Tinybird, and I believe an open‑source alternative could fill a gap in the ClickHouse ecosystem. I would appreciate any feedback, suggestions for other problems that could be solved on top of ClickHouse, or interest in collaborating.

Ongoing work: https://github.com/gear6io/pragmata

u/piyushsingariya — 8 days ago

▲ 15 r/Clickhouse

We open-sourced a Drizzle-style schema + migration tool for ClickHouse (TypeScript)

We open-sourced chkit (MIT): it defines your ClickHouse tables, views and materialized views as TypeScript, diffs them against the live database, generates the migration SQL, and fails CI when prod drifts.

We built it after running ClickHouse at near-petabyte scale at our last company (Numia): hundreds of tables, a lot of materialized views, several environments, all managed with hand-written DDL and hope.

If you run ClickHouse in production, you've hit some version of these:

Someone runs a manual ALTER at 3am to clear an incident. It never makes it back into a migration file. Now your repo and your database disagree, and nothing tells you.
A one-line schema edit looks harmless but touches ORDER BY (or the engine, or PARTITION BY). ClickHouse has no in-place ALTER for those, so the only honest migration is create, copy, swap. You find out when a "quick change" starts rewriting a 2TB table in prod.
A migration drops a column. The reviewer misses it. It runs in CI on a Friday. The data is gone.
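
The three failure modes above come down to classifying each difference between the live table and the desired one. The sketch below is a simplified illustration, not chkit's actual implementation or API: the `Table` shape and the `classify_changes` helper are hypothetical, and the rule that ORDER BY, engine and PARTITION BY changes need a rebuild follows the claim made in this post.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Table:
    engine: str
    order_by: Tuple[str, ...]
    partition_by: str
    columns: Dict[str, str] = field(default_factory=dict)  # name -> type


def classify_changes(live: Table, desired: Table) -> List[Tuple[str, str]]:
    """Return (kind, detail) pairs; kind is 'rebuild', 'destructive' or 'alter'."""
    findings: List[Tuple[str, str]] = []
    # Per the post: no in-place ALTER for these, so plan create, copy, swap.
    for attr in ("engine", "order_by", "partition_by"):
        if getattr(live, attr) != getattr(desired, attr):
            findings.append(("rebuild", attr))
    # Columns present in the live table but missing from the schema lose data.
    for name in sorted(live.columns.keys() - desired.columns.keys()):
        findings.append(("destructive", f"DROP COLUMN {name}"))
    # New columns can be added with a plain ALTER TABLE ... ADD COLUMN.
    for name in sorted(desired.columns.keys() - live.columns.keys()):
        findings.append(("alter", f"ADD COLUMN {name} {desired.columns[name]}"))
    # Type changes on existing columns map to MODIFY COLUMN.
    for name in sorted(live.columns.keys() & desired.columns.keys()):
        if live.columns[name] != desired.columns[name]:
            findings.append(("alter", f"MODIFY COLUMN {name} {desired.columns[name]}"))
    return findings


live = Table("MergeTree", ("ts",), "toYYYYMM(ts)",
             {"ts": "DateTime", "old": "String"})
desired = Table("MergeTree", ("user_id", "ts"), "toYYYYMM(ts)",
                {"ts": "DateTime", "user_id": "UInt64"})
for kind, detail in classify_changes(live, desired):
    print(kind, detail)
```

A CI gate could then fail whenever any finding is `rebuild` or `destructive` and has not been explicitly approved. An empty result means the live table matches the schema, so the same comparison also serves as a drift check.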

Postgres and MySQL have had this for years with Drizzle and Prisma: diff the schema, generate the migration, gate CI on drift. We wanted that for ClickHouse, couldn't find it, and built it.

You define your schema as TypeScript values, and chkit takes it from there:

chkit writes the migrations for you, and won't rewrite a 2TB table or let a DROP through CI without showing you first.
chkit drift compares the live DB to your schema, down to settings, TTL, ORDER BY and projections, and gates CI in one line.
Codegen turns the same schema into TypeScript row types and typed helpers to read and write rows, so your app and your database can't diverge. Optional, skip it if you only want migrations.

It's not an ORM. No query builder, you write your own SQL. Works with any ClickHouse (Cloud, Altinity, self-hosted, or managed), no lock-in. A Python port lands in a few weeks.

If you're already on ClickHouse, chkit can introspect your live DB and generate the schema files, so you start from what you're running instead of a blank file.

npm create chkit@latest

Beta: stable enough to run our own production workloads, with small breaking changes possible before 1.0.

Repo: https://github.com/obsessiondb/chkit

Docs: https://chkit.obsessiondb.com

If you run ClickHouse, I'm curious what you've had to build around it yourself, migrations or otherwise, and where the tooling still falls short.

PS: Python port coming soon.

u/m0rcs — 11 days ago

▲ 37 r/Clickhouse+10 crossposts

Do you actually need Kafka between your OTel collector and ClickHouse?

Kafka → ClickHouse is the default pattern for OTel pipelines, and for org-wide streaming with replay and many consumers it's a great fit. But for a lot of single-sink observability setups, it's a cluster you're babysitting for no reason.

This post compares where the Kafka layer does real work vs. where you can drop it. It also checks what processing the Collector can and can't do on its own (stateful dedup, enrichment-conditional filtering, dynamic sampling, etc.).
https://www.glassflow.dev/blog/opentelemetry-to-clickhouse-do-you-need-kafka?utm_source=reddit&utm_medium=socialmedia&utm_campaign=reddit_organic

Curious what others run:

Kafka buffer,
straight from the collector, or
a lighter processor in between

Leave your comments below. I'd like to discuss the options and understand what folks are using these days!


u/Marksfik — 14 days ago

▲ 8 r/Clickhouse+1 crossposts

Postgres and ClickHouse, and the future long-term plan?

Hey there! We run a Postgres replica of our operational (transactional) Postgres, plus ClickHouse. We treat the replica as our analytics DWH: dbt runs in it, and our BI layer is connected to it.
We have events data stored in ClickHouse, but it is not in use at the moment.
Moving forward, what is my best long-term solution? I need to bring the events data into our analytics DWH, so this is a natural decision point: keep committing to Postgres, move the analytics work and dbt over to ClickHouse, or explore other possibilities. We only want self-hosted options.
Thanks!


u/Novel-Information776 — 13 days ago

▲ 12 r/Clickhouse

I vibecoded ClickLens in case you want to perform deep-dives on your queries

I had to perform a deep-dive into a query recently to investigate why it was running slowly. It didn't take long before I got tired of running queries manually on a number of logging tables. That's why I decided to vibecode a tool for getting insights more conveniently. And now I've published an open-source tool specifically for this purpose: ClickLens. You can find more information here: https://github.com/nimbleflux/clickhouse-query-analyzer/

Let me know if you have any suggestions. It's a single stateless container that's easy to run locally or to deploy.

Edit: screenshots are slightly outdated, as I've since renamed the project to ClickLens.

u/bartcode — 12 days ago

▲ 19 r/Clickhouse

Why we rewrote WAL-G for Postgres backups in Rust: Meet WAL-RUS

clickhouse.com

u/saipeerdb — 11 days ago
