u/Think-Strain-6274

▲ 1 r/SQL

Dimensional Modeling: Handling mixed granularity and broken hierarchies between Ad Platforms and Web Analytics (GA4)

Hi everyone,

I’m currently building a Data Warehouse (PostgreSQL) to consolidate marketing data, and I'm facing an architectural dilemma regarding dimensional hierarchies.

The Setup:

I’m extracting performance data from Google Ads and Meta Ads. I built a Snowflake-like schema with strict 1:N relati

reddit.com
u/Think-Strain-6274 — 24 hours ago

Dimensional Modeling: Handling mixed granularity and broken hierarchies between Ad Platforms and Web Analytics (GA4)

Hi everyone,

I’m currently building a Data Warehouse (PostgreSQL) to consolidate marketing data, and I'm facing an architectural dilemma regarding dimensional hierarchies.

The Setup:

I’m extracting performance data from Google Ads and Meta Ads. I built a Snowflake-like schema with strict 1:N relationships to enforce data integrity:

dim_ad_group (N:1) -> dim_campaign (N:1) -> dim_channel

For the ad platforms, this strict hierarchy works perfectly. A specific Ad Group belongs to exactly one Campaign, and a Campaign belongs to exactly one Channel (e.g., "Paid Social" or "Paid Search").

The Problem:

I am now integrating Google Analytics (GA4) traffic data into a new fact table (fact_web_traffic). GA4 data introduces mixed granularity and missing attributes. A lot of traffic comes in as (not set) for Ad Groups or Campaigns (e.g., Organic Search, Direct, Email, or Performance Max campaigns).

My dilemma with the solutions:

Using NULLs in the Fact Table: I could leave the campaign_id and ad_group_id as NULL in the fact table for non-paid traffic. However, this feels not professional

Using a Default "Dummy" Member (e.g., ID = -1): If I create a single (not set) dummy record in dim_campaign, I break the 1:N hierarchy because that single dummy campaign would need to map to multiple channels (Organic, Direct, Email) simultaneously, which my schema doesn't allow.

What is the industry standard / best practice to resolve this?

Should I generate multiple dummy records (one for each non-paid channel)? Or is there a completely different design pattern for merging strict Ad hierarchies with fluid Web Analytics data?

Thanks in advance!

reddit.com
u/Think-Strain-6274 — 24 hours ago