u/Few-Carry-2850

r/DataBuildTool

Issues with auto-increment columns (dbt + Snowflake)

I’m new to dbt and looking for guidance on handling slowly changing dimension (SCD) loads in a medallion architecture.

Our setup looks like this: 

  • Landing 
  • Bronze layer (truncate and load) 
  • Silver layer (enriched layer with SCD processing) 
  • Gold layer (only active/current records)

In the Silver layer, we’re loading data using: 

  • an auto-incrementing ID column 
  • a hash column derived from that ID

The initial load works fine, but during incremental loads we’re running into issues such as: 

  • duplicate ID values 
  • intermittent load failures 
  • inconsistent data during merges
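
For illustration only, here is a minimal Python sketch of one way duplicate IDs can appear during incremental loads. It assumes the next ID is derived from the current maximum in the target table, and that two runs (for example an overlapping job or a retry) both read the same state before either one writes. The post does not say how the IDs are actually generated, so the helper `assign_ids` and the sample rows are hypothetical.

```python
def assign_ids(existing_ids, new_rows):
    """Hypothetical ID assignment: continue numbering from the current max ID."""
    start = max(existing_ids, default=0)
    return [(start + i, row) for i, row in enumerate(new_rows, start=1)]


# Target table after the initial load (made-up sample data).
silver = [(1, "cust_a"), (2, "cust_b")]

# Assumption: both runs snapshot the table before either one commits.
snapshot = [id_ for id_, _ in silver]
run_1 = assign_ids(snapshot, ["cust_c"])
run_2 = assign_ids(snapshot, ["cust_d"])

# Both runs append their rows, each believing 3 is the next free ID.
silver += run_1 + run_2

ids = [id_ for id_, _ in silver]
duplicate_ids = sorted({i for i in ids if ids.count(i) > 1})
print(duplicate_ids)  # [3]
```

If the hash column is computed from the ID, any collision or renumbering of the ID carries straight through to the hash, which could also explain merges matching the wrong rows.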

I’m trying to understand the best practice for handling auto-increment/surrogate keys and hash columns in SCD implementations with dbt, especially for incremental models. 
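
One commonly suggested alternative is a deterministic surrogate key hashed from the business key plus the version's effective date, so a rerun produces the same key instead of a new number. The sketch below shows the idea in plain Python; the function name `surrogate_key`, the `|` separator, and the sample values are assumptions for illustration, not something taken from this pipeline. The dbt-utils package ships a `generate_surrogate_key` macro built on the same general idea.

```python
import hashlib


def surrogate_key(*parts):
    """Hash the given key parts into a stable hex digest.

    None is mapped to an empty string here, which means None and ""
    produce the same key; a real implementation may want a distinct
    placeholder for nulls.
    """
    normalized = ["" if p is None else str(p) for p in parts]
    return hashlib.md5("|".join(normalized).encode("utf-8")).hexdigest()


# Hypothetical business key and effective dates.
first_run = surrogate_key("CUST-001", "2024-01-01")
rerun = surrogate_key("CUST-001", "2024-01-01")
next_version = surrogate_key("CUST-001", "2024-06-01")

print(first_run == rerun)         # True: same inputs give the same key
print(first_run == next_version)  # False: a new SCD version gets a new key
```

Because the key depends only on the row's own values, it does not rely on load order or on what is already in the target table.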

Has anyone faced a similar issue or can suggest a recommended approach? 

u/Few-Carry-2850 — 7 days ago