How to Implement Slowly Changing Dimensions in Streaming
Slowly Changing Dimensions (SCDs) track how dimensional data (customer attributes, product categories, pricing tiers) changes over time. In streaming, SCDs are handled naturally through CDC — every change is captured as an event, and streaming SQL can maintain both current and historical state.
SCD Types in Streaming
| SCD Type | Behavior | Streaming Implementation |
| Type 1 | Overwrite (no history) | CDC table = always current |
| Type 2 | Add new row (keep history) | Event stream + window function |
| Type 3 | Add columns (limited history) | CDC table with previous_value column |
Type 1 (Current State Only)
-- CDC table automatically reflects current state
CREATE TABLE customers (...) FROM pg_source TABLE 'public.customers';
-- Always current, no history
SELECT * FROM customers WHERE customer_id = 123;
Type 2 (Full History)
-- Track all changes with effective dates
CREATE MATERIALIZED VIEW customer_history AS
SELECT customer_id, name, plan, updated_at as effective_from,
LEAD(updated_at) OVER (PARTITION BY customer_id ORDER BY updated_at) as effective_to,
CASE WHEN LEAD(updated_at) OVER (PARTITION BY customer_id ORDER BY updated_at) IS NULL
THEN true ELSE false END as is_current
FROM customer_changes;
Frequently Asked Questions
Does streaming handle SCD Type 2 well?
Yes. CDC captures every change as an event. Streaming SQL with window functions (LAG, LEAD) naturally tracks effective dates and current flags — no batch SCD job needed.
How does SCD in streaming compare to dbt snapshots?
dbt snapshots run on a schedule (hourly/daily) and may miss rapid changes. Streaming CDC captures every change immediately, providing complete history without gaps.

