# How to Migrate from Batch ETL to Streaming ETL with RisingWave
Migrating from batch ETL (dbt + Airflow) to streaming ETL (RisingWave) can be done incrementally — one pipeline at a time. Here's a step-by-step approach.
## Migration Strategy

### Phase 1: Identify High-Value Pipelines
Pick pipelines where freshness matters most:
- Real-time dashboards currently refreshed hourly
- CDC replication with nightly batch loads
- Alerting based on stale data
### Phase 2: Set Up a Streaming Source

```sql
-- Replace the scheduled full-table extract with CDC
CREATE SOURCE pg_source WITH (connector='postgres-cdc', ...);
CREATE TABLE orders (...) FROM pg_source TABLE 'public.orders';
```
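Filled out, the source definition might look like the sketch below. The connection details and the `orders` schema are placeholders; substitute your own hostname, credentials, database name, and columns.

```sql
-- Hypothetical connection parameters; replace with your own.
CREATE SOURCE pg_source WITH (
    connector = 'postgres-cdc',
    hostname = 'db.example.com',
    port = '5432',
    username = 'rw_replication',
    password = 'secret',
    database.name = 'shop'
);

-- Assumed schema for illustration.
CREATE TABLE orders (
    order_id BIGINT PRIMARY KEY,
    region VARCHAR,
    amount NUMERIC,
    order_time TIMESTAMP
) FROM pg_source TABLE 'public.orders';
```

The Postgres side must have logical replication enabled (`wal_level = logical`) and the replication user needs the appropriate privileges.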
### Phase 3: Convert dbt Models to Materialized Views

```sql
-- dbt model (batch, hourly):
-- SELECT region, SUM(amount) FROM orders GROUP BY region

-- RisingWave equivalent (streaming, real-time):
CREATE MATERIALIZED VIEW revenue_by_region AS
SELECT region, SUM(amount) AS revenue FROM orders GROUP BY region;
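Streaming SQL also gives you constructs that batch models approximate with schedule boundaries, such as time windows. As a sketch, assuming `orders` carries an `order_time` timestamp column, an hourly tumbling window looks like this:

```sql
-- Hourly revenue per region using a tumbling window
-- (order_time is an assumed column name).
CREATE MATERIALIZED VIEW hourly_revenue_by_region AS
SELECT
    window_start,
    region,
    SUM(amount) AS revenue
FROM TUMBLE(orders, order_time, INTERVAL '1 hour')
GROUP BY window_start, region;
```

Each window's result is finalized continuously rather than recomputed on the next scheduled run.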
### Phase 4: Sink to the Same Destination

```sql
-- Sink to the same warehouse/lakehouse your dbt models target
CREATE SINK to_iceberg AS SELECT * FROM revenue_by_region
WITH (connector='iceberg', ...);
```
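A more complete sink definition might resemble the following. The catalog and storage parameters vary by setup (catalog type, object store, credentials), so treat every value here as a placeholder and consult the connector documentation for your environment.

```sql
-- Hypothetical Iceberg sink; all parameter values are placeholders.
CREATE SINK to_iceberg AS SELECT * FROM revenue_by_region
WITH (
    connector = 'iceberg',
    type = 'upsert',
    primary_key = 'region',
    warehouse.path = 's3://lake/warehouse',
    database.name = 'analytics',
    table.name = 'revenue_by_region'
);
```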
### Phase 5: Decommission the Batch Pipeline
Once the streaming pipeline is stable, disable the corresponding Airflow DAG and dbt model.
## What Changes, What Stays

| Aspect | Batch (Before) | Streaming (After) |
|---|---|---|
| Orchestration | Airflow DAGs | Not needed (always-on) |
| Transformation | dbt SQL | RisingWave SQL (very similar) |
| Freshness | Hourly | Sub-second |
| Monitoring | DAG success/failure | Pipeline health views |
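For the monitoring column, RisingWave exposes system catalogs you can query directly instead of checking DAG run status. The catalog names below are assumptions based on the `rw_catalog` schema; verify them against the documentation for your version.

```sql
-- List the materialized views and sinks the cluster knows about
-- (rw_catalog view names assumed; check your version's docs).
SELECT name FROM rw_catalog.rw_materialized_views;
SELECT name FROM rw_catalog.rw_sinks;
```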
## Frequently Asked Questions

### Can I run batch and streaming in parallel during migration?
Yes. Run both pipelines simultaneously and compare results. Once the streaming pipeline matches the batch output, decommission the batch pipeline.
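One simple way to compare the two outputs, assuming both land in the same warehouse, is a symmetric difference query. The table names below are hypothetical; substitute whatever your batch and streaming pipelines actually write to.

```sql
-- Rows present in one output but not the other; an empty result
-- means the pipelines agree. Table names are hypothetical.
(SELECT region, revenue FROM batch_revenue_by_region
 EXCEPT
 SELECT region, revenue FROM revenue_by_region)
UNION ALL
(SELECT region, revenue FROM revenue_by_region
 EXCEPT
 SELECT region, revenue FROM batch_revenue_by_region);
```

Expect transient differences while comparing, since the streaming side reflects rows the batch side won't pick up until its next scheduled run.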
### How similar is dbt SQL to RisingWave SQL?

Very similar. Both use standard SQL. The main difference: dbt models run on a schedule, while RisingWave materialized views run continuously. Most dbt SQL translates directly to RisingWave with minimal changes.

