How to Handle Schema Changes in Production Streaming Pipelines

Schema changes in production streaming pipelines — new columns, renamed fields, type changes — can break downstream consumers. This guide covers safe schema change strategies: backward-compatible changes, schema registries, and RisingWave's approach to schema evolution.
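To make "backward-compatible" concrete, here is a minimal Python sketch that compares two versions of a schema and sorts the differences into breaking and safe changes. The rule set is a simplifying assumption, not the behavior of RisingWave or of any particular schema registry: adding a nullable column is treated as safe, while removing or renaming a column, changing a type, or adding a required column is treated as breaking. `Column`, `classify_changes`, and the example schemas are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    name: str
    type: str
    nullable: bool = True


def classify_changes(old, new):
    """Compare two column lists and return (breaking, safe) message lists.

    Assumed rule set (illustrative only): consumers written against the
    old schema must keep working against the new one.
    """
    old_by_name = {c.name: c for c in old}
    new_by_name = {c.name: c for c in new}
    breaking, safe = [], []

    for name, col in old_by_name.items():
        if name not in new_by_name:
            # A rename is indistinguishable from a drop here: the old name is gone.
            breaking.append(f"column removed or renamed: {name}")
        elif new_by_name[name].type != col.type:
            breaking.append(
                f"type changed: {name} {col.type} -> {new_by_name[name].type}"
            )

    for name, col in new_by_name.items():
        if name not in old_by_name:
            if col.nullable:
                safe.append(f"nullable column added: {name}")
            else:
                breaking.append(f"required column added: {name}")

    return breaking, safe


# Hypothetical schemas: one type change, one rename, one new nullable column.
old_schema = [
    Column("id", "bigint", nullable=False),
    Column("amount", "double"),
    Column("ts", "timestamp"),
]
new_schema = [
    Column("id", "bigint", nullable=False),
    Column("amount", "decimal"),
    Column("event_ts", "timestamp"),
    Column("currency", "varchar"),
]

breaking, safe = classify_changes(old_schema, new_schema)
for message in breaking:
    print("BREAKING:", message)
for message in safe:
    print("safe:", message)
```

A check like this could run in CI before a producer deploys a new schema version. Note that the rename of `ts` to `event_ts` shows up as one breaking removal plus one safe addition, which is why renames are usually rolled out as "add the new field, migrate consumers, then drop the old field".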

Why Streaming?

Traditional batch pipelines recompute metrics hourly or daily. Streaming SQL computes them continuously: each event is processed within milliseconds of arrival, and materialized views always reflect the current state.

Implementation Pattern

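-- Universal streaming pattern
-- The (...) column list and the trailing ... in WITH are placeholders:
-- supply your own columns and Kafka connection settings.
CREATE SOURCE events (...) WITH (connector='kafka', topic='events', ...);

-- Rolling 24-hour aggregates per dimension, kept up to date as events arrive
CREATE MATERIALIZED VIEW metrics AS
SELECT dimension, COUNT(*) AS count, SUM(amount) AS total,
  AVG(amount) AS avg_amount, MAX(ts) AS last_event
FROM events WHERE ts > NOW() - INTERVAL '24 hours'
GROUP BY dimension;

-- Query with any PostgreSQL client
SELECT * FROM metrics ORDER BY count DESC;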
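To show what "maintained continuously" means for the metrics view above, the following Python sketch folds each event into per-dimension running aggregates instead of rescanning all events. It is a toy in-memory model under stated assumptions, not RisingWave's implementation: it ignores the 24-hour window (expiring old events needs extra state), uses plain integers as timestamps, and `MetricsView` and the sample events are hypothetical.

```python
from collections import defaultdict


class MetricsView:
    """Toy stand-in for the `metrics` view; illustrative only."""

    def __init__(self):
        # Per-dimension running state: event count, amount total, latest timestamp.
        self._state = defaultdict(
            lambda: {"count": 0, "total": 0.0, "last_event": None}
        )

    def apply(self, dimension, amount, ts):
        """Fold one event into the running aggregates in constant time."""
        s = self._state[dimension]
        s["count"] += 1
        s["total"] += amount
        if s["last_event"] is None or ts > s["last_event"]:
            s["last_event"] = ts

    def rows(self):
        """Return current rows ordered by count descending, like the final SELECT."""
        out = [
            {
                "dimension": d,
                "count": s["count"],
                "total": s["total"],
                "avg_amount": s["total"] / s["count"],
                "last_event": s["last_event"],
            }
            for d, s in self._state.items()
        ]
        return sorted(out, key=lambda r: r["count"], reverse=True)


view = MetricsView()
# Hypothetical events: (dimension, amount, timestamp)
for event in [("web", 10.0, 1), ("app", 5.0, 2), ("web", 30.0, 3)]:
    view.apply(*event)

result = view.rows()
for row in result:
    print(row)
```

The average is derived from the stored count and total rather than stored itself, so each new event costs one constant-time update regardless of how many events came before it.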

Architecture

Data Sources → RisingWave (SQL Processing) → Materialized Views (Serving)
                                            → Iceberg Sink (Historical)

RisingWave provides both real-time serving (via PostgreSQL protocol) and historical storage (via Iceberg sink). Applications, dashboards, and AI agents query the same materialized views.

Key Benefits

  • Sub-second freshness: Views update with every event
  • SQL-only: No Java, no custom code, no streaming frameworks
  • PostgreSQL compatible: Use psql, Grafana, Metabase, any PG driver
  • Open source: Apache 2.0, self-hostable, no vendor lock-in
  • S3 state: Elastic scaling, fast recovery, cost-efficient storage

Frequently Asked Questions

How fresh is the data?

RisingWave materialized views update within milliseconds of each event. Point queries return in 10-20ms p99. This is real-time — not near-real-time.

Do I need to change my application code?

No. RisingWave speaks the PostgreSQL wire protocol, so any application that queries PostgreSQL can query RisingWave without code changes. Only the connection string needs to change.
