Materialized Views on Apache Iceberg

Build streaming materialized views on Apache Iceberg data. RisingWave keeps query results over Iceberg tables continuously up to date with sub-second latency, with no batch refreshes and no stale data.

  • Streaming, not batch: Continuously maintained MVs, not periodic batch refreshes like AWS Glue or Cloudera
  • Sub-second query freshness: Materialized view results update incrementally as new Iceberg snapshots arrive
  • SQL, PostgreSQL compatible: Define and query MVs with standard SQL using psql, JDBC, or any PostgreSQL client
  • Join Iceberg + Kafka: Combine historical Iceberg data with real-time Kafka streams in a single materialized view

Overview

What are materialized views on Apache Iceberg and why do they matter?

Materialized views on Iceberg pre-compute and store query results so dashboards and applications get instant responses instead of running expensive queries on raw data. Traditional batch-refresh approaches leave data stale for hours. Streaming materialized views maintain freshness continuously, making Iceberg data actionable in real time.

  • Pre-compute expensive aggregations, joins, and filters on Iceberg data
  • Serve dashboard queries in single-digit milliseconds instead of minutes
  • Eliminate the need for periodic REFRESH commands or scheduled Spark jobs
  • Reduce OLAP engine costs by offloading repetitive queries to materialized views
  • Enable real-time analytics on data that was previously only available in batch

How It Works

How does RisingWave create streaming materialized views from Iceberg data?

RisingWave connects to your Iceberg catalog as a source, and you define materialized views over it using standard SQL. As new Iceberg snapshots arrive, RisingWave processes them incrementally and updates the materialized view results in real time, without full recomputation. Results are queryable via any PostgreSQL client.

Incremental Processing

Only new and changed data from Iceberg snapshots is processed — no full table scans on each update
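
To make the idea of incremental processing concrete, here is a minimal Python sketch of a view that keeps a grouped sum up to date from snapshot deltas alone. It is a toy model, not RisingWave's implementation: the class `IncrementalSumView`, its methods, and the sample rows are all hypothetical names chosen for illustration.

```python
class IncrementalSumView:
    """Toy view: SUM(amount) GROUP BY key, maintained from deltas only.

    Hypothetical illustration; not RisingWave's actual engine or API.
    """

    def __init__(self):
        self.totals = {}          # key -> running sum (the stored view result)
        self.rows_processed = 0   # counts delta rows touched, never a rescan

    def apply_snapshot(self, inserted=(), deleted=()):
        """Apply only the rows that one new snapshot added or removed."""
        for key, amount in inserted:
            self.totals[key] = self.totals.get(key, 0.0) + amount
            self.rows_processed += 1
        for key, amount in deleted:
            self.totals[key] = self.totals.get(key, 0.0) - amount
            self.rows_processed += 1

    def query(self, key):
        """Reading the view is a lookup, not a recomputation."""
        return self.totals.get(key, 0.0)


view = IncrementalSumView()
view.apply_snapshot(inserted=[("eu", 10.0), ("us", 5.0)])
view.apply_snapshot(inserted=[("eu", 2.5)], deleted=[("us", 5.0)])

print(view.query("eu"))       # 12.5
print(view.query("us"))       # 0.0
print(view.rows_processed)    # 4
```

The second snapshot touches two rows and the view does two units of work, regardless of how many rows the table already holds.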

Join Iceberg + Kafka

Combine historical Iceberg data with real-time Kafka streams in a single materialized view
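
One common way to keep a join between a historical table and a live stream current is a symmetric hash join: each side is indexed by the join key, and every arriving row probes the other side. The Python sketch below illustrates that technique under simplifying assumptions (append-only inputs, a single join key). `StreamingJoinView` and the sample rows are hypothetical, and the sketch does not claim to reflect how RisingWave implements joins internally.

```python
from collections import defaultdict


class StreamingJoinView:
    """Toy symmetric hash join between a historical side and an event side.

    Hypothetical illustration; assumes append-only inputs.
    """

    def __init__(self):
        self.historical = defaultdict(list)  # key -> rows from the Iceberg side
        self.events = defaultdict(list)      # key -> rows from the Kafka side
        self.results = []                    # the stored join output

    def on_historical_row(self, key, row):
        """Store the row, then match it against events already seen."""
        self.historical[key].append(row)
        for event in self.events.get(key, ()):
            self.results.append((key, row, event))

    def on_event(self, key, event):
        """Store the event, then match it against historical rows already seen."""
        self.events[key].append(event)
        for row in self.historical.get(key, ()):
            self.results.append((key, row, event))


view = StreamingJoinView()
view.on_historical_row("c1", {"region": "eu"})
view.on_event("c1", {"order": 101})             # matches immediately
view.on_event("c2", {"order": 102})             # no historical row yet
view.on_historical_row("c2", {"region": "us"})  # late row completes the match

for joined in view.results:
    print(joined)
```

Because both sides keep state, a match is produced whichever side arrives first, so the joined result never needs to be rebuilt from scratch.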

PostgreSQL Interface

Query materialized views using psql, JDBC, or any PostgreSQL-compatible tool with sub-second response

Automatic Maintenance

No manual REFRESH commands. Views stay current as new data arrives in Iceberg tables

Comparison

How do streaming MVs on Iceberg compare to batch-refresh approaches?

Batch-refresh materialized views in tools like AWS Glue or Cloudera recompute results on a schedule, leaving data stale between runs. RisingWave streaming materialized views process changes incrementally as they arrive, delivering always-fresh results without the compute waste of full recomputation.

| Factor | AWS Glue MV | Cloudera MV | RisingWave MV |
| Refresh Model | Scheduled batch | Scheduled batch | Continuous streaming |
| Data Freshness | Hours (between refreshes) | Hours (between refreshes) | Seconds (incremental) |
| Compute Cost | Full recompute each refresh | Full recompute each refresh | Incremental (process only changes) |
| Join Kafka + Iceberg | Not supported | Not supported | Native support |
| Query Interface | Athena / Spark SQL | Impala / Hive | PostgreSQL-compatible |
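
The compute-cost difference can be estimated with simple arithmetic. The Python sketch below counts rows scanned under each refresh model. All figures (table size, rows per snapshot, number of refreshes) are made-up inputs for illustration, not measurements of any product, and the model ignores the one-time initial load that an incremental view also needs.

```python
def rows_scanned_full_recompute(initial_rows, rows_per_snapshot, refreshes):
    """Each refresh rescans the whole table as it exists at that point."""
    return sum(initial_rows + rows_per_snapshot * i
               for i in range(1, refreshes + 1))


def rows_scanned_incremental(rows_per_snapshot, refreshes):
    """Each update touches only the rows added since the previous one."""
    return rows_per_snapshot * refreshes


# Hypothetical workload: a 1,000,000-row table, 1,000 new rows per snapshot,
# refreshed 24 times (for example, hourly over one day).
full = rows_scanned_full_recompute(1_000_000, 1_000, 24)
incremental = rows_scanned_incremental(1_000, 24)

print(full)         # 24300000
print(incremental)  # 24000
```

Under these assumed inputs, full recomputation scans roughly a thousand times more rows than incremental maintenance. The ratio grows with table size and shrinks as each snapshot changes a larger share of the table.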

Frequently Asked Questions

Can I create materialized views that join Iceberg and Kafka data?
How fresh are materialized views from Iceberg sources?
Does this replace my OLAP engine?
What's the difference between Iceberg MVs in RisingWave vs Spark?

Ready to build streaming MVs on Iceberg?

Create always-fresh materialized views on your Iceberg data with SQL.
