Lambda Architecture maintains two separate pipelines — a batch layer for accuracy and a stream layer for speed — which doubles operational complexity and often produces inconsistent results between layers. RisingWave with Apache Iceberg replaces both layers with a single streaming pipeline: one SQL codebase, one storage layer, sub-minute latency, and batch-quality accuracy through Iceberg's ACID guarantees.
The Problem with Lambda Architecture
Lambda Architecture was proposed by Nathan Marz in 2011 as a solution to the "accuracy vs. latency" trade-off in data pipelines. The idea: run two parallel pipelines from the same source data.
Batch layer — A nightly or hourly Spark/Hadoop job processes all historical data and produces accurate, complete results. Stored in a database or data warehouse.
Speed layer — A streaming job (Storm, Flink, Kafka Streams) processes recent events and produces approximate, low-latency results. Stored in a fast serving layer (Redis, Cassandra).
Serving layer — A query router merges results from both layers, using batch results for historical data and speed results for recent data.
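To make the merge step concrete, here is a minimal Python sketch of what a serving-layer query router does: take batch results up to a cutoff ("high watermark") and speed-layer results after it. The function name and data shapes are illustrative, not any real API.

```python
from datetime import datetime

def merge_serving_results(batch: dict, speed: dict, batch_high_watermark: datetime) -> dict:
    """Merge per-hour metrics from the batch and speed layers.

    Hours at or before the batch high watermark come from the batch layer
    (complete, accurate); newer hours come from the speed layer (fresh,
    possibly approximate). Illustrative only -- not a real serving API.
    """
    merged = {}
    for hour, value in batch.items():
        if hour <= batch_high_watermark:
            merged[hour] = value
    for hour, value in speed.items():
        if hour > batch_high_watermark:
            merged[hour] = value
    return merged

# Example: batch is complete through 02:00; speed covers later hours.
h = lambda n: datetime(2024, 1, 1, n)
batch = {h(1): 100, h(2): 200}
speed = {h(2): 195, h(3): 50}   # note the 02:00 discrepancy between layers
result = merge_serving_results(batch, speed, batch_high_watermark=h(2))
# The router silently picks the batch value for 02:00 -- the speed layer's
# 195 vs. 200 disagreement is exactly the inconsistency described below.
```

Even this toy version shows the failure mode: the two layers disagree on the 02:00 hour, and the router must pick one.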
This architecture solves the latency problem but creates new ones:
Duplicate logic — The same business logic must be implemented twice: once in batch (usually Spark/SQL) and once in streaming (usually Java/Scala). When logic changes, both implementations must be updated in sync.
Inconsistent results — The batch and speed layers often produce slightly different results for the same time period due to implementation differences, reprocessing timing, or data ordering issues.
Operational overhead — Two separate clusters, two separate deployment pipelines, two separate monitoring setups, and a serving layer to maintain.
The Kappa Simplification
Jay Kreps (co-creator of Kafka) proposed the Kappa Architecture in 2014: use only a streaming layer, and reprocess historical data through the same streaming pipeline when needed.
Kappa is simpler than Lambda but has its own limitations: streaming systems weren't designed for efficient large-scale historical reprocessing, and the serving layer still needed separate infrastructure.
RisingWave + Iceberg: A Better Answer
RisingWave with Apache Iceberg addresses the remaining gaps in Kappa Architecture:
Single SQL codebase — Write your business logic once in SQL. RisingWave handles both real-time and historical data through the same materialized view definitions.
Iceberg as the serving layer — Instead of a separate serving database, Iceberg on S3 serves as the unified storage layer. It's queryable by BI tools, supports ACID transactions, and costs a fraction of a managed database.
Reprocessing through replay — When you need to reprocess history (bug fix, logic change), replay Kafka events or re-scan source data. RisingWave rebuilds materialized views from scratch.
No merging required — There's no "merge batch and speed results" step. There is only one result, continuously updated.
Migrating from Lambda to RisingWave + Iceberg
Current Lambda Architecture (Example)
```
Kafka → Flink (speed layer)       → Redis (recent data)
Kafka → Spark batch (batch layer) → Snowflake (historical)
Query API → merges Redis + Snowflake results
```
Target Architecture with RisingWave + Iceberg
```
Kafka → RisingWave (materialized views) → Iceberg on S3
                         │
        ┌────────────────┴────────────────┐
        ▼                                 ▼
  Athena / Trino                 RisingWave direct
 (batch analytics)               (live dashboards)
```
Step 1: Replace the Speed Layer
Create a RisingWave source from the same Kafka topic your Flink job was consuming:
```sql
CREATE SOURCE orders_kafka (
    order_id VARCHAR,
    customer_id BIGINT,
    product_id BIGINT,
    quantity INT,
    unit_price NUMERIC(10,2),
    region VARCHAR,
    status VARCHAR,
    order_time TIMESTAMPTZ
)
WITH (
    connector = 'kafka',
    topic = 'orders',
    properties.bootstrap.server = 'kafka:9092',
    scan.startup.mode = 'earliest'
) FORMAT PLAIN ENCODE JSON;
```
Rebuild your streaming aggregation in SQL (replacing Flink's DataStream API):
```sql
CREATE MATERIALIZED VIEW orders_hourly AS
SELECT
    region,
    window_start,
    window_end,
    COUNT(*) AS order_count,
    SUM(unit_price * quantity) AS gross_revenue,
    COUNT(DISTINCT customer_id) AS unique_customers,
    AVG(unit_price * quantity) AS avg_order_value
FROM TUMBLE(orders_kafka, order_time, INTERVAL '1 HOUR')
WHERE status != 'cancelled'
GROUP BY region, window_start, window_end;
```
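To see what this materialized view computes, here is a rough Python sketch of the same tumbling-window aggregation: each event is assigned to the hourly window containing its `order_time`, cancelled orders are filtered out, and the metrics are aggregated per `(region, window_start)`. The event dictionaries are illustrative stand-ins for Kafka messages.

```python
from datetime import datetime
from collections import defaultdict

def tumble_hourly(events):
    """Bucket events into 1-hour tumbling windows per region and aggregate,
    mirroring the orders_hourly view's logic (illustrative only)."""
    windows = defaultdict(lambda: {"order_count": 0, "gross_revenue": 0.0, "customers": set()})
    for e in events:
        if e["status"] == "cancelled":          # WHERE status != 'cancelled'
            continue
        window_start = e["order_time"].replace(minute=0, second=0, microsecond=0)
        w = windows[(e["region"], window_start)]
        w["order_count"] += 1
        w["gross_revenue"] += e["unit_price"] * e["quantity"]
        w["customers"].add(e["customer_id"])
    return {
        k: {
            "order_count": w["order_count"],
            "gross_revenue": w["gross_revenue"],
            "unique_customers": len(w["customers"]),
            "avg_order_value": w["gross_revenue"] / w["order_count"],
        }
        for k, w in windows.items()
    }

events = [
    {"region": "us", "customer_id": 1, "unit_price": 10.0, "quantity": 2,
     "status": "paid", "order_time": datetime(2024, 1, 1, 9, 15)},
    {"region": "us", "customer_id": 2, "unit_price": 5.0, "quantity": 1,
     "status": "paid", "order_time": datetime(2024, 1, 1, 9, 45)},
    {"region": "us", "customer_id": 1, "unit_price": 99.0, "quantity": 1,
     "status": "cancelled", "order_time": datetime(2024, 1, 1, 9, 50)},
]
result = tumble_hourly(events)
```

The difference in practice: RisingWave maintains this result incrementally as events arrive, rather than recomputing it over all events.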
Step 2: Replace the Batch Layer
Sink to Iceberg instead of running a batch Spark job:
```sql
CREATE SINK orders_hourly_to_iceberg AS
SELECT * FROM orders_hourly
WITH (
    connector = 'iceberg',
    type = 'upsert',
    primary_key = 'region,window_start,window_end',
    catalog.type = 'rest',
    catalog.uri = 'http://iceberg-catalog:8181',
    warehouse.path = 's3://analytics-lake/warehouse',
    s3.region = 'us-east-1',
    database.name = 'commerce',
    table.name = 'orders_hourly'
);
```

Note that an upsert sink needs a `primary_key` so RisingWave knows which Iceberg rows to update; here it is the view's grouping key.
This single sink replaces both the Snowflake batch layer and the Redis speed layer. Iceberg on S3 serves all query engines from the same copy of data.
Step 3: Replace the Serving Layer
Instead of a query API that merges Redis and Snowflake results, point your BI tools directly at Iceberg (via Athena or Trino) or at RisingWave for live results.
For applications that need sub-second query response, RisingWave's PostgreSQL-compatible interface serves materialized view results directly.
Before and After: Complexity Comparison
| Dimension | Lambda Architecture | RisingWave + Iceberg |
| --- | --- | --- |
| Number of codebases | 2 (batch + stream) | 1 (SQL) |
| Infrastructure components | Flink + Spark + Redis + Snowflake | RisingWave + Iceberg on S3 |
| Data freshness | 1–2 hours (batch) / < 1 min (stream) | < 1 minute (unified) |
| Data accuracy | Often inconsistent between layers | Consistent (single source) |
| Reprocessing complexity | Very high (re-run both layers) | Medium (replay through RisingWave) |
| Monthly infra cost (est.) | $10,000–50,000+ | $1,000–5,000 |
| Engineering maintenance | High (two systems) | Low (one SQL system) |
Handling Historical Reprocessing
One of Lambda's original justifications was that streaming systems can't efficiently reprocess history. RisingWave addresses this through:
Kafka retention — Keep Kafka topics retained for 7–30 days. For reprocessing, set scan.startup.mode = 'earliest' on a new source and let RisingWave replay all historical messages.
Iceberg as the historical store — For data older than Kafka retention, Iceberg on S3 becomes the replay source. RisingWave v2.8+ can read from Iceberg as a source, enabling full historical reprocessing without Kafka.
Stateless logic — Design your SQL to be stateless where possible (tumble windows, not running totals) so reprocessing produces identical results.
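The "stateless logic" point above can be demonstrated with a toy example: windowed aggregation over a keyed set of events is order-independent, so replaying the same events, even in a different order, produces identical results. The helper below is hypothetical, not a RisingWave API.

```python
import random

def hourly_counts(events):
    """Aggregate (hour, count) pairs into per-hour totals.
    Order-independent, so a replay yields identical results (toy example)."""
    counts = {}
    for hour, n in events:
        counts[hour] = counts.get(hour, 0) + n
    return counts

events = [(9, 1), (10, 1), (9, 1), (11, 1)]
first = hourly_counts(events)

# Simulate a Kafka replay that delivers the same events in another order.
shuffled = events[:]
random.shuffle(shuffled)
replayed = hourly_counts(shuffled)

assert first == replayed  # reprocessing reproduces the original results
```

Logic that depends on arrival order (e.g., "running total at time of arrival") would not have this property, which is why the section above recommends tumble windows over running totals.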
Migration Strategy
Don't cut over all at once. A pragmatic migration from Lambda:
- Run in parallel — Deploy RisingWave alongside your existing Lambda pipeline. Compare outputs.
- Validate accuracy — For 2–4 weeks, verify RisingWave's results match your batch layer within acceptable tolerance.
- Cut over dashboards — Migrate BI tools and dashboards to read from Iceberg / RisingWave.
- Decommission — Shut down Redis, the Flink job, and the query merger layer. Later, decommission the Spark batch job.
Real-World Cost Impact
A typical Lambda architecture at medium scale involves:
- Flink cluster: 8 nodes × $0.50/hour = $2,880/month
- Spark EMR for batch: 20 nodes × 3h/day × $0.50/hour = $900/month
- Redis cluster: 3 nodes × $0.30/hour = $648/month
- Snowflake compute: ~$3,000/month
- Total: ~$7,428/month
RisingWave + Iceberg on S3:
- RisingWave cluster: 4 nodes × $0.50/hour = $1,440/month
- S3 storage: 10TB × $0.023 = $230/month
- Athena queries: $5/TB × estimated usage = ~$300/month
- Total: ~$1,970/month — 73% reduction
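The arithmetic behind these estimates is easy to check. The sketch below reproduces the figures above (assuming a 720-hour month, as the per-node numbers imply); your own node counts and rates will differ.

```python
HOURS_PER_MONTH = 720  # 30-day month, matching the per-node figures above

lambda_cost = (
    8 * 0.50 * HOURS_PER_MONTH    # Flink cluster: 8 nodes at $0.50/hour
    + 20 * 3 * 0.50 * 30          # Spark EMR: 20 nodes, 3 h/day, 30 days
    + 3 * 0.30 * HOURS_PER_MONTH  # Redis cluster: 3 nodes at $0.30/hour
    + 3000                        # Snowflake compute (estimate)
)

rw_cost = (
    4 * 0.50 * HOURS_PER_MONTH    # RisingWave cluster: 4 nodes at $0.50/hour
    + 10_000 * 0.023              # S3: 10 TB at $0.023/GB-month
    + 300                         # Athena queries (estimate)
)

reduction = 1 - rw_cost / lambda_cost  # ~0.73, i.e. the 73% figure above
```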
FAQ
Q: Can RisingWave really replace Spark for batch accuracy? Yes. RisingWave uses incremental computation — it only processes changed data, not full table scans. For windowed aggregations, results are identical whether computed incrementally or in batch. The key is using idempotent, deterministic SQL logic.
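The incremental-computation claim can be illustrated with a toy accumulator: instead of re-scanning all rows, the engine applies each new event's delta to maintained state, and the maintained result equals what a full batch recomputation would produce. This is a conceptual sketch, not RisingWave internals.

```python
class IncrementalSum:
    """Maintain a running aggregate by applying per-event deltas,
    the way a materialized view engine does conceptually (toy model)."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def apply(self, delta: float):
        # O(1) work per event -- no rescan of previously seen data.
        self.total += delta
        self.count += 1

agg = IncrementalSum()
stream = [10.0, 25.0, 5.0]
for v in stream:
    agg.apply(v)

# The incrementally maintained result equals the batch result:
assert agg.total == sum(stream)
```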
Q: What about exactly-once processing guarantees? RisingWave provides exactly-once semantics end-to-end. Kafka source offsets are committed in lockstep with Iceberg snapshot commits. If a failure occurs, RisingWave resumes from the last committed Kafka offset, re-computing the exact same results.
Q: How do we handle late-arriving events? Use TUMBLE() windows with appropriate late-arrival bounds. RisingWave will update the correct window when late events arrive. For events arriving very late (beyond the window close), use append-only Iceberg sinks and handle late data in the query layer.
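Conceptually, "updating the correct window" means a late event is routed by its own event time, not its arrival time, so it lands in the window it belongs to even after newer windows have opened. A toy sketch (the helper is hypothetical):

```python
from datetime import datetime

def apply_event(windows: dict, event: dict):
    """Route an event into its hourly window by event time and bump the count.
    Late events update the window they belong to, not the newest one."""
    window_start = event["ts"].replace(minute=0, second=0, microsecond=0)
    windows[window_start] = windows.get(window_start, 0) + 1

windows = {}
apply_event(windows, {"ts": datetime(2024, 1, 1, 9, 10)})
apply_event(windows, {"ts": datetime(2024, 1, 1, 10, 5)})
# A late event for the 09:00 window arrives after 10:00 has already started:
apply_event(windows, {"ts": datetime(2024, 1, 1, 9, 55)})
```

The 09:00 window's count is corrected in place, which is why downstream sinks need upsert semantics (or query-time handling, for very late data) as noted above.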
Q: Is this architecture suitable for regulated industries (finance, healthcare)? Yes. Iceberg's ACID guarantees, time travel, and complete audit trail make it suitable for regulated environments. All data changes are tracked in Iceberg snapshots. Combined with S3's encryption and access controls, this architecture meets most compliance requirements.
Q: What if we already have significant investment in our Spark batch pipelines? Migrate incrementally. Keep Spark for complex batch transformations (model training, large historical backfills) while replacing the operational/reporting pipelines with RisingWave. The two systems can share the same Iceberg tables on S3.
Replace Your Lambda Architecture Today
The Lambda Architecture solved a real problem in 2011 with the tools available then. Today, RisingWave and Apache Iceberg make it obsolete — delivering better latency, better accuracy, and dramatically lower complexity.
Start the migration with RisingWave's documentation and connect with engineers who have already made this transition in the RisingWave Slack community.