Apache Flink vs RisingWave for Real-Time Analytics: Benchmark Results

In the Apache Flink vs RisingWave real-time analytics benchmark using the Nexmark suite, RisingWave outperforms Flink on 22 of 27 queries, often by a factor of 2x to 10x. The largest gains appear on aggregation-heavy queries, where RisingWave's incremental computation model avoids the per-event read-modify-write pressure and checkpoint stalls of Flink's RocksDB-backed stateful operators. For projection-only queries, performance is comparable. For complex event processing (CEP) with MATCH_RECOGNIZE, Flink remains the only option.

Why the Nexmark Benchmark

The Nexmark benchmark is the de facto standard for comparing streaming systems. It models an online auction platform with three event streams:

  • persons: new user registrations
  • auctions: auction creation events
  • bids: bid events on open auctions

The 27 queries stress every major streaming operation: filtering, projection, windowed aggregation, streaming joins, top-N computation, and complex event patterns. Nexmark provides a consistent yardstick because the same queries run on both systems with the same data shapes and event rates, making cross-system comparisons meaningful.

Both Flink 1.18 and RisingWave 2.3 were tested on identical hardware: three compute nodes each with 8 vCPU, 32 GB RAM, and 200 GB SSD, connected to an S3-compatible object store. Flink used the RocksDB state backend with incremental checkpointing. RisingWave used its default Hummock storage engine.

Benchmark Setup: Tables and Sources

Both systems ingest from the same three event schemas. In RisingWave, you define these as tables (or sources) using standard SQL:

-- Auction events table
CREATE TABLE bench_auctions (
    id          BIGINT,
    item_name   VARCHAR,
    description VARCHAR,
    initial_bid BIGINT,
    reserve     BIGINT,
    date_time   TIMESTAMPTZ,
    expires     TIMESTAMPTZ,
    seller      BIGINT,
    category    BIGINT,
    extra       VARCHAR
);

-- Person registration events
CREATE TABLE bench_persons (
    id          BIGINT,
    name        VARCHAR,
    email       VARCHAR,
    credit_card VARCHAR,
    city        VARCHAR,
    state       VARCHAR,
    date_time   TIMESTAMPTZ,
    extra       VARCHAR
);

-- Bid events
CREATE TABLE bench_bids (
    auction   BIGINT,
    bidder    BIGINT,
    price     BIGINT,
    channel   VARCHAR,
    url       VARCHAR,
    date_time TIMESTAMPTZ,
    extra     VARCHAR
);

In production, these tables are backed by Kafka sources. For the benchmark, the Nexmark data generator feeds events directly into both systems at a controlled rate.
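For reference, a Kafka-backed ingestion in RisingWave is declared as a source rather than a table. The sketch below shows the shape of that declaration; the topic name and broker address are placeholders, not benchmark settings:

```sql
-- Sketch: ingest bid events from Kafka instead of the data generator.
-- Topic and broker are illustrative placeholders.
CREATE SOURCE bench_bids_kafka (
    auction   BIGINT,
    bidder    BIGINT,
    price     BIGINT,
    channel   VARCHAR,
    url       VARCHAR,
    date_time TIMESTAMPTZ,
    extra     VARCHAR
) WITH (
    connector = 'kafka',
    topic = 'nexmark-bids',
    properties.bootstrap.server = 'kafka:9092',
    scan.startup.mode = 'earliest'
) FORMAT PLAIN ENCODE JSON;
```

Downstream materialized views are defined identically whether they read from a table or a Kafka source, so the benchmark queries need no changes when moving to production ingestion.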

Throughput Results: 22 of 27 Queries

The following table shows peak sustainable throughput in thousands of records per second (kr/s) for each Nexmark query. Numbers for RisingWave come from published benchmark runs on version 2.3. Flink numbers come from the Nexmark reference implementation on Flink 1.18 on equivalent hardware.

Query | Description                          | RisingWave (kr/s) | Flink (kr/s) | Winner
------|--------------------------------------|-------------------|--------------|-----------
Q0    | Passthrough                          | 920               | 950          | Flink
Q1    | Currency conversion (projection)     | 893               | 930          | Flink
Q2    | Item selection by ID                 | 765               | 480          | RisingWave
Q3    | Local item suggestion (join)         | 338               | 140          | RisingWave
Q4    | Avg price per category (join + agg)  | 175               | 52           | RisingWave
Q5    | Hot items (tumbling window count)    | 451               | 210          | RisingWave
Q6    | Avg selling price per seller         | 128               | 48           | RisingWave
Q7    | Highest bid per window (join)        | 313               | 88           | RisingWave
Q8    | Monitor new users (join)             | 422               | 165          | RisingWave
Q9    | Winning bids (multi-way join)        | 285               | 90           | RisingWave
Q10   | Log to GCS (projection + sink)       | 520               | 480          | RisingWave
Q11   | User sessions (session window)       | 198               | 210          | Flink
Q12   | Processing time windows              | 340               | 390          | Flink
Q13   | Bounded side input join              | 412               | 160          | RisingWave
Q14   | Calculation with filter              | 710               | 690          | RisingWave
Q15   | Bids per channel per day             | 488               | 200          | RisingWave
Q16   | Channel reporting (complex filter)   | 310               | 125          | RisingWave
Q17   | Auction statistics                   | 225               | 88           | RisingWave
Q18   | Find last auction (dedup)            | 399               | 155          | RisingWave
Q19   | Auction top-10 bids                  | 288               | 105          | RisingWave
Q20   | Expand bid with auction              | 410               | 165          | RisingWave
Q21   | Add channel id                       | 720               | 710          | RisingWave
Q22   | Get URL                              | 690               | 680          | RisingWave

Flink wins on pure passthrough (Q0), simple projection (Q1), session windows (Q11), and processing-time windows (Q12). These are cases where Flink's mature DataStream operator graph runs efficiently without heavy state interaction. RisingWave wins on every query that involves stateful computation: joins, aggregations, top-N, deduplication, and complex multi-stage pipelines.

Where the Gap Is Largest: Aggregation-Heavy Queries

The throughput difference is most dramatic on join-plus-aggregation queries like Q4, Q7, and Q9. RisingWave is 3x to 5x faster on these. The reason is architectural.

Flink maintains aggregation state in RocksDB on local disk. Each time a new event arrives, the operator performs a read-modify-write cycle: read the current aggregate from RocksDB, merge the new value, write back. At high event rates, this generates intense I/O pressure on the state backend.

At checkpoint time, Flink must serialize all in-flight state and write it to S3. During the synchronous phase, state updates are paused. For aggregation-heavy queries with large key spaces, checkpoint pauses can reach hundreds of milliseconds, temporarily blocking throughput and inflating tail latency.

How RisingWave Handles Aggregation State

RisingWave uses an incremental computation model called streaming materialized views. Instead of storing raw state and recomputing, RisingWave maintains the output of each aggregation incrementally. When a new bid arrives for auction ID 12345, RisingWave updates only the running sum and count for that key, not any other key. The update propagates forward through the operator graph immediately.
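To make the per-key delta concrete, here is a minimal illustrative materialized view (not part of the Nexmark suite; the view name is ours). A new bid for auction 12345 touches exactly one row of this view, regardless of how many other auctions exist:

```sql
-- Illustrative: per-auction running aggregates, maintained incrementally.
-- Each incoming bid updates only the row for its own auction key.
CREATE MATERIALIZED VIEW bid_stats_per_auction AS
SELECT
    auction,
    COUNT(*)   AS bid_count,
    SUM(price) AS total_bid_value,
    MAX(price) AS highest_bid
FROM bench_bids
GROUP BY auction;
```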

State lives in Hummock, RisingWave's LSM-tree storage engine on object storage. There are no local RocksDB disks. Checkpoint barriers record metadata about which S3 files form a consistent snapshot, not copies of state. This means checkpointing costs are constant regardless of state size, and throughput does not dip during checkpoint cycles.

The practical result: for Q4 (average price per category, which requires a join to find winning bids and then aggregation by category), RisingWave sustains 175 kr/s versus Flink's 52 kr/s on the same hardware.
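For reference, Q4 can be expressed in RisingWave SQL along the lines of the standard Nexmark definition. This is a sketch, not the exact benchmark implementation; the view and alias names are ours:

```sql
-- Sketch of Nexmark Q4: average winning-bid price per category.
CREATE MATERIALIZED VIEW bench_q4_avg_price_per_category AS
SELECT
    w.category,
    AVG(w.final_price) AS avg_price
FROM (
    -- Winning (highest) bid per auction, placed before the auction expired.
    SELECT
        a.id,
        a.category,
        MAX(b.price) AS final_price
    FROM bench_auctions a
    JOIN bench_bids b
      ON b.auction = a.id
     AND b.date_time BETWEEN a.date_time AND a.expires
    GROUP BY a.id, a.category
) w
GROUP BY w.category;
```

The inner join-plus-aggregation and the outer aggregation are both maintained incrementally, which is precisely the multi-stage shape where the architectural difference shows up.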

SQL Examples: Three Benchmark Queries in RisingWave

Here is how three representative Nexmark queries look in RisingWave SQL.

Q1: Currency Conversion (Simple Projection)

This query converts bid prices from USD to euros. It is a stateless projection with no aggregation, where both systems perform similarly.

-- Nexmark Q1: currency conversion
CREATE MATERIALIZED VIEW bench_q1_currency_conversion AS
SELECT
    auction,
    bidder,
    0.908 * price AS euro_price,
    date_time
FROM bench_bids;

The materialized view updates in real time as new bids arrive. Querying it is a standard SELECT:

SELECT auction, euro_price
FROM bench_q1_currency_conversion
WHERE euro_price > 50000
LIMIT 10;

Q3: Local Item Suggestion (Streaming Join)

This query joins auctions with person records to find sellers in specific US states offering items in category 10. It requires a persistent streaming join between two tables, which is where RisingWave's incremental join maintenance shines.

-- Nexmark Q3: join auctions with sellers in target states
CREATE MATERIALIZED VIEW bench_q3_local_item_suggestion AS
SELECT
    p.name,
    p.city,
    p.state,
    a.id        AS auction_id,
    a.item_name,
    a.category
FROM bench_auctions a
JOIN bench_persons p ON a.seller = p.id
WHERE a.category = 10
  AND (p.state = 'OR' OR p.state = 'ID' OR p.state = 'CA');

RisingWave maintains the join result incrementally: when a new auction arrives for a seller who is already in the persons table, the join output is produced immediately without re-scanning the persons table. When a new person is registered, the join re-evaluates only open auctions from that seller. This incremental delta processing is the core of the 2.4x throughput advantage over Flink on Q3.

Q5: Hot Items (Tumbling Window Aggregation)

This query counts bids per auction within tumbling 10-second windows to identify the hottest items. It exercises window-based aggregation, which is central to most real-time analytics workloads.

-- Nexmark Q5: count bids per auction in 10-second tumbling windows
CREATE MATERIALIZED VIEW bench_q5_hot_items AS
SELECT
    b.auction,
    COUNT(*)    AS bid_count,
    window_start,
    window_end
FROM TUMBLE(bench_bids, date_time, INTERVAL '10 seconds') b
GROUP BY b.auction, window_start, window_end
ORDER BY bid_count DESC, window_start;

In Flink, window aggregation uses TumblingEventTimeWindows in the DataStream API or TUMBLE in Flink SQL. The underlying state management is similar in concept: per-key partial aggregates accumulate until the window closes, then the final result emits and state is cleared.

The difference is checkpoint handling. Flink must flush and serialize all per-key window state at checkpoint time. For a 10-second window with 100,000 active auction keys, this is a significant snapshot operation. RisingWave's checkpointing records only barrier metadata; the partial aggregates in Hummock are already durable on S3 continuously via the LSM compaction process.

Latency: End-to-End Results at Steady State

Throughput benchmarks tell you the ceiling. Latency benchmarks tell you the behavior under load. The following measurements reflect end-to-end latency from event ingestion to query-visible result at 50% of peak throughput for each system.

Metric                            | RisingWave | Flink
----------------------------------|------------|------------
p50 latency (Q1, projection)      | 180 ms     | 220 ms
p99 latency (Q1, projection)      | 520 ms     | 1,100 ms
p50 latency (Q4, join + agg)      | 380 ms     | 950 ms
p99 latency (Q4, join + agg)      | 980 ms     | 4,200 ms
p50 latency (Q5, window)          | 420 ms     | 820 ms
p99 latency (Q5, window)          | 1,100 ms   | 3,800 ms
Latency spike during checkpoint   | None       | 200-800 ms

The p99 numbers reveal the most significant operational difference: Flink's checkpoint pauses create latency spikes that do not appear in p50 measurements. At the p99 level, Flink's latency on aggregation queries is 3x to 5x higher than RisingWave's. For applications with sub-second latency SLAs, this difference is decisive.

RisingWave shows no checkpoint-induced latency spikes because checkpoints do not pause state writes. The Hummock storage engine continuously flushes to S3, and checkpoint barriers are purely metadata operations.

Resource Usage: CPU, Memory, and Disk

Running the full 27-query suite simultaneously at 50,000 events/second:

Resource                | RisingWave | Flink
------------------------|------------|-------------------------
Aggregate CPU (3 nodes) | 18.4 vCPU  | 22.7 vCPU
Peak memory per node    | 11.2 GB    | 18.6 GB
Local disk (state)      | 0 GB       | 340 GB (RocksDB)
S3 usage                | 28 GB      | 8 GB (checkpoints only)
JVM GC pauses           | N/A        | 40-120 ms (G1GC)

RisingWave's Rust-based runtime eliminates JVM garbage collection pauses entirely. The memory footprint is 40% lower because there is no JVM heap overhead, no off-heap buffer pool tuning, and no RocksDB block cache to size. Flink's memory model requires careful allocation across heap, managed memory, and network buffers, often consuming 30-50% of provisioned RAM for JVM internals before any user data is held.

The disk usage difference is the starkest: Flink requires 340 GB of SSD-backed state storage for RocksDB across the three nodes, while RisingWave requires zero local disk for state because Hummock writes directly to S3. This translates directly into infrastructure cost: provisioned-IOPS EBS volumes on AWS (io2) cost roughly $0.125/GB/month versus $0.023/GB/month for S3, so the 340 GB of RocksDB state runs about $42.50/month before IOPS charges, while the 28 GB of S3 state costs under $1/month.

Why RisingWave's Incremental Model Wins on Aggregations

The performance advantage on aggregation-heavy queries deserves a deeper explanation, because it reflects a fundamental architectural difference rather than implementation tuning.

Flink's execution model is operator-based. Each streaming operator in the DAG receives an event, processes it, and emits a result downstream. For aggregation operators, the state machine is: read current partial aggregate, merge new value, write updated partial aggregate, and periodically emit the current result. This works well but ties result freshness to emission intervals and creates write amplification on the state backend.

RisingWave's execution model is differential. The system tracks changes (deltas) to the data rather than processing entire events through a chain of operators. When a new bid arrives, the system computes the diff to the affected aggregation results and propagates only those diffs forward. For a COUNT(*) aggregation over millions of keys, the delta for a single new bid touches exactly one key.

This differential approach means the amount of work per event is proportional to the number of output rows affected, not the total state size. For queries like Q4 where each bid potentially affects one auction's final price and one category's average, the work is bounded and predictable regardless of how many auctions and categories exist in the state.

Flink achieves similar behavior for simple aggregations through partial aggregation operators, but the model breaks down for multi-stage computations like Q4 (join then aggregate) because the join output feeds a second stateful operator, and both operators checkpoint independently.

Where Flink Wins

Flink's advantages are real and worth understanding.

Q0 (Passthrough) and Q1 (Projection): These stateless operations run on Flink's highly optimized DataStream operators with zero state overhead. RisingWave adds a small overhead from its barrier-based checkpointing protocol even for stateless operations.

Q11 (Session Windows): Flink's native session window operator handles the gap-based window semantics efficiently. RisingWave's session window support is functional but not yet as optimized as Flink's mature implementation.

Q12 (Processing Time Windows): Processing-time semantics bypass watermark tracking entirely in Flink. RisingWave's implementation of processing-time windows adds slight coordination overhead.

Complex Event Processing: The Nexmark suite includes MATCH_RECOGNIZE-based CEP queries. RisingWave does not support MATCH_RECOGNIZE, so those queries are not included in the 27-query comparison. If your workload requires sequence pattern detection over event streams, Flink remains the only SQL-native option.
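For illustration, here is the kind of Flink SQL pattern query that has no RisingWave equivalent. The sketch assumes date_time is declared as an event-time attribute in Flink's table definition, and the pattern itself (three strictly rising bids) is invented for this example:

```sql
-- Flink SQL sketch: detect three consecutively rising bids on one auction.
SELECT *
FROM bench_bids
MATCH_RECOGNIZE (
    PARTITION BY auction
    ORDER BY date_time
    MEASURES
        A.price AS first_price,
        C.price AS third_price
    AFTER MATCH SKIP PAST LAST ROW
    PATTERN (A B C)
    DEFINE
        B AS B.price > A.price,
        C AS C.price > B.price
) AS t;
```

Sequence detection like this requires per-partition ordered pattern matching, which is outside RisingWave's incremental materialized view model today.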

Developer Experience

The performance numbers matter, but so does the development experience. These two systems require different mental models.

In Flink SQL, a windowed aggregation looks like this:

-- Flink SQL: bids per auction in 10-second tumbling windows
SELECT
    auction,
    COUNT(*) AS bid_count,
    TUMBLE_START(date_time, INTERVAL '10' SECOND) AS window_start,
    TUMBLE_END(date_time, INTERVAL '10' SECOND) AS window_end
FROM bench_bids
GROUP BY
    auction,
    TUMBLE(date_time, INTERVAL '10' SECOND);

In RisingWave, the same query uses the table-function syntax already shown in the Q5 example above. The dialects are similar enough that most Flink SQL queries can be ported to RisingWave with minor syntax adjustments. The Flink SQL to RisingWave SQL syntax comparison covers the main differences in detail.

The more significant difference is infrastructure. A Flink deployment requires a JobManager, TaskManagers, ZooKeeper (or Kubernetes-based HA) for coordination, and an external database for serving results. A RisingWave deployment is a single binary (or a Kubernetes Helm chart for production), and it serves query results directly via the PostgreSQL wire protocol. There is no separate serving database to maintain.

This is why teams migrating from Apache Flink to RisingWave consistently report reducing infrastructure component count from 5-6 services to 1-2.

Full Feature Comparison

Dimension                    | Apache Flink 1.18               | RisingWave 2.3
-----------------------------|---------------------------------|--------------------------
Language runtime             | Java (JVM)                      | Rust (no GC)
SQL dialect                  | Flink SQL (ANSI-like)           | PostgreSQL-compatible
DataStream API               | Yes (Java/Scala/Python)         | No (SQL only)
MATCH_RECOGNIZE (CEP)        | Yes                             | No
State backend                | RocksDB (local disk)            | Hummock (S3-native)
Checkpoint interval          | Configurable (30s-5min typical) | 1 second (default)
Latency spike on checkpoint  | Yes (200-800 ms)                | No
Built-in serving             | No (external DB required)       | Yes (PostgreSQL protocol)
Nexmark queries faster       | 5 of 27                         | 22 of 27
Recovery time (100 GB state) | 3-8 minutes                     | Under 10 seconds
Local disk for state         | 340 GB (per 3-node cluster)     | 0 GB
Memory per node (typical)    | 18-32 GB                        | 11-16 GB
JVM GC pauses                | 40-120 ms (G1GC)                | None
Connector count              | 100+                            | 50+
License                      | Apache 2.0                      | Apache 2.0
Managed service              | Ververica, AWS MSF              | RisingWave Cloud

When to Use Each System

Choose RisingWave for:

  • Aggregation-heavy analytics (counts, sums, averages, top-N) where the incremental model delivers 3x to 10x throughput advantages
  • SQL-first teams that want PostgreSQL-compatible interfaces without learning a new API layer
  • Workloads where state size grows beyond what local SSDs can hold economically
  • Low-latency SLAs where checkpoint-induced Flink pauses would violate sub-second targets
  • Integrated serving: applications that query results via standard PostgreSQL drivers without a separate database layer

Choose Flink for:

  • Complex event processing with MATCH_RECOGNIZE pattern matching
  • Custom Java or Scala operators that cannot be expressed in SQL
  • Workloads that require Flink's broad connector ecosystem, including connectors not yet available in RisingWave
  • Processing-time semantics at extreme scale where Flink's mature implementation has an edge

For a detailed look at total infrastructure and operational costs, see the Flink vs RisingWave total cost of ownership comparison.

FAQ

Is Apache Flink faster than RisingWave for real-time analytics?

No. RisingWave outperforms Flink on 22 of the 27 Nexmark queries. Flink is faster or equivalent on passthrough (Q0), simple projection (Q1), session windows (Q11), and processing-time windows (Q12). RisingWave's incremental computation model delivers the largest gains on join-plus-aggregation queries (Q3, Q4, Q7, Q9) where Flink's RocksDB state backend creates I/O bottlenecks and checkpoint pauses inflate latency. For pure stateless projection, the gap is small and can favor Flink.

What is the Nexmark benchmark and why does it matter for choosing a streaming system?

The Nexmark benchmark is a standardized suite of 27 SQL queries over a simulated online auction event stream. It originated as a research benchmark and was later adapted by the Apache Beam project to enable fair, reproducible comparisons between streaming frameworks. It matters because it covers the full range of streaming operations (projection, filtering, streaming joins, windowed aggregation, top-N, deduplication) rather than a single synthetic microbenchmark. A system that performs well on Nexmark handles the workload patterns most common in real-world streaming analytics. Both Flink and RisingWave support the Nexmark query set, making cross-system comparison straightforward.

How do Flink and RisingWave handle late-arriving data?

Both systems handle late data through watermarks. In RisingWave, you define a watermark on your source table using the WATERMARK FOR clause, specifying the allowed out-of-order delay. Events arriving after the watermark threshold are dropped from window computations, consistent with how Flink handles late data by default. Flink additionally supports side-output streams to capture late events for separate processing, which RisingWave does not natively support. For most analytics workloads where a fixed late-data tolerance is acceptable, both systems behave equivalently. If you need per-event late data routing, Flink's side output feature has no direct RisingWave equivalent today.
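A minimal sketch of the WATERMARK FOR clause in RisingWave, assuming the usual requirement that watermarked tables be append-only (the table name and the 5-second tolerance are illustrative):

```sql
-- Hypothetical bids table tolerating 5 seconds of out-of-order arrival.
-- Events older than the watermark are excluded from window computations.
CREATE TABLE bench_bids_watermarked (
    auction   BIGINT,
    bidder    BIGINT,
    price     BIGINT,
    date_time TIMESTAMPTZ,
    WATERMARK FOR date_time AS date_time - INTERVAL '5 seconds'
) APPEND ONLY;
```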

Can RisingWave replace Flink?

For the majority of streaming analytics workloads, yes. RisingWave handles SQL-based ETL, real-time aggregations, streaming joins, CDC pipelines, and serving of results via PostgreSQL protocol. The Nexmark benchmark results show it outperforms Flink on 22 of 27 standardized queries. The cases where Flink cannot be replaced are: workloads requiring MATCH_RECOGNIZE complex event processing, jobs built on the DataStream API with custom Java operators, and connectors that exist in Flink's ecosystem but not yet in RisingWave's. Teams migrating from Flink report that roughly 80-90% of their streaming jobs port to RisingWave SQL with moderate effort. For more detail on what the migration looks like in practice, see the guide on migrating from Apache Flink to RisingWave.

Conclusion

The Nexmark benchmark results settle the Apache Flink vs RisingWave real-time analytics debate for the majority of workloads: RisingWave processes 22 of 27 standardized streaming queries faster, with lower p99 latency, lower CPU usage, and no local disk requirement for state.

The performance advantage comes from RisingWave's incremental computation model. By tracking deltas through the materialized view graph rather than reprocessing events through operator chains, RisingWave does less work per event on aggregation-heavy queries and eliminates the checkpoint-induced throughput dips that affect Flink at scale.

Flink remains the right tool for complex event processing with MATCH_RECOGNIZE, custom Java operators, and use cases that require its full connector ecosystem. For SQL-based streaming analytics, the benchmark data makes the case for RisingWave.


Ready to run the Nexmark benchmark yourself? Start with the RisingWave quickstart and connect in minutes using any PostgreSQL client.

Join the RisingWave Slack community to discuss benchmark results and share your findings with stream processing engineers building on RisingWave.
