Flink vs RisingWave: Which Handles Complex Joins Better?

Joins are the hardest problem in stream processing. In batch systems, a join is a point-in-time snapshot operation. In streaming, you are joining two infinite sequences of events where neither side has a defined end. State accumulates without bound. Late events can arrive after the join window closes. Dimension tables get updated and you have to decide which version of the data to use when matching.

Apache Flink and RisingWave both solve this problem, but they take fundamentally different paths. Flink approaches joins through a DataStream API and a Table API that require explicit watermark declarations, connector configurations, and sometimes Java glue code. RisingWave handles them through PostgreSQL-compatible SQL with streaming semantics built in at the engine level.

This post does a head-to-head comparison across the four join patterns that matter most in production streaming pipelines: stream-stream joins, stream-table joins (both lookup and full state), temporal joins, and interval joins. For each pattern, you will see the syntax in both systems, understand what is happening under the hood, and see real output from verified RisingWave queries running on RisingWave 2.8.0.

What Makes Streaming Joins Hard

Before comparing the two systems, it helps to understand why joins in streaming are fundamentally different from joins in batch databases.

A batch JOIN can read both sides completely, sort or hash one side into memory, and probe against it. The data is bounded and immutable during the operation. In streaming:

  • Both sides are unbounded. New rows arrive at any moment.
  • State must be maintained indefinitely unless you bound it with time.
  • Late events can arrive after a window closes.
  • Dimension data can be updated, creating a versioning problem.
  • Retracted rows (deletes and updates) must propagate through join results.

These constraints produce four distinct join patterns, each suited to a different combination of stream mutability, time semantics, and state budget.

 Pattern            | Left Side | Right Side          | State Cost                      | Best For
--------------------+-----------+---------------------+---------------------------------+--------------------------------------
 Stream-stream join | Dynamic   | Dynamic             | High (both sides)               | Correlating two event streams
 Stream-table join  | Dynamic   | Mutable table       | Medium (left side only)         | Enrichment with updatable dimensions
 Temporal join      | Dynamic   | Point-in-time table | Low (lookup only)               | Enrichment at processing time
 Interval join      | Dynamic   | Dynamic             | Medium (bounded by time window) | Time-bounded stream correlation

Setting Up the Examples

All SQL in this post runs on RisingWave 2.8.0 connected via psql. Table names use the jn_ prefix to avoid conflicts with other schemas.

-- Products dimension table
CREATE TABLE jn_products (
    product_id   INT PRIMARY KEY,
    product_name VARCHAR,
    category     VARCHAR,
    price        DECIMAL
);

-- Orders event stream
CREATE TABLE jn_orders (
    order_id    BIGINT PRIMARY KEY,
    product_id  INT,
    quantity    INT,
    order_time  TIMESTAMPTZ
);

-- Financial trades stream
CREATE TABLE jn_trades (
    trade_id   BIGINT PRIMARY KEY,
    symbol     VARCHAR,
    quantity   INT,
    trade_time TIMESTAMPTZ
);

-- Market quotes stream
CREATE TABLE jn_quotes (
    quote_id   BIGINT PRIMARY KEY,
    symbol     VARCHAR,
    bid_price  DECIMAL,
    ask_price  DECIMAL,
    quote_time TIMESTAMPTZ
);

-- User dimension table
CREATE TABLE jn_users (
    user_id  INT PRIMARY KEY,
    username VARCHAR,
    plan     VARCHAR
);

-- User events stream
CREATE TABLE jn_events_a (
    event_id   BIGINT PRIMARY KEY,
    user_id    INT,
    event_type VARCHAR,
    event_time TIMESTAMPTZ
);

Temporal Joins: Lookup at Processing Time

A temporal join enriches a stream with the current state of a dimension table at the moment each event is processed. It is the most efficient join pattern: only the stream side maintains state (an internal memo-table for non-append-only streams). The dimension table is looked up by key on each event and the result is never retroactively recomputed when the dimension data changes.

If you update a product price from $29.99 to $34.99, orders that were already processed keep their original enrichment. Only orders that arrive after the update see the new price.
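
That behavior can be sketched in a few lines of Python. This is an illustrative model of processing-time lookup semantics only, not RisingWave's implementation: the dimension is a mutable dict, and rows already emitted are never revisited when it changes.

```python
# Toy model of a processing-time temporal join: each event is enriched
# with the dimension value current at the moment it is processed, and
# earlier output rows are never retroactively updated.

def temporal_join(events, dimension):
    """Process an interleaved stream of orders and price updates.

    `events` items are ("order", product_id, qty) or
    ("price_update", product_id, new_price), in processing order.
    `dimension` maps product_id -> price and mutates in place.
    """
    out = []
    for item in events:
        if item[0] == "price_update":
            _, pid, new_price = item
            dimension[pid] = new_price        # dimension changes in place
        else:
            _, pid, qty = item
            # Lookup at processing time; rows already in `out` are untouched.
            out.append((pid, qty, qty * dimension[pid]))
    return out

prices = {1: 29.99}
result = temporal_join(
    [("order", 1, 2),             # enriched at the old price, 29.99
     ("price_update", 1, 34.99),  # only later orders see the new price
     ("order", 1, 1)],
    prices,
)
```

The first order keeps its `29.99`-based enrichment even though the price changed before the second order arrived, mirroring the behavior described above.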

Flink implements temporal joins via the FOR SYSTEM_TIME AS OF syntax, using a watermark-driven event-time lookup against a table with a defined primary key:

-- Flink SQL
CREATE TABLE flink_products (
    product_id   INT,
    product_name STRING,
    category     STRING,
    price        DECIMAL(10, 2),
    update_time  TIMESTAMP(3),
    PRIMARY KEY  (product_id) NOT ENFORCED
) WITH (
    'connector'              = 'jdbc',
    'url'                    = 'jdbc:mysql://db:3306/catalog',
    'table-name'             = 'products',
    'lookup.cache.max-rows'  = '10000',
    'lookup.cache.ttl'       = '10s'
);

CREATE TABLE flink_orders (
    order_id    BIGINT,
    product_id  INT,
    quantity    INT,
    order_time  TIMESTAMP(3),
    WATERMARK FOR order_time AS order_time - INTERVAL '5' SECOND
) WITH (
    'connector'                    = 'kafka',
    'topic'                        = 'orders',
    'properties.bootstrap.servers' = 'broker:9092',
    'format'                       = 'json'
);

-- Temporal join in Flink: the right side uses FOR SYSTEM_TIME AS OF
SELECT
    o.order_id,
    o.product_id,
    o.quantity,
    p.product_name,
    p.category,
    p.price,
    o.quantity * p.price AS total_value
FROM flink_orders AS o
LEFT JOIN flink_products FOR SYSTEM_TIME AS OF o.order_time AS p
    ON o.product_id = p.product_id;

Key requirements in Flink: the dimension table needs a primary key, the stream side needs a watermark, and you reference the stream's time column in the FOR SYSTEM_TIME AS OF clause. In practice, configuring JDBC connectors with the right lookup cache settings is a common source of production issues.

In RisingWave SQL

RisingWave adapts the SQL:2011 FOR SYSTEM_TIME AS OF syntax, taking PROCTIME() as the time argument. You place the FOR SYSTEM_TIME AS OF PROCTIME() modifier directly after the table name in the FROM clause:

CREATE MATERIALIZED VIEW jn_mv_temporal AS
SELECT
    o.order_id,
    o.product_id,
    o.quantity,
    p.product_name,
    p.category,
    p.price,
    o.quantity * p.price AS total_value,
    o.order_time
FROM jn_orders AS o,
     jn_products FOR SYSTEM_TIME AS OF PROCTIME() AS p
WHERE o.product_id = p.product_id;

Note the join condition goes in the WHERE clause when using this form. The right-side table must have a primary key defined. No watermark configuration is required: PROCTIME() means "look up the dimension table at the time this event is processed by the streaming engine."

Verified output:

 order_id |  product_name   |  category   |  price  | quantity | total_value
----------+-----------------+-------------+---------+----------+-------------
     1001 | Laptop Pro 15   | Electronics | 1299.00 |        2 |     2598.00
     1002 | Wireless Mouse  | Electronics |   29.99 |        5 |      149.95
     1003 | Ergonomic Chair | Furniture   |  449.00 |        1 |      449.00
     1004 | Standing Desk   | Furniture   |  899.00 |        1 |      899.00
     1005 | Laptop Pro 15   | Electronics | 1299.00 |        1 |     1299.00
(5 rows)

Under the Hood

In both systems, a temporal join compiles to a lookup operator. When an event arrives from the left (stream) side, the engine looks up the matching row in the right (dimension) side by primary key and immediately emits an enriched output row.

In Flink, the lookup hits a remote JDBC or KV store through a connector layer with optional caching. In RisingWave, the lookup queries Hummock (the internal LSM-tree storage engine) directly from within the same process. There is no network round-trip for the lookup, which reduces both latency and the surface area for connector-related failures.

Syntax Comparison

 Aspect                | Flink SQL                                | RisingWave SQL
-----------------------+------------------------------------------+------------------------------------------
 Time reference        | FOR SYSTEM_TIME AS OF stream.event_time  | FOR SYSTEM_TIME AS OF PROCTIME()
 Placement             | After the table name in JOIN clause      | After the table name in FROM clause
 Watermark required    | Yes, on the stream side                  | No
 Connector config      | Required (JDBC, HBase, etc.)             | No connector needed for internal tables
 Join condition syntax | ON clause                                | WHERE clause

Stream-Stream Joins: Both Sides Are Live

A stream-stream join matches rows from two unbounded streams. Unlike a temporal join, neither side is a stable dimension: both sides receive new events continuously and both can produce retractions (updates and deletes). The engine must maintain full state for both sides because any incoming row on either side could match a future or past row on the other side.

This is the most resource-intensive join pattern. Without time bounding, state grows without limit. Both Flink and RisingWave support stream-stream joins, but they handle state management differently.

Flink's unbounded stream-stream join maintains state in RocksDB by default. Without a time condition, state accumulates forever and the job will eventually run out of disk. Flink recommends using idle.state.retention in TableConfig to set a maximum state retention, but this is a global setting that applies to all operators:

-- Flink SQL: stream-stream inner join
-- Warning: without time bounds, state grows indefinitely
SELECT
    t.trade_id,
    t.symbol,
    t.quantity,
    q.bid_price,
    q.ask_price,
    t.trade_time,
    q.quote_time
FROM flink_trades AS t
JOIN flink_quotes AS q
    ON t.symbol = q.symbol;

For production deployments, Flink's documentation recommends adding a time-bounded condition (which turns this into an interval join) to keep state bounded. Unbounded stream-stream joins in Flink are generally discouraged in long-running jobs unless you have explicit state TTL configuration.

In RisingWave SQL

RisingWave handles stream-stream joins through its materialized view semantics. A JOIN between two tables (or sources) in a materialized view definition becomes a continuously maintained join. State is stored in Hummock on S3-compatible object storage, so it scales independently of compute:

CREATE MATERIALIZED VIEW jn_mv_stream_stream AS
SELECT
    t.trade_id,
    t.symbol,
    t.quantity,
    q.bid_price,
    q.ask_price,
    t.trade_time,
    q.quote_time
FROM jn_trades AS t
JOIN jn_quotes AS q
    ON t.symbol = q.symbol;

Verified output:

 trade_id | symbol | quantity | bid_price | ask_price
----------+--------+----------+-----------+-----------
        1 | AAPL   |      100 |    174.50 |    174.55
        1 | AAPL   |      100 |    174.60 |    174.65
        2 | GOOG   |       50 |   2820.00 |   2820.10
        3 | AAPL   |      200 |    174.50 |    174.55
        3 | AAPL   |      200 |    174.60 |    174.65
        4 | MSFT   |       75 |    415.20 |    415.25
(6 rows)

Note that AAPL trades match both AAPL quotes because the join is on symbol alone. This is the expected behavior for a non-time-bounded stream-stream join: every trade matches every quote with the same symbol.

Under the Hood

Both systems use a hash-join approach for stream-stream joins. When a new row arrives on the left side, the engine probes the right-side state index (keyed by join column). When a new row arrives on the right side, it probes the left-side state index. Matched pairs are emitted downstream.
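
That symmetric-probe mechanic can be sketched in Python. This is an illustrative model, not either engine's actual code: each side keeps an index keyed by the join column, and every arriving row probes the opposite side's index.

```python
# Toy symmetric hash join: state on both sides, probe the other side on
# every arrival, emit newly matched pairs.
from collections import defaultdict

def make_symmetric_join():
    left_state = defaultdict(list)   # join key -> left rows seen so far
    right_state = defaultdict(list)  # join key -> right rows seen so far

    def on_row(side, key, row):
        """Insert `row` on `side` ('L' or 'R'); return newly matched pairs."""
        if side == "L":
            left_state[key].append(row)
            return [(row, r) for r in right_state[key]]
        right_state[key].append(row)
        return [(l, row) for l in left_state[key]]

    return on_row

join = make_symmetric_join()
first = join("L", "AAPL", ("trade", 1))      # no quotes stored yet -> []
join("R", "AAPL", ("quote", 174.50))         # matches trade 1
join("R", "AAPL", ("quote", 174.60))         # also matches trade 1
matches = join("L", "AAPL", ("trade", 3))    # matches both stored quotes
```

Note how trade 3 immediately matches both previously stored AAPL quotes, which is exactly why the verified output above shows two rows for it. Without a time bound, both state maps grow forever.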

In Flink, both state indices live in RocksDB on local TaskManager disks. Checkpoints periodically snapshot the RocksDB state to remote storage (S3 or HDFS). If a TaskManager fails, the job recovers from the last checkpoint and replays from Kafka to rebuild in-flight state.

In RisingWave, state is written incrementally to Hummock from the moment data arrives. There is no local RocksDB to manage. Compute nodes are stateless: they read from and write to Hummock continuously. A compute node failure does not require replaying Kafka; the engine simply resumes from the last consistent checkpoint snapshot in object storage.

Syntax Comparison

 Aspect                  | Flink SQL                                             | RisingWave SQL
-------------------------+-------------------------------------------------------+---------------------------------------------------------------------------
 Syntax                  | Standard JOIN ... ON                                  | Standard JOIN ... ON inside CREATE MATERIALIZED VIEW
 State backend           | RocksDB on local disk (or ForSt on S3 with Flink 2.0) | Hummock (LSM-tree on S3)
 State growth            | Unbounded without TTL config                          | Unbounded without time condition
 State management config | idle.state.retention in TableConfig                   | No extra config, but interval join recommended for time-bounded use cases
 Retractions             | Supported                                             | Supported

Stream-Table Joins (LEFT JOIN): Preserving Unmatched Events

A left join in streaming is semantically the same as in batch: every row from the left side appears in the output, matched or not. Unmatched rows have NULL on the right-side columns. The key difference in streaming is that both the left side and the right side can be updated, so the engine must be prepared to emit retractions when a dimension table row changes.

-- Flink SQL: LEFT JOIN for stream enrichment
-- Note: this is a regular join, not a temporal join
-- Both sides maintain state
SELECT
    e.event_id,
    e.user_id,
    u.username,
    u.plan,
    e.event_type,
    e.event_time
FROM flink_events AS e
LEFT JOIN flink_users AS u
    ON e.user_id = u.user_id;

In Flink, this LEFT JOIN maintains state for both sides. When a user record is inserted or updated in flink_users, the engine recomputes the join for all matching events in the left-side state buffer and emits retractions followed by corrected output. This is correct but expensive because the left-side state must be kept for all events that could be affected by future right-side updates.

In RisingWave SQL

CREATE MATERIALIZED VIEW jn_mv_left_join AS
SELECT
    e.event_id,
    e.user_id,
    u.username,
    u.plan,
    e.event_type,
    e.event_time
FROM jn_events_a AS e
LEFT JOIN jn_users AS u
    ON e.user_id = u.user_id;

Verified output:

 event_id | user_id | username |    plan    | event_type
----------+---------+----------+------------+------------
     1001 |       1 | alice    | pro        | login
     1002 |       2 | bob      | free       | login
     1003 |       1 | alice    | pro        | purchase
     1004 |       3 | carol    | enterprise | export
     1005 |      99 |          |            | login
(5 rows)

Event 1005 has user_id = 99, which does not exist in jn_users. RisingWave correctly preserves the row with NULL username and plan. If user 99 is later inserted into jn_users, RisingWave emits a retraction of the NULL row and replaces it with the enriched version.
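
The retraction mechanic can be modeled in Python. This is a simplified sketch, not RisingWave's implementation, and it assumes the arriving user was previously absent (so the prior output rows were NULL-padded); the username `dave` is a hypothetical example.

```python
# Toy changelog model of a streaming LEFT JOIN: unmatched left rows emit
# NULL-padded output; when the dimension row later arrives, the engine
# retracts (-) the NULL-padded row and emits (+) the enriched replacement.
from collections import defaultdict

def make_left_join():
    left = defaultdict(list)   # user_id -> events seen so far
    right = {}                 # user_id -> username
    log = []                   # changelog of ("+"/"-", output row)

    def on_event(user_id, event):
        left[user_id].append(event)
        log.append(("+", (event, right.get(user_id))))  # None if unmatched

    def on_user(user_id, username):
        # Assumes user_id was previously absent from `right`.
        right[user_id] = username
        for ev in left[user_id]:
            log.append(("-", (ev, None)))       # retract NULL-padded row
            log.append(("+", (ev, username)))   # emit enriched replacement

    return on_event, on_user, log

on_event, on_user, log = make_left_join()
on_event(99, "login")   # user 99 unknown: emits ("login", None)
on_user(99, "dave")     # retraction of the NULL row, then the enriched row
```

Downstream operators consume this changelog, which is how corrections stay consistent through cascading views.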

When to Choose LEFT JOIN vs Temporal JOIN

Use a temporal join (with FOR SYSTEM_TIME AS OF PROCTIME()) when:

  • You only want the current value of the dimension at processing time.
  • You do not need historical recomputation when the dimension updates.
  • You want to minimize state cost.

Use a regular LEFT JOIN when:

  • You need the join result to update when the dimension table changes.
  • You want previously unmatched rows to become matched retroactively.
  • You are building a materialized view that must stay consistent with the current state of both sides.

Interval Joins: Time-Bounded Stream Correlation

An interval join matches rows from two streams where the timestamps on both sides fall within a specified time range. Unlike an unbounded stream-stream join, the engine can safely discard state once the time window has passed, keeping memory footprint predictable.

Interval joins are the right tool for correlating related events that happen close together in time: a trade matched with a quote within two minutes, a click matched with a conversion within one hour, an alarm matched with a diagnostic event within thirty seconds.

Flink implements interval joins using watermark-based event-time bounds. Both sides must declare watermarks:

-- Flink SQL: interval join between trades and quotes
-- Both tables must declare watermarks
CREATE TABLE flink_trades_ts (
    trade_id   BIGINT,
    symbol     STRING,
    quantity   INT,
    trade_time TIMESTAMP(3),
    WATERMARK FOR trade_time AS trade_time - INTERVAL '5' SECOND
) WITH ( /* connector config */ );

CREATE TABLE flink_quotes_ts (
    quote_id   BIGINT,
    symbol     STRING,
    bid_price  DECIMAL(10, 2),
    ask_price  DECIMAL(10, 2),
    quote_time TIMESTAMP(3),
    WATERMARK FOR quote_time AS quote_time - INTERVAL '5' SECOND
) WITH ( /* connector config */ );

-- The interval condition must reference both watermark columns
SELECT
    t.trade_id,
    t.symbol,
    t.quantity,
    q.bid_price,
    q.ask_price,
    t.trade_time,
    q.quote_time
FROM flink_trades_ts AS t, flink_quotes_ts AS q
WHERE t.symbol = q.symbol
  AND q.quote_time BETWEEN t.trade_time - INTERVAL '2' MINUTE
                       AND t.trade_time + INTERVAL '2' MINUTE;

Flink's interval join optimizer recognizes the BETWEEN condition on time columns and automatically bounds state retention on both sides. The lower bound on state is the minimum timestamp still inside any active window. State older than the interval is garbage collected.

In RisingWave SQL

RisingWave uses a BETWEEN condition on timestamp columns, with no watermark configuration required at the DDL level. The streaming engine infers the time bound from the BETWEEN clause and manages state expiry automatically:

CREATE MATERIALIZED VIEW jn_mv_interval AS
SELECT
    t.trade_id,
    t.symbol,
    t.quantity,
    q.bid_price,
    q.ask_price,
    t.trade_time,
    q.quote_time,
    EXTRACT(EPOCH FROM (q.quote_time - t.trade_time))::INT AS seconds_apart
FROM jn_trades AS t
JOIN jn_quotes AS q
    ON t.symbol = q.symbol
    AND q.quote_time BETWEEN t.trade_time - INTERVAL '2 minutes'
                         AND t.trade_time + INTERVAL '2 minutes';

Verified output:

 trade_id | symbol | quantity | bid_price | ask_price | seconds_apart
----------+--------+----------+-----------+-----------+---------------
        1 | AAPL   |      100 |    174.50 |    174.55 |            30
        2 | GOOG   |       50 |   2820.00 |   2820.10 |            30
        3 | AAPL   |      200 |    174.50 |    174.55 |           -90
        3 | AAPL   |      200 |    174.60 |    174.65 |            30
        4 | MSFT   |       75 |    415.20 |    415.25 |            30
(5 rows)

AAPL trade 3 matches both quotes: the first AAPL quote (90 seconds before the trade, within the 2-minute window) and the second (30 seconds after). The seconds_apart column shows the time difference. This is the correct semantic for an interval join: any matching quote within the two-minute window on either side of the trade time is included.

Under the Hood

Both systems use a similar approach for interval joins: each incoming event is stored in state only for the duration of the interval window. Once an event's timestamp is older than the window, it is eligible for cleanup. Flink uses watermark advancement to trigger this cleanup; RisingWave uses its internal barrier mechanism.

The difference is in how state is stored and cleaned up. Flink partitions RocksDB state by join key, with TTL enforced by periodic compaction triggered by watermarks. RisingWave stores state in Hummock and uses a time-to-live index alongside the join key index. When the barrier advances past the window boundary, Hummock deletes the expired state records during background compaction.
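
A simplified Python model of that time-bounded state is below. It is illustrative only: real engines additionally partition state by join key and drive eviction from watermarks (Flink) or barriers (RisingWave), but the bounding logic is the same.

```python
# Toy model of interval-join state bounding: one side's rows are kept
# sorted by timestamp, probed within +/- WINDOW, and evicted once no
# future event could still match them.
import bisect

WINDOW = 120  # seconds, matching the 2-minute interval in the queries

class IntervalSide:
    def __init__(self):
        self.rows = []  # sorted list of (timestamp, row)

    def insert(self, ts, row):
        bisect.insort(self.rows, (ts, row))

    def probe(self, ts):
        # Rows whose timestamp falls within [ts - WINDOW, ts + WINDOW].
        lo = bisect.bisect_left(self.rows, (ts - WINDOW, ""))
        hi = bisect.bisect_right(self.rows, (ts + WINDOW, chr(0x10FFFF)))
        return [row for _, row in self.rows[lo:hi]]

    def expire(self, low_watermark):
        # Drop rows too old to match any event at or after low_watermark.
        cutoff = bisect.bisect_left(self.rows, (low_watermark - WINDOW, ""))
        self.rows = self.rows[cutoff:]

quotes = IntervalSide()
quotes.insert(0, "q1")
quotes.insert(90, "q2")
hits = quotes.probe(60)   # both quotes are within +/-120s of t=60
quotes.expire(300)        # time 300: rows older than 180 are dropped
```

Because `expire` runs as time advances, state size is proportional to the window width and the event rate, not the total stream length.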

Syntax Comparison

 Aspect             | Flink SQL                                   | RisingWave SQL
--------------------+---------------------------------------------+--------------------------------
 Watermark required | Yes, on both sides                          | No
 Time condition     | BETWEEN on watermark columns                | BETWEEN on timestamp columns
 State bound        | Automatic, based on interval and watermarks | Automatic, based on interval
 State backend      | RocksDB with TTL                            | Hummock with time-based expiry

Cascading Joins: Building Multi-Stage Pipelines

One of RisingWave's distinctive capabilities is cascading materialized views: a materialized view that reads from another materialized view. This lets you compose complex multi-stage pipelines entirely in SQL.

Here is a two-stage pipeline. The first stage enriches orders with product data via a temporal join. The second stage reads from the first materialized view and computes per-category revenue aggregates:

-- Stage 1: Temporal join to enrich orders with product details
CREATE MATERIALIZED VIEW jn_mv_enriched_orders AS
SELECT
    o.order_id,
    o.order_time,
    p.product_name,
    p.category,
    p.price AS unit_price,
    o.quantity,
    o.quantity * p.price AS total_revenue
FROM jn_orders AS o,
     jn_products FOR SYSTEM_TIME AS OF PROCTIME() AS p
WHERE o.product_id = p.product_id;

-- Stage 2: Aggregate enriched orders by category
CREATE MATERIALIZED VIEW jn_mv_category_summary AS
SELECT
    category,
    COUNT(*)           AS order_count,
    SUM(quantity)      AS total_units,
    SUM(total_revenue) AS total_revenue
FROM jn_mv_enriched_orders
GROUP BY category;

Verified output from the second-stage view:

  category   | order_count | total_units | total_revenue
-------------+-------------+-------------+---------------
 Electronics |           3 |           8 |       4046.95
 Furniture   |           2 |           2 |       1348.00
(2 rows)

Both views update continuously as new orders arrive in jn_orders or as product prices change in jn_products. You query either view with a regular SELECT and always get fresh, consistent results.

Building an equivalent pipeline in Flink requires a stateful job for stage 1, another job for stage 2, a Kafka topic between them (or a shared state store), and operational tooling to manage both jobs. In RisingWave, the pipeline is two SQL statements. The engine handles all the incremental computation, state management, and consistency between stages internally.
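
The incremental dataflow between the two stages can be sketched in Python. This is a toy model of delta propagation, not RisingWave's engine: each new order is enriched (stage 1) and only that delta is applied to the per-category aggregate (stage 2), with no recomputation from scratch.

```python
# Toy two-stage incremental pipeline: temporal lookup, then delta-applied
# aggregation, mirroring jn_mv_enriched_orders -> jn_mv_category_summary.
from collections import defaultdict

products = {1: ("Laptop Pro 15", "Electronics", 1299.00)}
category_summary = defaultdict(
    lambda: {"orders": 0, "units": 0, "revenue": 0.0}
)

def on_order(order_id, product_id, qty):
    # Stage 1: enrich the order via a dimension lookup.
    name, category, price = products[product_id]
    enriched = (order_id, name, category, qty, qty * price)
    # Stage 2: apply only this order's delta to the aggregate.
    agg = category_summary[category]
    agg["orders"] += 1
    agg["units"] += qty
    agg["revenue"] += enriched[4]
    return enriched

on_order(1001, 1, 2)
on_order(1005, 1, 1)
# category_summary["Electronics"] now reflects both orders incrementally.
```

Each order touches one dimension row and one aggregate row, which is why cascaded materialized views stay cheap to maintain as data arrives.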

Configuration Complexity: A Direct Comparison

Beyond syntax, the operational burden of configuring join pipelines differs significantly between the two systems.

Flink Configuration for a Temporal Join

  1. Define watermarks on the stream source table.
  2. Configure the dimension table connector with the right lookup parameters (lookup.cache.max-rows, lookup.cache.ttl, lookup.max-retries).
  3. Set state.backend in the Flink job configuration.
  4. Configure checkpoint interval and storage location.
  5. Set table.exec.state.ttl to prevent unbounded state growth on non-temporal joins.
  6. Deploy JobManager and TaskManagers with appropriate memory settings.
  7. Monitor RocksDB compaction and checkpoint duration separately.

RisingWave Configuration for the Same Join

  1. Write the CREATE MATERIALIZED VIEW SQL.
  2. Connect with psql or any PostgreSQL client.

RisingWave's storage (Hummock), checkpointing, and state management are handled by the engine. There is no RocksDB to tune, no checkpoint storage to configure separately (it uses the same S3 backend as all state), and no per-connector lookup cache to size.

Feature Comparison Matrix

 Join Type                             | Flink SQL                                  | RisingWave SQL
---------------------------------------+--------------------------------------------+---------------------------------------------------------
 Stream-stream INNER JOIN              | Supported                                  | Supported
 Stream-stream LEFT JOIN               | Supported                                  | Supported
 Stream-stream FULL OUTER JOIN         | Supported                                  | Supported
 Temporal join (PROCTIME)              | Supported                                  | Supported
 Temporal join (ROWTIME event-time)    | Supported                                  | Not supported (processing-time only)
 Interval join                         | Supported                                  | Supported
 Lookup join with external store       | Supported (JDBC, HBase, Redis)             | Via FOR SYSTEM_TIME AS OF PROCTIME() on internal tables
 Cascading joins across MVs            | Not natively (requires Kafka between jobs) | Supported natively
 Multi-way joins in one query          | Supported                                  | Supported
 MATCH_RECOGNIZE after join            | Supported                                  | Not supported
 Custom join operator (DataStream API) | Supported                                  | Not applicable

For a full feature comparison, see the RisingWave vs Flink feature comparison matrix.

When to Choose Each System

Choose RisingWave when:

  • Your team writes SQL and wants joins that look and behave like PostgreSQL joins.
  • You are building streaming enrichment pipelines (temporal joins plus aggregation) and want minimal operational overhead.
  • You want to compose multi-stage pipelines through cascading materialized views without managing inter-stage Kafka topics.
  • State management complexity is a limiting factor: your team spends time tuning RocksDB, sizing checkpoints, or debugging state retention.
  • You want to query join results directly via SQL without a separate serving layer.

Choose Flink when:

  • You need MATCH_RECOGNIZE for complex event processing patterns.
  • Your pipeline requires custom Java operators that encode business logic the SQL layer cannot express.
  • You have existing Flink infrastructure and connectors that cover integrations RisingWave does not yet support.
  • You need event-time temporal joins that look up data at the time the event was produced (not at processing time).

For a broader architectural comparison, see Apache Flink vs RisingWave: A Practical Comparison for 2026.

FAQ

Does RisingWave support event-time temporal joins like Flink?

RisingWave's FOR SYSTEM_TIME AS OF PROCTIME() is a processing-time temporal join. It looks up the dimension table at the moment the event is processed by the streaming engine, not at the event's original production timestamp. Flink supports both processing-time and event-time temporal joins; event-time temporal joins use a versioned table or a changelog stream as the right side and look up the dimension at the event's rowtime. RisingWave does not yet support event-time temporal joins. For most enrichment workloads, processing-time lookups are sufficient and have lower latency and state cost.

How does RisingWave handle late events in interval joins?

RisingWave uses an internal barrier mechanism to advance processing time and expire interval join state. Events that arrive after the interval window has closed are not matched against expired state. Unlike Flink, you do not need to declare a watermark or configure a watermark lag to control late event tolerance. The BETWEEN clause in the join condition defines the window; events outside that window are simply not matched. If late event tolerance is critical, consider using a wider interval window.

Can I join more than two streams in a single RisingWave materialized view?

Yes. RisingWave supports multi-way joins within a single CREATE MATERIALIZED VIEW statement. You can join three or more tables using a combination of join types. For example, you can do an inner join between two event streams and a temporal join with a dimension table, all in one query. Each join adds state and compute cost, so monitor resource utilization as you add join stages. The RisingWave join documentation covers the full syntax and semantics for multi-way joins.

What happens to a stream-stream join result when one side is updated or deleted?

Both Flink and RisingWave support retractions in stream-stream joins. When a row is updated or deleted on one side of a join, the engine emits a retraction (a negation of the previously emitted output) followed by a new output row reflecting the updated data. In RisingWave, retractions propagate automatically through cascading materialized views, so downstream aggregations and views stay consistent. In Flink, retractions propagate through the operator DAG and are handled by downstream operators that support changelog input. For more on how RisingWave handles streaming state and watermarks, see the watermarks documentation.

Summary

Both Apache Flink and RisingWave handle the full range of streaming join patterns: temporal joins, stream-stream joins, left joins, and interval joins. The mechanics under the hood are similar in both systems: hash-based state on both sides for stream-stream joins, key-based lookup for temporal joins, and time-bounded state for interval joins.

The meaningful differences are in the developer experience and the operational model.

Flink requires explicit watermark declarations, connector configurations, state backend settings, and often additional infrastructure (Kafka topics between pipeline stages) to build the same pipelines that RisingWave expresses in a few SQL statements. RisingWave treats joins as first-class SQL operations, stores all state in its built-in object storage layer, and lets you query results directly without a separate serving system.

For teams that want to write streaming join pipelines in SQL without managing the infrastructure that surrounds them, RisingWave's approach is meaningfully simpler. For teams that need the full flexibility of Flink's DataStream API or event-time temporal joins, Flink remains the more capable platform.

Get started with joins in RisingWave today with the Quickstart guide.
