Configurable Join Cache Eviction in RisingWave Streaming Joins

RisingWave's join_cache_evict_interval_rows setting lets you control how often the in-memory join cache evicts stale entries during streaming join execution. By tuning this parameter per streaming job, you can reduce unnecessary spills to state backend storage, lower memory pressure, and improve throughput for joins with uneven cardinality or tight memory budgets.

What Is Join Cache Eviction in Streaming Joins?

RisingWave executes streaming joins incrementally. As new rows arrive on either side of a join, RisingWave must match them against rows seen on the other side. To avoid re-reading from the persistent state backend on every row, it maintains an in-memory cache of recently accessed join state.

That cache cannot grow forever. Eviction is the process of removing entries from the in-memory cache so they can be written back to the state backend, freeing up memory for new entries. Every eviction round trips through storage, so eviction frequency directly affects latency and throughput.

The Problem with a Fixed Eviction Interval

Before this feature, the eviction interval was a compile-time constant. Every join executor evicted cache entries at the same fixed row interval regardless of whether the join was over a small lookup table or a high-volume event stream.

This one-size-fits-all policy caused two opposite failure modes. For joins over large tables with high cardinality, the cache filled quickly but eviction was infrequent, leading to unexpectedly large memory footprints. For joins over small tables, eviction fired far too often, flushing entries that were still hot and triggering unnecessary state backend reads on the next match.

Neither failure mode is obvious until you profile memory usage or observe degraded throughput under load.

Introducing `join_cache_evict_interval_rows`

RisingWave now exposes join_cache_evict_interval_rows as a configurable parameter at the streaming job level. It controls the number of rows processed between each eviction sweep of the join cache.

A smaller value means more frequent eviction: the cache stays smaller but state backend I/O increases. A larger value means less frequent eviction: the cache grows larger but state backend I/O decreases.

You can set it per session before creating a streaming job, scoping the tuning to the specific job without affecting other running jobs.

How to Configure It

Set the parameter using a standard SET statement before issuing your CREATE MATERIALIZED VIEW or CREATE SINK statement:

-- Increase eviction interval for a high-cardinality join
-- Default is 1024 rows; increase to reduce state backend I/O
SET join_cache_evict_interval_rows = 8192;

CREATE MATERIALIZED VIEW order_enriched AS
SELECT
    o.order_id,
    o.amount,
    u.name,
    u.region
FROM orders o
JOIN users u ON o.user_id = u.user_id;

-- Decrease eviction interval for a memory-constrained environment
SET join_cache_evict_interval_rows = 256;

CREATE MATERIALIZED VIEW session_events AS
SELECT
    e.event_id,
    s.session_start,
    e.event_type
FROM events e
JOIN sessions s ON e.session_id = s.session_id;

The setting takes effect for any streaming executor created within the session. It does not retroactively affect existing materialized views.

When to Increase the Value

Increase join_cache_evict_interval_rows when you observe that a join is generating high state backend read amplification or when throughput is lower than expected despite available memory.

Concretely, consider increasing it when:

The join is over a large, slowly-changing dimension table and most cache entries are reused across many input rows.
You have profiled memory usage and the join cache is well within budget.
Metrics show a high ratio of state backend reads to rows processed, indicating frequent cache misses caused by premature eviction.

A good starting point is doubling the default and observing whether throughput improves without memory usage exceeding your budget.

When to Decrease the Value

Decrease join_cache_evict_interval_rows when a streaming job is consuming more memory than expected or when you are running multiple joins in a single job and need to share memory across them.

Concretely, consider decreasing it when:

The join is over a high-cardinality table where cache entries are rarely reused.
Memory usage for the job is approaching the operator's memory limit.
You are co-locating many streaming jobs on the same nodes and need predictable per-job memory bounds.

Decreasing the value too aggressively will increase state backend read pressure, so always validate with throughput metrics after tuning.

Practical Tuning Workflow

Start with the default value and observe two metrics: join cache memory usage and state backend read rate for the join operator. If memory is high and reads are low, the cache is working efficiently but may be too large, so decrease the interval. If reads are high relative to rows processed, the cache is evicting too aggressively, so increase the interval.

RisingWave's built-in metrics exposed via Prometheus and Grafana include per-operator cache hit and miss counters, which make this workflow straightforward.

FAQ

What is the default value of join_cache_evict_interval_rows? The default is 1024 rows. This value worked well as a fixed constant across many workloads but is not optimal for joins with unusual cardinality or memory constraints.

Does changing this setting affect existing materialized views? No. The parameter is applied at job creation time. Existing materialized views continue using the value that was in effect when they were created. To apply a new value to an existing job, you would need to drop and recreate it.

Is this parameter stored as part of the job definition? Yes. When you create a materialized view or sink with this parameter set, RisingWave stores the effective value with the job. The value persists across restarts and recoveries.

Can I set different values for different joins within the same materialized view? The parameter applies at the session level and affects all joins in the streaming job created within that session. Per-operator overrides are not currently supported.

What happens if I set the value too low? Setting the value very low, such as 1 or 10, will cause the join cache to evict almost immediately after every row. This effectively disables caching and forces every join match to read from the state backend, which will significantly reduce throughput. Always keep the value high enough that typical working set entries survive at least one full pass through the cache.