Every data engineer has faced this problem: your batch pipeline computes hourly aggregations overnight, but the business needs results in seconds. A customer dispute needs the last five transactions grouped by session. A product team wants page views per minute, updated continuously. An ops team needs rolling error rates across sliding intervals.
Window functions are the answer. In traditional SQL, window functions like ROW_NUMBER(), LAG(), and SUM() OVER(...) let you compute values across related rows without collapsing them into a single output. In streaming SQL, time-based windows (tumbling, hopping, session) let you slice an unbounded data stream into finite chunks for aggregation.
This tutorial walks you through both categories of window functions in streaming SQL, with complete, runnable examples using RisingWave. By the end, you will know when to reach for each window type and how to wire them into materialized views that stay fresh as new data arrives.
What Are Window Functions in Streaming SQL?
Window functions operate on a set of rows related to the current row and return a single value for each input row. Unlike GROUP BY, which collapses rows, window functions preserve the original row count while adding computed columns.
In streaming SQL, there are two distinct families of window functions:
- Analytical window functions: ROW_NUMBER(), RANK(), LAG(), LEAD(), FIRST_VALUE(), LAST_VALUE(), and aggregate functions used with an OVER() clause. These work on partitions and orderings you define, similar to standard SQL.
- Time window functions: TUMBLE(), HOP(), and SESSION. These are specific to stream processing. They slice a continuous data stream into bounded time intervals so you can run aggregations over finite chunks.
The key difference from batch SQL: in a streaming database like RisingWave, these window functions run continuously. When you wrap them in a materialized view, results update incrementally as each new event arrives, without re-scanning the entire dataset.
How Do Tumbling Windows Work?
A tumbling window divides a stream into fixed-size, non-overlapping intervals. Every event belongs to exactly one window. When the window closes, the aggregation result is finalized and a new window begins.
Think of it as cutting a timeline into equal slices: 0:00-0:05, 0:05-0:10, 0:10-0:15, and so on.
When to use tumbling windows
- Counting events per fixed interval (orders per minute, logins per hour)
- Computing periodic summaries where no overlap is needed
- Building dashboards that refresh on regular cadences
Tumbling window syntax in RisingWave
SELECT [columns]
FROM TUMBLE(table_or_source, time_col, window_size [, offset]);
The TUMBLE() function adds two columns to each row: window_start and window_end.
Example: orders per minute
Let's build a real-time order throughput monitor. First, create a source that ingests order events from Kafka:
CREATE SOURCE orders_stream (
order_id BIGINT,
user_id BIGINT,
amount DECIMAL,
created_at TIMESTAMPTZ
) WITH (
connector = 'kafka',
topic = 'orders',
properties.bootstrap.server = 'localhost:9092'
) FORMAT PLAIN ENCODE JSON;
Now create a materialized view that counts orders and sums revenue per one-minute tumbling window:
CREATE MATERIALIZED VIEW orders_per_minute AS
SELECT
window_start,
window_end,
COUNT(*) AS order_count,
SUM(amount) AS total_revenue
FROM TUMBLE(orders_stream, created_at, INTERVAL '1 MINUTE')
GROUP BY window_start, window_end;
Query the results at any time:
SELECT * FROM orders_per_minute ORDER BY window_start DESC LIMIT 5;
Expected output:
window_start | window_end | order_count | total_revenue
-------------------------+-------------------------+-------------+--------------
2026-03-29 14:04:00+00 | 2026-03-29 14:05:00+00 | 42 | 3847.50
2026-03-29 14:03:00+00 | 2026-03-29 14:04:00+00 | 38 | 3210.75
2026-03-29 14:02:00+00 | 2026-03-29 14:03:00+00 | 45 | 4102.00
2026-03-29 14:01:00+00 | 2026-03-29 14:02:00+00 | 33 | 2955.25
2026-03-29 14:00:00+00 | 2026-03-29 14:01:00+00 | 40 | 3680.00
Each row represents a distinct one-minute interval. No event is counted twice. The materialized view updates automatically as new orders arrive, so your dashboard always shows the latest minute-by-minute breakdown.
How Do Hopping Windows Differ from Tumbling Windows?
A hopping window (also called a sliding window) has a fixed size but advances by a smaller step. This means windows overlap, and a single event can appear in multiple windows.
For example, a 5-minute window that hops every 1 minute produces overlapping intervals: 0:00-0:05, 0:01-0:06, 0:02-0:07, and so on.
When to use hopping windows
- Smoothing out spikes with rolling averages (5-minute moving average updated every 30 seconds)
- Detecting trends where you need overlapping context
- Building SLO monitors that need continuous coverage
Hopping window syntax in RisingWave
SELECT [columns]
FROM HOP(table_or_source, time_col, hop_size, window_size [, offset]);
The hop_size controls how frequently a new window starts, and the window_size controls how wide each window is. Each event lands in window_size / hop_size windows: with a 1-minute hop and a 5-minute window, every event appears in five overlapping windows.
Example: rolling 5-minute error rate
Suppose you have an API logs stream and want to track the error rate over 5-minute windows, updated every minute:
CREATE SOURCE api_logs (
request_id VARCHAR,
endpoint VARCHAR,
status_code INT,
response_ms INT,
logged_at TIMESTAMPTZ
) WITH (
connector = 'kafka',
topic = 'api_logs',
properties.bootstrap.server = 'localhost:9092'
) FORMAT PLAIN ENCODE JSON;
CREATE MATERIALIZED VIEW rolling_error_rate AS
SELECT
window_start,
window_end,
COUNT(*) AS total_requests,
COUNT(*) FILTER (WHERE status_code >= 500) AS error_count,
ROUND(
COUNT(*) FILTER (WHERE status_code >= 500)::DECIMAL / COUNT(*) * 100, 2
) AS error_rate_pct
FROM HOP(api_logs, logged_at, INTERVAL '1 MINUTE', INTERVAL '5 MINUTES')
GROUP BY window_start, window_end;
Expected output:
window_start | window_end | total_requests | error_count | error_rate_pct
-------------------------+-------------------------+----------------+-------------+---------------
2026-03-29 14:01:00+00 | 2026-03-29 14:06:00+00 | 1250 | 15 | 1.20
2026-03-29 14:00:00+00 | 2026-03-29 14:05:00+00 | 1180 | 12 | 1.02
2026-03-29 13:59:00+00 | 2026-03-29 14:04:00+00 | 1320 | 18 | 1.36
Notice how windows overlap: the 14:01-14:06 window and the 14:00-14:05 window share four minutes of data. This overlap gives you a smoother, more responsive error rate that catches spikes faster than non-overlapping tumbling windows.
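Because the materialized view refreshes continuously, you can layer a simple alerting query on top of it. The 5% threshold below is an arbitrary example value, not a recommendation:

```sql
-- Surface any recent 5-minute window whose error rate crossed 5%
SELECT window_start, window_end, error_rate_pct
FROM rolling_error_rate
WHERE error_rate_pct > 5.0
ORDER BY window_start DESC;
```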
How Do Session Windows Group Activity?
Session windows are dynamic. Instead of fixed time boundaries, a session window groups events that arrive within a specified gap of each other. If no event arrives within the gap duration, the window closes.
This makes session windows ideal for user activity analysis, where you want to group a burst of clicks or actions into a single "session" without knowing in advance how long each session will last.
Session window syntax in RisingWave
In RisingWave, session windows use the SESSION frame clause in a window function's OVER specification:
window_function() OVER (
PARTITION BY column
ORDER BY time_col
SESSION WITH GAP INTERVAL 'duration'
)
Note: Session windows in RisingWave are currently supported in batch mode and in emit-on-window-close streaming mode.
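Emit-on-window-close streaming mode requires a watermark on the event-time column. The following is a minimal sketch using a hypothetical append-only clicks table; the table name, columns, and the 5-second watermark delay are illustrative assumptions:

```sql
-- Hypothetical append-only table with a watermark, which
-- emit-on-window-close streaming mode requires.
CREATE TABLE clicks (
    user_id BIGINT,
    clicked_at TIMESTAMPTZ,
    WATERMARK FOR clicked_at AS clicked_at - INTERVAL '5 SECONDS'
) APPEND ONLY;

-- Count clicks per session; results are emitted once each
-- session window closes.
CREATE MATERIALIZED VIEW clicks_per_session AS
SELECT
    user_id,
    clicked_at,
    COUNT(*) OVER (
        PARTITION BY user_id
        ORDER BY clicked_at
        SESSION WITH GAP INTERVAL '30 MINUTES'
    ) AS clicks_in_session
FROM clicks
EMIT ON WINDOW CLOSE;
```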
Example: user session analysis
Track user sessions on a website, where a session ends after 30 minutes of inactivity:
CREATE TABLE page_views (
user_id BIGINT,
page_url VARCHAR,
viewed_at TIMESTAMPTZ
);
-- Identify session boundaries in batch mode
SELECT
user_id,
page_url,
viewed_at,
COUNT(*) OVER (
PARTITION BY user_id
ORDER BY viewed_at
SESSION WITH GAP INTERVAL '30 MINUTES'
) AS pages_in_session,
FIRST_VALUE(viewed_at) OVER (
PARTITION BY user_id
ORDER BY viewed_at
SESSION WITH GAP INTERVAL '30 MINUTES'
) AS session_start,
LAST_VALUE(viewed_at) OVER (
PARTITION BY user_id
ORDER BY viewed_at
SESSION WITH GAP INTERVAL '30 MINUTES'
) AS session_end
FROM page_views
ORDER BY user_id, viewed_at;
This query assigns each page view to a session based on activity gaps. If a user is inactive for more than 30 minutes, their next page view starts a new session.
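Since every row in a session carries the same session boundaries, you can collapse the result to one row per session with SELECT DISTINCT. This is a sketch building on the same page_views table:

```sql
-- One row per (user, session): DISTINCT collapses the per-view
-- rows because all rows in a session share the same boundaries.
SELECT DISTINCT
    user_id,
    FIRST_VALUE(viewed_at) OVER (
        PARTITION BY user_id
        ORDER BY viewed_at
        SESSION WITH GAP INTERVAL '30 MINUTES'
    ) AS session_start,
    LAST_VALUE(viewed_at) OVER (
        PARTITION BY user_id
        ORDER BY viewed_at
        SESSION WITH GAP INTERVAL '30 MINUTES'
    ) AS session_end,
    COUNT(*) OVER (
        PARTITION BY user_id
        ORDER BY viewed_at
        SESSION WITH GAP INTERVAL '30 MINUTES'
    ) AS page_count
FROM page_views;
```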
How Can You Use Analytical Window Functions in Streams?
Beyond time-based windows, streaming SQL supports the standard analytical window functions you already know from PostgreSQL. In RisingWave, these work inside materialized views and update incrementally.
ROW_NUMBER for deduplication
A common streaming challenge is deduplicating events. Use ROW_NUMBER() to keep only the latest event per key:
CREATE MATERIALIZED VIEW latest_sensor_reading AS
SELECT sensor_id, temperature, humidity, recorded_at
FROM (
SELECT
sensor_id,
temperature,
humidity,
recorded_at,
ROW_NUMBER() OVER (
PARTITION BY sensor_id
ORDER BY recorded_at DESC
) AS rn
FROM sensor_readings
) AS ranked
WHERE rn = 1;
This materialized view always reflects the most recent reading per sensor, updating automatically as new readings stream in.
LAG for change detection
Detect sudden temperature spikes by comparing each reading to the previous one:
CREATE MATERIALIZED VIEW temperature_spikes AS
SELECT
sensor_id,
temperature,
recorded_at,
LAG(temperature) OVER (
PARTITION BY sensor_id
ORDER BY recorded_at
) AS prev_temperature,
temperature - LAG(temperature) OVER (
PARTITION BY sensor_id
ORDER BY recorded_at
) AS temp_change
FROM sensor_readings;
You can then query for anomalies:
SELECT * FROM temperature_spikes
WHERE ABS(temp_change) > 10
ORDER BY recorded_at DESC;
Running aggregates with SUM OVER
Track cumulative revenue per customer within a day:
CREATE MATERIALIZED VIEW daily_cumulative_revenue AS
SELECT
user_id,
order_id,
amount,
created_at,
SUM(amount) OVER (
PARTITION BY user_id, created_at::DATE
ORDER BY created_at
ROWS UNBOUNDED PRECEDING
) AS cumulative_daily_spend
FROM orders_stream;
All aggregate functions in RisingWave, including SUM(), AVG(), MIN(), MAX(), and COUNT(), can be used as window functions by adding an OVER() clause.
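You can also bound the frame with a ROWS clause to get a moving average instead of a running total. Here is a sketch that reuses the sensor_readings stream from the deduplication example; the five-reading lookback is an arbitrary choice:

```sql
-- Per-sensor moving average over the current reading and the
-- four readings before it.
CREATE MATERIALIZED VIEW sensor_moving_avg AS
SELECT
    sensor_id,
    recorded_at,
    temperature,
    AVG(temperature) OVER (
        PARTITION BY sensor_id
        ORDER BY recorded_at
        ROWS BETWEEN 4 PRECEDING AND CURRENT ROW
    ) AS avg_last_5
FROM sensor_readings;
```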
Which Window Type Should You Choose?
Picking the right window depends on your use case. Here is a quick reference:
| Window Type | Size | Overlap | Best For |
| --- | --- | --- | --- |
| Tumbling | Fixed | None | Periodic reports, counters, billing intervals |
| Hopping | Fixed | Yes | Rolling averages, trend smoothing, SLO tracking |
| Session | Dynamic | None | User sessions, activity bursts, IoT device sessions |
| Analytical (OVER) | Row-based | N/A | Rankings, dedup, change detection, running totals |
Decision rules
- Need non-overlapping counts or sums on a regular cadence? Use tumbling.
- Need a moving average or rolling metric? Use hopping. Set the hop size to your desired update frequency and the window size to your lookback period.
- Need to group events by activity gaps? Use session.
- Need row-level comparisons (previous value, rank)? Use analytical window functions with OVER().
You can also combine them. For example, use a tumbling window to compute per-minute aggregates, then apply LAG() over those aggregates to detect minute-over-minute changes:
CREATE MATERIALIZED VIEW order_trend AS
SELECT
window_start,
order_count,
LAG(order_count) OVER (ORDER BY window_start) AS prev_minute_count,
order_count - LAG(order_count) OVER (ORDER BY window_start) AS count_change
FROM orders_per_minute;
FAQ
What is the difference between a tumbling window and a hopping window?
A tumbling window has a fixed size and does not overlap, so each event belongs to exactly one window. A hopping window also has a fixed size but advances by a smaller step, creating overlapping windows where a single event can appear in multiple windows. Use tumbling for distinct interval counts, and hopping for smoothed rolling metrics.
Can I use standard SQL window functions like ROW_NUMBER in streaming?
Yes. RisingWave supports standard analytical window functions including ROW_NUMBER(), RANK(), DENSE_RANK(), LAG(), LEAD(), FIRST_VALUE(), and LAST_VALUE(). You can use them inside materialized views, and the results update incrementally as new data arrives. All aggregate functions also work as window functions when combined with an OVER() clause.
How does RisingWave handle late-arriving events in time windows?
RisingWave uses watermarks to handle event-time processing and late arrivals. You define a watermark on your source to specify how much lateness to tolerate. Events that arrive within the watermark threshold are included in the correct window; events that arrive after the watermark has advanced past their window are dropped. This prevents indefinitely holding window state while still accommodating reasonable delays.
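As a sketch, adding a watermark to the orders_stream source from earlier would tolerate up to five seconds of lateness; the delay value is an illustrative assumption you should tune to your data:

```sql
-- Same source as before, now with a watermark: events more than
-- 5 seconds behind the maximum observed created_at are treated as late.
CREATE SOURCE orders_stream (
    order_id BIGINT,
    user_id BIGINT,
    amount DECIMAL,
    created_at TIMESTAMPTZ,
    WATERMARK FOR created_at AS created_at - INTERVAL '5 SECONDS'
) WITH (
    connector = 'kafka',
    topic = 'orders',
    properties.bootstrap.server = 'localhost:9092'
) FORMAT PLAIN ENCODE JSON;
```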
When should I use session windows instead of tumbling windows?
Use session windows when your data has natural bursts of activity separated by idle periods, and you want each burst grouped together regardless of its duration. Common examples include user browsing sessions, IoT device communication bursts, and call center interactions. Tumbling windows are better when you need uniform, predictable time intervals for reporting or billing.
Conclusion
Window functions are the backbone of real-time analytics in streaming SQL. Here are the key takeaways:
- Tumbling windows give you clean, non-overlapping intervals for periodic aggregation.
- Hopping windows provide smoothed, overlapping metrics for trend detection and monitoring.
- Session windows dynamically group bursts of activity separated by idle gaps.
- Analytical window functions (ROW_NUMBER, LAG, LEAD, running aggregates) work on row-level partitions and enable deduplication, change detection, and rankings.
- In RisingWave, wrapping any of these in a materialized view makes the computation continuous and incremental, so results are always fresh without re-running batch queries.
Ready to try this yourself? Get started with RisingWave in five minutes with the Quickstart guide.
Join our Slack community to ask questions and connect with other stream processing developers.