Windowing in Stream Processing: Tumbling, Sliding, Session, and Hopping Windows
Windowing groups a continuous stream into finite chunks so that aggregations have a defined scope. Without windows, you cannot compute "events per minute" or "average over the last 5 minutes" from an unbounded stream, because the input never ends. The four main window types are tumbling (fixed, non-overlapping), hopping (fixed, overlapping), sliding (fixed-length, re-evaluated continuously as events arrive), and session (activity-based). The major stream processors (Flink, RisingWave, Kafka Streams, and Spark) all support windowing, although the available window types and the syntax differ between engines.
Window Types Compared
| Window | Size | Overlap | Trigger | Best For |
|---|---|---|---|---|
| Tumbling | Fixed | None | End of window | Regular reporting (per-minute counts) |
| Hopping | Fixed | Yes | Each hop | Smoothed metrics (5-min avg, every 1 min) |
| Sliding | Fixed | Continuous | Every event | Moving averages, continuous monitoring |
| Session | Variable | None | Inactivity gap | User sessions, activity grouping |
Tumbling Windows
Fixed-size, non-overlapping. Every event belongs to exactly one window.
-- Count events per 1-minute window
CREATE MATERIALIZED VIEW events_per_minute AS
SELECT
    window_start,
    window_end,
    COUNT(*) AS event_count,
    SUM(amount) AS total_amount
FROM TUMBLE(events, event_time, INTERVAL '1 MINUTE')
GROUP BY window_start, window_end;
Time: |----W1----|----W2----|----W3----|
Events: * * * * * * * * * * *
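To make the "exactly one window" rule concrete, here is a minimal Python sketch of tumbling-window assignment. It assumes integer epoch-second timestamps and windows aligned to multiples of the window size; the function names `tumbling_window` and `count_per_window` are illustrative and not part of any engine's API.

```python
from collections import defaultdict

def tumbling_window(ts, size):
    """Return the [start, end) tumbling window that contains timestamp ts."""
    start = ts - (ts % size)  # align down to a multiple of the window size
    return (start, start + size)

def count_per_window(timestamps, size):
    """Count events per tumbling window (timestamps in epoch seconds)."""
    counts = defaultdict(int)
    for ts in timestamps:
        counts[tumbling_window(ts, size)] += 1
    return dict(counts)

# Six events aggregated into 60-second windows.
events = [5, 42, 59, 60, 61, 130]
result = count_per_window(events, 60)
for window, count in sorted(result.items()):
    print(window, count)
```

Because the window start is derived from the timestamp alone, an event at second 59 and an event at second 60 land in different windows, and no event is ever counted twice.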
Hopping Windows
Fixed-size windows that advance by a hop interval, creating overlap.
-- 5-minute window, advancing every 1 minute
CREATE MATERIALIZED VIEW rolling_avg AS
SELECT
    window_start,
    AVG(value) AS avg_value,
    COUNT(*) AS sample_count
FROM HOP(sensor_data, reading_time, INTERVAL '1 MINUTE', INTERVAL '5 MINUTES')
GROUP BY window_start;
Time:   |------W1------|
            |------W2------|
                |------W3------|
Events: * * * * * * * * *
Each event belongs to multiple windows: window size divided by hop size, so 5 windows in the example above. This makes hopping windows useful for smoothed metrics.
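The overlap can be checked with a short Python sketch that lists every hopping window containing a given event. It assumes integer epoch-second timestamps and window starts aligned to multiples of the hop; `hopping_windows` is a hypothetical helper, not an engine function.

```python
def hopping_windows(ts, size, hop):
    """Return every [start, end) hopping window that contains timestamp ts."""
    # Latest window start that is <= ts, aligned to a multiple of the hop.
    start = ts - (ts % hop)
    windows = []
    # A window covers ts while start <= ts < start + size,
    # i.e. while start > ts - size.
    while start > ts - size:
        windows.append((start, start + size))
        start -= hop
    return sorted(windows)

# 5-minute (300 s) windows advancing every 1 minute (60 s).
windows = hopping_windows(1000, size=300, hop=60)
for w in windows:
    print(w)
```

With a 300-second window and a 60-second hop, the event at second 1000 falls into five windows. Setting `hop` equal to `size` reduces this to a single window, which is the tumbling case.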
Session Windows
Variable-size windows defined by an inactivity gap. A session ends when no events arrive for the specified gap duration.
-- User sessions with 30-minute inactivity timeout
-- (illustrative syntax: session-window support and syntax differ between engines)
CREATE MATERIALIZED VIEW user_sessions AS
SELECT
    user_id,
    window_start AS session_start,
    window_end AS session_end,
    COUNT(*) AS events_in_session,
    window_end - window_start AS session_duration
FROM SESSION(clickstream, event_time, INTERVAL '30 MINUTES')
GROUP BY user_id, window_start, window_end;
Time: |---Session 1---| (30min gap) |--Session 2--|
Events: * * * * * * * *
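The gap rule can be sketched in a few lines of Python for a single user's events. This is an illustration under stated assumptions rather than how any particular engine implements it: timestamps are in minutes, an event exactly `gap` after the previous one is treated as part of the same session, and each session is reported as (first event, last event, event count), whereas engines commonly report the session end as the last event time plus the gap. The name `sessionize` is hypothetical.

```python
def sessionize(timestamps, gap):
    """Group one user's event timestamps into sessions separated by inactivity."""
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] <= gap:
            sessions[-1].append(ts)   # still within the gap: extend the session
        else:
            sessions.append([ts])     # gap exceeded (or first event): new session
    return [(s[0], s[-1], len(s)) for s in sessions]

# Events at minutes 0, 5, 20, then a 50-minute pause, then 70 and 75.
user_sessions = sessionize([0, 5, 20, 70, 75], gap=30)
print(user_sessions)
```

The 50-minute pause between minute 20 and minute 70 exceeds the 30-minute gap, so the events split into two sessions whose lengths depend only on the activity pattern.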
Late Data and Watermarks
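Events can arrive after their window would normally have closed, for example because of network delays or mobile clients syncing after being offline. Watermarks let the system decide when a window is complete and how long to wait for stragglers:
- Watermark: A timestamp threshold that tracks event-time progress. It asserts that no more events older than the threshold are expected, so events with an earlier timestamp are considered "late"
- Allowed lateness: How long after the watermark passes the end of a window the system still accepts late events for that window
- Late event handling: Drop the event, update the window result, or route it to a side output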
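The interaction between these three settings can be illustrated with a small Python simulation. It is a sketch under several assumptions: the watermark is the largest event time seen so far minus a fixed `max_delay`, windows are 60-second tumbling windows, and each event is judged against the watermark in effect when it arrives. Real engines differ in how and when they advance watermarks, and `classify_events` is a hypothetical name.

```python
def classify_events(arrivals, size, max_delay, allowed_lateness):
    """Label events, in arrival order, as on_time, late_accepted, or dropped."""
    max_seen = None
    labels = []
    for ts in arrivals:
        # Watermark in effect when this event arrives.
        watermark = float("-inf") if max_seen is None else max_seen - max_delay
        window_end = ts - (ts % size) + size  # end of the event's tumbling window
        if watermark < window_end:
            labels.append("on_time")          # window has not closed yet
        elif watermark < window_end + allowed_lateness:
            labels.append("late_accepted")    # closed, but still updatable
        else:
            labels.append("dropped")          # past allowed lateness
        max_seen = ts if max_seen is None else max(max_seen, ts)
    return labels

arrivals = [10, 50, 75, 55, 100, 170, 59]
labels = classify_events(arrivals, size=60, max_delay=10, allowed_lateness=30)
for ts, label in zip(arrivals, labels):
    print(ts, label)
```

In this run the event at second 55 arrives when the watermark is 65, just past its window end of 60, so it is late but still accepted. The event at second 59 arrives when the watermark is 160, well beyond the allowed lateness, so it is dropped (or, in an engine that supports it, routed to a side output).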
Choosing the Right Window
- Regular metrics (per-minute, per-hour) → Tumbling
- Smoothed trends (rolling average) → Hopping
- Continuous monitoring (always-current last-N-minutes) → Sliding
- User behavior analysis (session duration, pages per visit) → Session
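For the sliding case, the "always-current last-N-minutes" behavior can be sketched as a moving average that is re-evaluated on every event. The sketch assumes events arrive in timestamp order and that the window covers the interval (ts - size, ts]; the class name `SlidingAverage` is illustrative.

```python
from collections import deque

class SlidingAverage:
    """Moving average over the last `size` seconds, updated on every event."""

    def __init__(self, size):
        self.size = size
        self.events = deque()  # (timestamp, value) pairs currently in the window
        self.total = 0.0

    def add(self, ts, value):
        """Add an event and return the average over (ts - size, ts]."""
        self.events.append((ts, value))
        self.total += value
        # Evict events that have fallen out of the window.
        while self.events[0][0] <= ts - self.size:
            _, old_value = self.events.popleft()
            self.total -= old_value
        return self.total / len(self.events)

avg = SlidingAverage(size=300)
outputs = [avg.add(0, 10), avg.add(100, 20), avg.add(300, 30), avg.add(450, 40)]
print(outputs)
```

Unlike a hopping window, there is no fixed schedule of window starts here: each incoming event produces a fresh result over exactly the preceding 300 seconds.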
Frequently Asked Questions
What is windowing in stream processing?
Windowing divides an infinite data stream into finite chunks for aggregation. Without windows, you cannot compute bounded aggregations (counts, averages, sums) over streaming data, because the input never ends. Windows are defined over time (event time or processing time) and can be fixed-size (tumbling, hopping, sliding) or variable-size (session).
What is the difference between tumbling and hopping windows?
Tumbling windows do not overlap, so each event belongs to exactly one window. Hopping windows overlap, so each event belongs to multiple windows. For example, a 5-minute hopping window with 1-minute hops means each event appears in 5 different windows (window size divided by hop size). Use tumbling for distinct reporting periods; use hopping for smoothed rolling metrics.
How do streaming systems handle late-arriving events?
Streaming systems use watermarks to track event-time progress. Events whose timestamps fall behind the current watermark are considered "late." Systems can be configured to drop late events, update the already-emitted window result, or route late events to a separate output for manual handling. RisingWave and Flink both support configurable watermark strategies.

