Stream Processing (also known as Event Stream Processing or ESP) is a data processing paradigm that deals with continuous, unbounded sequences of data, often called "data streams" or "event streams." Unlike traditional batch processing, which collects data into large, static groups (batches) and processes each group only after a certain period, stream processing analyzes and acts upon data as it arrives, typically with very low latency (milliseconds to seconds).
The core idea is to process data "in motion" rather than "at rest."
| Feature | Stream Processing | Batch Processing |
|---|---|---|
| Data Scope | Unbounded, continuous streams | Bounded, finite datasets |
| Data Model | Data in motion | Data at rest |
| Latency | Milliseconds to seconds | Minutes to hours (or longer) |
| Analysis | Real-time, continuous analysis | Retrospective, periodic analysis |
| Result | Continuously updating results | Results computed once per batch |
| Primary Use | Real-time monitoring, alerts, analytics | ETL, historical reporting, complex models |
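
The "Result" row of the comparison can be illustrated with a small sketch. The Python below is only an illustration under simple assumptions: the function names `batch_average` and `streaming_average` and the sample `readings` are hypothetical, and a finite list stands in for what would be an unbounded stream in practice.

```python
from typing import Iterable, Iterator, List


def batch_average(readings: List[float]) -> float:
    """Batch style: the whole bounded dataset is available before computing."""
    return sum(readings) / len(readings)


def streaming_average(readings: Iterable[float]) -> Iterator[float]:
    """Stream style: emit an updated average after every arriving event."""
    count = 0
    total = 0.0
    for value in readings:
        # Only a small running state (count, total) is kept, not the full history.
        count += 1
        total += value
        yield total / count


# Hypothetical sample data; a real stream would never end.
readings = [10.0, 20.0, 30.0, 40.0]

batch_result = batch_average(readings)                     # one result per batch
stream_results = list(streaming_average(iter(readings)))   # one result per event

print(batch_result)
print(stream_results)
```

The batch function produces a single answer once all data has been collected, while the streaming function produces a fresh answer after each event and never needs the full dataset in memory.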
Specialized systems, called stream processing engines (SPEs) or streaming databases, are designed to handle the unique challenges of stream processing.
RisingWave is one example: as a streaming database, it focuses on making stream processing accessible via SQL and on providing efficient incremental computation for low-latency, fresh results.
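
To make the idea of incremental computation more concrete, here is a minimal Python sketch. It is not RisingWave's implementation or API: the class `IncrementalTotalsView`, its methods, and the sample events are all hypothetical, and the state lives in plain in-memory dictionaries. The point it tries to show is that each arriving event updates only the affected part of a stored result, so reads stay cheap and fresh without recomputing over all past data.

```python
from typing import Dict, Tuple


class IncrementalTotalsView:
    """Hypothetical in-memory 'view' holding a per-key event count and sum."""

    def __init__(self) -> None:
        self._count: Dict[str, int] = {}
        self._total: Dict[str, float] = {}

    def apply(self, key: str, amount: float) -> None:
        # Incremental step: touch only the entry for this key,
        # instead of rescanning every event seen so far.
        self._count[key] = self._count.get(key, 0) + 1
        self._total[key] = self._total.get(key, 0.0) + amount

    def query(self, key: str) -> Tuple[int, float]:
        # Reads return the already-maintained result; no recomputation.
        return self._count.get(key, 0), self._total.get(key, 0.0)


# Hypothetical events of the form (user, purchase amount).
view = IncrementalTotalsView()
for user, amount in [("alice", 5.0), ("bob", 7.5), ("alice", 2.5)]:
    view.apply(user, amount)

print(view.query("alice"))
print(view.query("bob"))
print(view.query("carol"))
```

A streaming database applies the same principle to results defined in SQL, with durable state and far more general operators than this sketch covers.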