A Data Stream, a term often used interchangeably with Event Stream, is an unbounded, potentially infinite sequence of data records (events) ordered in time. Unlike traditional batch datasets, which are finite and processed at rest, data streams represent data that is generated continuously and must be processed 'in motion' as it arrives.
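The contrast between an unbounded stream and a finite batch can be sketched in a few lines of Python. This is a minimal illustration, not any engine's API: the `Event` type, the `event_stream` source, and the `running_sum` operator are all hypothetical names invented for this example. The source yields events forever, and the operator updates its result one event at a time instead of waiting for the input to end.

```python
import itertools
from typing import Iterator, NamedTuple


class Event(NamedTuple):
    """Hypothetical event record: a timestamp, a key, and a payload value."""
    timestamp: float
    key: str
    value: int


def event_stream() -> Iterator[Event]:
    """Hypothetical unbounded source: yields time-ordered events forever."""
    for i in itertools.count():
        yield Event(timestamp=float(i), key=f"sensor-{i % 3}", value=i)


def running_sum(events: Iterator[Event]) -> Iterator[int]:
    """Process each event as it arrives, emitting an updated total every time."""
    total = 0
    for event in events:
        total += event.value
        yield total


# The stream never ends, so the consumer decides how much of it to read.
first_five = list(itertools.islice(running_sum(event_stream()), 5))
print(first_five)  # [0, 1, 3, 6, 10]
```

Calling `list(event_stream())` would never return, because there is no end of input to wait for. That is why stream operators are written to produce results incrementally.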
Streams are the fundamental input and output for Stream Processing systems. Examples of data streams include:
Data streams are often materialized or managed by intermediary systems:
Stream processing engines like RisingWave, Apache Flink, or Spark Streaming are designed to consume, transform, and analyze data streams:
Data streams are the primary input and a potential output for RisingWave:
RisingWave lets users define transformations and computations on input data streams using SQL, often materializing the results for low-latency querying or sinking the output streams for downstream consumption.
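The idea of materializing a result for low-latency querying can be illustrated with a toy Python model. This is only a conceptual sketch under stated assumptions: it is not RisingWave's API or implementation, and the `MaterializedSum` class and `consume` helper are hypothetical names. The point is that the aggregate is updated as each event arrives, so a later query reads a precomputed value instead of rescanning the stream.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


class MaterializedSum:
    """Toy stand-in for a materialized result: a per-key sum kept up to date."""

    def __init__(self) -> None:
        self._totals: Dict[str, int] = defaultdict(int)

    def apply(self, key: str, amount: int) -> None:
        """Fold one incoming event into the stored result."""
        self._totals[key] += amount

    def query(self, key: str) -> int:
        """Read the current result; no replay of past events is needed."""
        return self._totals.get(key, 0)


def consume(events: Iterable[Tuple[str, int]], view: MaterializedSum) -> None:
    """Feed events into the view one at a time, as a stream consumer would."""
    for key, amount in events:
        view.apply(key, amount)


view = MaterializedSum()
consume([("alice", 10), ("bob", 5), ("alice", 7)], view)
print(view.query("alice"))  # 17
print(view.query("carol"))  # 0
```

In a SQL-based system the equivalent logic would be declared as a query over the input stream, and the engine would be responsible for keeping the stored result current.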