The Kappa Architecture is a software architecture pattern for data processing that aims to handle both real-time and historical (batch) data analysis using a single stream processing engine and an immutable log as the source of truth. It was proposed as a simpler alternative to the more complex Lambda Architecture, which uses separate batch and speed layers.
The core idea of the Kappa Architecture is to treat all data as a stream and use a stream processing system to perform all computations. If historical data needs to be reprocessed (e.g., due to a bug fix in the processing logic or a new analytical requirement), the stream processing job is simply rerun from the beginning of the immutable log.
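The replay idea can be illustrated with a minimal Python sketch. It is not tied to any particular engine: `ImmutableLog` is a hypothetical in-memory stand-in for a durable log such as Kafka, and `run_job`, `total_v1`, and `total_v2` are names invented for this illustration. The sketch assumes events are `(user, amount)` pairs and that the first version of the processing logic has a bug that is later fixed by rerunning the job from offset 0.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterator, List, Tuple

Event = Tuple[str, int]  # (user, amount); hypothetical event shape


class ImmutableLog:
    """Append-only in-memory stand-in for a durable log (assumption)."""

    def __init__(self) -> None:
        self._events: List[Event] = []

    def append(self, event: Event) -> int:
        self._events.append(event)
        return len(self._events) - 1  # offset of the appended event

    def read_from(self, offset: int = 0) -> Iterator[Event]:
        # Reading never mutates the log, so any job can replay it.
        return iter(self._events[offset:])


def run_job(
    log: ImmutableLog,
    update: Callable[[Dict[str, int], Event], None],
) -> Dict[str, int]:
    """Build a view by streaming every event from the start of the log."""
    state: Dict[str, int] = defaultdict(int)
    for event in log.read_from(0):
        update(state, event)
    return dict(state)


def total_v1(state: Dict[str, int], event: Event) -> None:
    # Buggy logic: counts events instead of summing amounts.
    user, _amount = event
    state[user] += 1


def total_v2(state: Dict[str, int], event: Event) -> None:
    # Fixed logic: sums amounts per user.
    user, amount = event
    state[user] += amount


log = ImmutableLog()
for e in [("alice", 10), ("bob", 5), ("alice", 7)]:
    log.append(e)

view_v1 = run_job(log, total_v1)  # original, buggy view
view_v2 = run_job(log, total_v2)  # same code path, replayed with the fix

print(view_v1)  # {'alice': 2, 'bob': 1}
print(view_v2)  # {'alice': 17, 'bob': 5}
```

The point of the design is that the corrected view comes from the same job runner and the same log. There is no separate batch pipeline to keep in sync, and the log itself is never modified by reprocessing.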
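The Lambda Architecture consists of three layers: a batch layer that periodically recomputes views over the full dataset, a speed layer that processes recent data with low latency, and a serving layer that merges the two results to answer queries.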
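The Kappa Architecture simplifies Lambda by removing the separate batch layer: a single stream processing engine handles both real-time and historical data, so the processing logic is written and maintained only once.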
RisingWave aligns well with the principles of the Kappa Architecture.
By using RisingWave with a durable event log like Kafka or Pulsar, users can implement a Kappa-style architecture to unify their real-time and historical data processing needs with a consistent SQL-based approach.