Exactly-Once Semantics is the strongest processing guarantee in distributed systems, particularly in stream processing. It ensures that each input event (or message) is processed and its effects are reflected in the system's state and output exactly one time, even in the presence of failures (like node crashes, network issues, or restarts).
This guarantee is crucial for applications where duplicate processing or data loss would lead to incorrect results or inconsistent state, such as financial transactions, critical alerting, or maintaining accurate counts and aggregations.
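Achieving true end-to-end exactly-once semantics (from source, through the processor, to the sink) is complex because a failure can occur at any of these stages: while reading from the source, while updating state in the processor, or while writing results to the sink.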
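Exactly-once processing within the stream processor itself is typically achieved through Checkpointing or transaction mechanisms. After a failure, the processor restores its last consistent snapshot and reprocesses input from the source position recorded with that snapshot, so each event is reflected in the state once.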
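The recovery idea behind checkpointing can be illustrated with a minimal, single-process sketch. It is not RisingWave's implementation: the `CountingProcessor` class, its methods, and the in-memory `source` list are hypothetical names chosen for illustration. The point it shows is that state and source offset are saved together, so a replay after a crash does not double-count.

```python
import copy


class CountingProcessor:
    """Toy processor (hypothetical) that keeps a running sum per key."""

    def __init__(self):
        self.state = {}              # key -> running sum
        self.offset = 0              # index of the next event to read
        self._checkpoint = (0, {})   # (offset, state), always saved together

    def process(self, source, upto):
        # Consume events from the current offset up to (not including) `upto`.
        while self.offset < upto:
            key, value = source[self.offset]
            self.state[key] = self.state.get(key, 0) + value
            self.offset += 1

    def checkpoint(self):
        # Snapshot the state and the source position as one unit.
        self._checkpoint = (self.offset, copy.deepcopy(self.state))

    def recover(self):
        # Simulated crash recovery: drop uncheckpointed work and roll back.
        self.offset, saved_state = self._checkpoint
        self.state = copy.deepcopy(saved_state)


source = [("a", 1), ("b", 2), ("a", 3), ("b", 4), ("a", 5)]

p = CountingProcessor()
p.process(source, upto=2)
p.checkpoint()                        # checkpoint covers events 0 and 1
p.process(source, upto=4)             # events 2 and 3 processed, not checkpointed
p.recover()                           # crash: effects of events 2 and 3 are discarded
p.process(source, upto=len(source))   # replay from the saved offset

result = p.state
print(result)
```

Because the offset is rolled back together with the state, events 2 and 3 are read twice but counted once, and the final sums match a run with no failure. A real system additionally needs a replayable source and a durable, distributed snapshot, which this sketch leaves out.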
RisingWave provides exactly-once semantics for its internal state management through its distributed, consistent Checkpointing mechanism built on the Hummock state store.
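This internal guarantee does not automatically extend to external systems: after a recovery, a sink may receive the same output more than once unless it supports idempotent or transactional writes. Therefore, while RisingWave guarantees exactly-once for its internal state, users must select and configure appropriate sink connectors to achieve end-to-end exactly-once semantics for their specific pipeline.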
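Why the choice of sink matters can be sketched with two toy sinks that receive a replayed batch. This is an illustration under stated assumptions, not a RisingWave connector API: `AppendSink`, `UpsertSink`, and `deliver_with_replay` are hypothetical names, and the replay simply resends the tail of the batch to mimic at-least-once delivery after a failure.

```python
class AppendSink:
    """Non-idempotent sink (hypothetical): every write adds a new row."""

    def __init__(self):
        self.rows = []

    def write(self, event_id, payload):
        self.rows.append((event_id, payload))


class UpsertSink:
    """Idempotent sink (hypothetical): writes are keyed by event id,
    so a replayed write overwrites the earlier copy."""

    def __init__(self):
        self.rows = {}

    def write(self, event_id, payload):
        self.rows[event_id] = payload


def deliver_with_replay(sink, events, replay_from):
    # First attempt delivers the whole batch; a simulated failure then
    # causes events from index `replay_from` onward to be sent again.
    for event_id, payload in events:
        sink.write(event_id, payload)
    for event_id, payload in events[replay_from:]:
        sink.write(event_id, payload)


events = [(1, "debit 10"), (2, "credit 5"), (3, "debit 7")]

append_sink = AppendSink()
upsert_sink = UpsertSink()
deliver_with_replay(append_sink, events, replay_from=1)
deliver_with_replay(upsert_sink, events, replay_from=1)

print(len(append_sink.rows), len(upsert_sink.rows))
```

The append-only sink ends up with duplicate rows for the replayed events, while the keyed sink holds each event once. Transactional sinks reach the same outcome differently, by committing output atomically with a checkpoint, which is not modeled here.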