Streaming Data Governance: Schema Registry, Lineage, and Access Control
High availability (HA) in stream processing ensures continuous operation during node failures, network partitions, and planned maintenance. The key patterns are active-standby (failover), active-active (parallel processing), and cross-region (disaster recovery).
HA Patterns
| Pattern | Downtime | Data Loss | Cost | Complexity |
|---|---|---|---|---|
| Active-standby | Seconds (failover) | Checkpoint interval | 2x compute | Low |
| Active-active | Zero | Near-zero | 2x+ compute | High |
| Cross-region | Minutes | Cross-region lag | 2x+ everything | High |
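To make the active-standby row concrete, here is a minimal sketch of heartbeat-based failover. It is an illustration only, not how any particular engine implements it: the `Node` and `ActiveStandbyPair` classes, the 3-second `heartbeat_timeout`, and the 1-second `checkpoint_interval` are all hypothetical. It shows why downtime is roughly the detection timeout and why the work at risk is bounded by the checkpoint interval.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    last_heartbeat: float = 0.0  # time of the most recent heartbeat, in seconds


@dataclass
class ActiveStandbyPair:
    primary: Node
    standby: Node
    heartbeat_timeout: float = 3.0    # hypothetical: silence tolerated before failover
    checkpoint_interval: float = 1.0  # hypothetical: seconds between checkpoints
    last_checkpoint: float = 0.0

    def heartbeat(self, now: float) -> None:
        """Record a heartbeat from the primary and checkpoint if one is due."""
        self.primary.last_heartbeat = now
        if now - self.last_checkpoint >= self.checkpoint_interval:
            self.last_checkpoint = now

    def check(self, now: float) -> bool:
        """Promote the standby if the primary has been silent for too long.

        Returns True when a failover happened.
        """
        if now - self.primary.last_heartbeat <= self.heartbeat_timeout:
            return False
        self.primary, self.standby = self.standby, self.primary
        self.primary.last_heartbeat = now
        return True

    def replay_window(self, failure_time: float) -> float:
        """Seconds of input processed since the last checkpoint at failure time."""
        return failure_time - self.last_checkpoint


pair = ActiveStandbyPair(Node("a"), Node("b"))
for t in [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]:
    pair.heartbeat(t)           # primary "a" stops sending heartbeats after t=2.5

pair.check(4.0)                 # 1.5 s of silence: still within the timeout
pair.check(6.0)                 # 3.5 s of silence: standby "b" is promoted
print(pair.primary.name, pair.replay_window(2.5))
```

Lowering the timeout shortens downtime but raises the chance of a false failover during a brief network stall, which is the usual trade-off when tuning this pattern.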
RisingWave HA
RisingWave's disaggregated architecture provides built-in HA:
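- State on S3: S3 is designed for 11 nines (99.999999999%) of durability, so a node failure does not lose state.
- 1-second checkpoints: State is checkpointed every second, so at most 1 second of data is at risk after a failure.
- Seconds-level recovery: Replacement nodes read state directly from S3 instead of rebuilding it locally, so recovery takes seconds.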
- Elastic scaling: Add/remove compute without state migration.
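The properties above can be sketched with a toy model. This is not RisingWave's implementation: `ObjectStore` is a hypothetical in-memory stand-in for S3, `CountingWorker` is a hypothetical compute node, and the sketch assumes a replayable source so that events after the last checkpoint can be re-read. It shows a replacement node restoring a checkpoint from durable storage and replaying only the uncheckpointed events.

```python
import json
from typing import Dict, List, Optional


class ObjectStore:
    """In-memory stand-in for a durable object store such as S3 (hypothetical)."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def get(self, key: str) -> Optional[bytes]:
        return self._blobs.get(key)


class CountingWorker:
    """Compute node that counts events per key; durable state lives in the store."""

    def __init__(self, store: ObjectStore, checkpoint_every: int = 3) -> None:
        self.store = store
        self.checkpoint_every = checkpoint_every  # events between checkpoints
        self.counts: Dict[str, int] = {}
        self.offset = 0  # number of source events already applied
        self._restore()

    def _restore(self) -> None:
        """Load the latest checkpoint, if any, from the object store."""
        blob = self.store.get("checkpoint")
        if blob is not None:
            snapshot = json.loads(blob)
            self.counts = snapshot["counts"]
            self.offset = snapshot["offset"]

    def process(self, events: List[str]) -> None:
        """Apply events from the saved offset onward, checkpointing periodically."""
        for key in events[self.offset:]:
            self.counts[key] = self.counts.get(key, 0) + 1
            self.offset += 1
            if self.offset % self.checkpoint_every == 0:
                snapshot = {"counts": self.counts, "offset": self.offset}
                self.store.put("checkpoint", json.dumps(snapshot).encode())


store = ObjectStore()
events = ["a", "b", "a", "c", "a"]

worker = CountingWorker(store)
worker.process(events[:4])   # checkpoint taken after event 3; event 4 is not yet durable
del worker                   # simulate a node failure

replacement = CountingWorker(store)  # restores counts and offset 3 from the store
replacement.process(events)          # replays only events 4 and 5
print(replacement.counts)
```

Because the worker holds no state that is not also in the store, adding or replacing a node requires no state migration between nodes; the only work repeated after a failure is whatever arrived since the last checkpoint.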
Frequently Asked Questions
What is the fastest recovery time for streaming systems?
RisingWave and Flink 2.0 (with the ForSt state backend) keep state in disaggregated S3 storage, so both recover in seconds regardless of state size. Kafka Streams with standby replicas also recovers in seconds, but requires 2x the resources.

