Backpressure in Stream Processing: Causes, Detection, and Fixes
Backpressure occurs when a downstream operator in a stream processing pipeline cannot keep up with the rate of incoming events, causing the entire pipeline to slow down. It's a flow control mechanism that prevents data loss but degrades throughput and increases latency.
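A minimal sketch of this flow-control idea (all names here are illustrative, not from any particular framework): a bounded buffer between producer and consumer means a slow consumer makes the producer's `put()` block, slowing the whole pipeline instead of dropping events.

```python
import queue
import threading
import time

def run_pipeline(n_events=20, queue_size=5, consume_delay=0.001):
    """Bounded queue = backpressure: the producer blocks when the
    buffer is full, so no event is ever dropped."""
    buf = queue.Queue(maxsize=queue_size)   # bounded buffer = flow control
    consumed = []

    def consumer():
        while True:
            item = buf.get()
            if item is None:                # sentinel: producer is done
                break
            time.sleep(consume_delay)       # simulate a slow operator
            consumed.append(item)

    t = threading.Thread(target=consumer)
    t.start()
    for i in range(n_events):
        buf.put(i)                          # blocks while the buffer is full
    buf.put(None)
    t.join()
    return consumed

# Every event arrives in order despite the slow consumer:
# run_pipeline() == list(range(20))
```

The trade-off from the paragraph above is visible here: nothing is lost, but the producer now runs only as fast as the consumer drains the buffer.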
How Backpressure Propagates
```
Source (1000 evt/s) → Transform (1000 evt/s) → Aggregate (500 evt/s MAX)
                                                        ↑ BOTTLENECK
Backpressure propagates upstream:
Source slows to 500 evt/s
```
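The diagram boils down to one rule: a chained pipeline under backpressure sustains the throughput of its slowest stage. A one-line sketch (illustrative, not framework code):

```python
def effective_throughput(stage_capacities):
    """Events/sec a chained pipeline sustains under backpressure:
    the minimum of its stages' capacities."""
    return min(stage_capacities)

# Source, Transform, Aggregate from the diagram above:
assert effective_throughput([1000, 1000, 500]) == 500
```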
Common Causes
| Cause | Symptom | Fix |
|---|---|---|
| Slow operator | One stage has high latency | Optimize or parallelize |
| Insufficient resources | CPU/memory maxed | Scale up/out |
| State too large | RocksDB compaction stalls | Use S3 state, increase memory |
| Skewed partitions | One partition overloaded | Better partition key |
| External calls | Blocking I/O in processing | Async I/O, batch calls |
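The "skewed partitions" row is easy to quantify. A hypothetical sketch (CRC32 stands in for whatever hash your system uses to assign keys to partitions): measure how much the busiest partition exceeds an even share of the load.

```python
from collections import Counter
from zlib import crc32

def partition_skew(keys, num_partitions):
    """Ratio of the busiest partition's event count to the ideal even
    share. 1.0 = perfectly balanced; num_partitions = everything on one."""
    counts = Counter(crc32(k.encode()) % num_partitions for k in keys)
    ideal = len(keys) / num_partitions
    return max(counts.values()) / ideal

# One hot key carrying all traffic lands every event on one partition:
# partition_skew(["hot-user"] * 100, 4) == 4.0
```

A skew well above 1.0 means one operator instance does most of the work while its siblings idle, so adding parallelism won't help until the key is fixed.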
Detecting Backpressure
- Flink: Backpressure metrics in Flink UI (percentage per operator)
- Kafka Streams: Consumer lag increasing over time
- RisingWave: Barrier latency increasing, checkpoint duration growing
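The Kafka Streams signal above, consumer lag, is just the log-end offset minus the committed offset per partition. A minimal sketch of the calculation (offsets here are sample values, not from a real cluster):

```python
def consumer_lag(end_offsets, committed_offsets):
    """Per-partition lag. A total that grows steadily over time means
    the consumer cannot keep up with the producers."""
    return {p: end_offsets[p] - committed_offsets.get(p, 0)
            for p in end_offsets}

# Partition 1 is falling behind:
lag = consumer_lag({0: 1000, 1: 1000}, {0: 990, 1: 400})
# lag == {0: 10, 1: 600}
```

In practice you would read these offsets from Kafka itself (for example via the consumer-group tooling) and alert on the trend, not a single snapshot.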
Fixing Backpressure
- Scale the bottleneck — add parallelism to the slow operator
- Optimize the slow operation — reduce computation, cache lookups
- Filter early — drop unneeded events before the bottleneck
- Use async I/O — don't block on external calls
- Upgrade state backend — S3-based state (RisingWave, Flink ForSt) avoids local disk bottlenecks
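The async I/O fix can be sketched with Python's `asyncio` (the `enrich` lookup is a hypothetical stand-in for a real HTTP call or DB query): issue external calls concurrently so a batch costs roughly one round-trip instead of N.

```python
import asyncio

async def enrich(event, delay=0.01):
    """Stand-in for an external lookup (HTTP call, DB query)."""
    await asyncio.sleep(delay)          # simulate external-call latency
    return {**event, "enriched": True}

async def process_batch(events):
    # All lookups in flight at once: batch latency is about one call's
    # latency, not len(events) sequential calls.
    return await asyncio.gather(*(enrich(e) for e in events))

events = [{"id": i} for i in range(100)]
results = asyncio.run(process_batch(events))
# 100 events enriched in roughly the time of a single 10 ms lookup
```

The same shape exists natively in some engines (e.g. Flink's async operator support); the point is that the processing thread never blocks on a single slow external call.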
Frequently Asked Questions
Is backpressure bad?
Backpressure itself is a healthy flow control mechanism — it prevents data loss by slowing the pipeline to match the slowest operator. The underlying cause (bottleneck) needs fixing, not the backpressure mechanism.
How does RisingWave handle backpressure?
RisingWave uses barrier-based flow control. When downstream operators are slow, barriers propagate back to sources, naturally throttling ingestion. S3-based state avoids disk I/O bottlenecks that cause backpressure in RocksDB-based systems.
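An illustrative sketch of the throttling effect described above (this is a toy model, not RisingWave's actual implementation): if the source may run only a fixed number of unacknowledged barriers ahead of downstream, its ingestion budget shrinks to zero whenever downstream stalls.

```python
def ingest_budget(epoch_size, max_unacked_barriers, acked_epochs, emitted):
    """Events the source may still emit before it must wait for
    downstream to acknowledge another barrier."""
    cap = (acked_epochs + max_unacked_barriers) * epoch_size
    return max(0, cap - emitted)

# Downstream stalls (no new acks): the budget hits zero and ingestion
# pauses until a barrier is acknowledged.
assert ingest_budget(epoch_size=100, max_unacked_barriers=2,
                     acked_epochs=0, emitted=200) == 0
# One barrier acknowledged: the source may emit another epoch's worth.
assert ingest_budget(100, 2, acked_epochs=1, emitted=200) == 100
```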

