How to Monitor Kafka Consumer Lag in Real Time
Kafka consumer lag is the difference between a partition's latest (log-end) offset and the consumer group's current committed offset; it indicates how many messages behind the consumer is. Lag that keeps growing means the consumer can't keep up with the producers. Streaming SQL can monitor lag alongside application metrics for a unified view of pipeline health.
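The definition above can be sketched in a few lines of Python. This is a minimal illustration rather than a monitoring tool: the function names are hypothetical, the offsets are plain integers you would obtain from your Kafka client, and treating a partition with no committed offset as starting at 0 is a simplifying assumption.

```python
from typing import Dict, List


def partition_lag(end_offset: int, current_offset: int) -> int:
    """Messages the consumer still has to read on one partition."""
    # Clamp at zero: the two offsets are usually sampled at slightly
    # different times, so the raw difference can briefly go negative.
    return max(end_offset - current_offset, 0)


def total_lag(end_offsets: Dict[int, int], current_offsets: Dict[int, int]) -> int:
    """Sum of per-partition lag for one consumer group and topic."""
    # Assumption: a partition without a committed offset is read from 0.
    return sum(
        partition_lag(end, current_offsets.get(partition, 0))
        for partition, end in end_offsets.items()
    )


def is_growing(samples: List[int]) -> bool:
    """True if every lag sample is larger than the one before it."""
    return len(samples) >= 2 and all(b > a for a, b in zip(samples, samples[1:]))


end_offsets = {0: 1500, 1: 1200}
current_offsets = {0: 1400, 1: 1200}
print(total_lag(end_offsets, current_offsets))  # 100
print(is_growing([100, 250, 900]))              # True
```

A single high reading is not necessarily a problem; a run of strictly increasing samples is the signal that the consumer is falling further behind.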
Consumer Lag Monitoring
-- Ingest consumer lag metrics from a Kafka topic
CREATE SOURCE consumer_metrics (
    consumer_group VARCHAR, topic VARCHAR, partition INT,
    current_offset BIGINT, end_offset BIGINT, lag BIGINT, ts TIMESTAMP
) WITH (connector='kafka', topic='consumer-lag-metrics', ...);
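-- Real-time lag dashboard
-- Assuming the topic carries repeated snapshots per partition, keep only the
-- latest row for each partition; otherwise SUM(lag) would add up old snapshots.
CREATE MATERIALIZED VIEW lag_dashboard AS
SELECT consumer_group, topic,
    SUM(lag) AS total_lag,
    MAX(lag) AS max_partition_lag,
    COUNT(*) FILTER (WHERE lag > 10000) AS partitions_behind,
    MAX(ts) AS last_update
FROM (
    SELECT *, ROW_NUMBER() OVER (
        PARTITION BY consumer_group, topic, partition ORDER BY ts DESC) AS rn
    FROM consumer_metrics
) latest
WHERE rn = 1
GROUP BY consumer_group, topic;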
-- Lag alerts
CREATE MATERIALIZED VIEW lag_alerts AS
SELECT * FROM lag_dashboard
WHERE total_lag > 100000 OR max_partition_lag > 50000;
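For the consumer_metrics source to receive data, something has to publish records with matching fields to the consumer-lag-metrics topic. The sketch below builds one such record in Python. It assumes JSON encoding and a "YYYY-MM-DD HH:MM:SS" timestamp string, neither of which the source definition specifies, and the function names are hypothetical. Sending the bytes is left to whichever Kafka client you use.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def lag_record(consumer_group: str, topic: str, partition: int,
               current_offset: int, end_offset: int,
               ts: Optional[datetime] = None) -> dict:
    """Build one metric record with the same fields as consumer_metrics."""
    ts = ts or datetime.now(timezone.utc)
    return {
        "consumer_group": consumer_group,
        "topic": topic,
        "partition": partition,
        "current_offset": current_offset,
        "end_offset": end_offset,
        # Lag is derived here so every record is internally consistent.
        "lag": max(end_offset - current_offset, 0),
        # Assumed timestamp format; adjust to what your source expects.
        "ts": ts.strftime("%Y-%m-%d %H:%M:%S"),
    }


def encode(record: dict) -> bytes:
    """Serialize a record as UTF-8 JSON (assumed message encoding)."""
    return json.dumps(record).encode("utf-8")


record = lag_record("orders-service", "orders", 0, 900, 1000,
                    datetime(2024, 1, 1, tzinfo=timezone.utc))
print(record["lag"])  # 100
print(record["ts"])   # 2024-01-01 00:00:00
```

Emitting one record per partition on a fixed interval gives the dashboard view a steady stream of snapshots to aggregate.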
Tools for Consumer Lag Monitoring
| Tool | Type | Real-Time | Streaming SQL |
|------|------|-----------|---------------|
| Kafka CLI | Command line | Snapshot | ❌ |
| Burrow | Dedicated monitor | ✅ | ❌ |
| Confluent Control Center | GUI | ✅ | ❌ |
| RisingWave | Streaming DB | ✅ | ✅ |
| Grafana + Prometheus | Dashboard | ✅ | ❌ |
Frequently Asked Questions
What causes Kafka consumer lag?
Common causes include slow processing (for example, complex transformations), too few consumer instances, backpressure from downstream systems, consumer crashes, and sudden traffic spikes.
What is an acceptable consumer lag?
It depends on your SLA. As rough guidelines: fewer than 1,000 messages for real-time analytics and fewer than 10,000 for near-real-time workloads. For batch-like consumers, lag is expected and acceptable.
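These guidelines can be encoded as a simple check. The sketch below is illustrative only: the tier names and the `lag_within_sla` function are hypothetical, and the limits are the rule-of-thumb figures above, not values prescribed by Kafka.

```python
from typing import Dict, Optional

# Upper bounds on total lag (in messages) per workload tier.
# None means no limit: lag is expected for batch-like consumers.
LAG_LIMITS: Dict[str, Optional[int]] = {
    "real-time": 1_000,
    "near-real-time": 10_000,
    "batch": None,
}


def lag_within_sla(total_lag: int, tier: str) -> bool:
    """Return True if total_lag is acceptable for the given tier."""
    if tier not in LAG_LIMITS:
        raise ValueError(f"unknown tier: {tier!r}")
    limit = LAG_LIMITS[tier]
    return limit is None or total_lag < limit


print(lag_within_sla(800, "real-time"))         # True
print(lag_within_sla(5_000, "real-time"))       # False
print(lag_within_sla(5_000, "near-real-time"))  # True
```

Keeping the limits in one table makes it easy to replace them with thresholds derived from your own SLA.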

