Debezium stores its progress in offsets — a record of exactly where in the database's change log the connector last successfully read. When offsets are lost, corrupted, or misapplied, your pipeline either replays data it already processed (duplicates) or skips data it should have captured (gaps). This is the most common class of Debezium production incident.
What Offsets Actually Store
Offsets are not Kafka consumer offsets. They are Debezium's internal bookmark into the database change stream. The content varies by connector:
PostgreSQL connector offset:
{
"ts_usec": 1711929600000000,
"lsn": 25565696,
"txId": 491,
"last_snapshot_record": false,
"snapshot": false
}
MySQL connector offset:
{
"ts_sec": 1711929600,
"file": "mysql-bin.000123",
"pos": 4567890,
"gtids": "3E11FA47-71CA-11E1-9E33-C80AA9429562:1-23",
"snapshot": false
}
The LSN (Log Sequence Number) for PostgreSQL and the binlog file + position for MySQL tell Debezium exactly where to resume reading when it restarts. This is separate from the Kafka topic offset, which tracks consumer group progress within a Kafka topic.
Where Offsets Are Stored
Kafka Connect mode (default): Offsets are stored in a dedicated Kafka topic, typically named connect-offsets. This topic is compacted so that only the latest value per key is retained.
# In the Kafka Connect worker config
offset.storage.topic=connect-offsets
offset.storage.replication.factor=3
offset.storage.partitions=25
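To inspect what is currently committed, you can read the topic with keys printed. A quick check, assuming the default topic name and a local broker:
kafka-console-consumer.sh --bootstrap-server localhost:9092 \
  --topic connect-offsets --from-beginning \
  --property print.key=true
Each record's key names the connector (for example ["orders-connector",{"server":"dbserver1"}]) and the value is the JSON offset shown above.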
Embedded Engine mode: Offsets are stored in a local file on disk.
The offset store is configured through the engine properties:
offset.storage=org.apache.kafka.connect.storage.FileOffsetBackingStore
offset.storage.file.filename=/var/lib/debezium/offsets.dat
The engine is then built with those properties:
import io.debezium.engine.ChangeEvent;
import io.debezium.engine.DebeziumEngine;
import io.debezium.engine.format.Json;

DebeziumEngine<ChangeEvent<String, String>> engine = DebeziumEngine
    .create(Json.class)
    .using(props)  // props contains the offset.storage settings above
    .notifying(record -> process(record))
    .build();
The file-based store is fine for development. In production, losing this file means losing your position entirely.
What Happens When Offsets Are Lost
If no committed offset is found when the connector starts, Debezium has no position to resume from. What it does next is controlled by the snapshot.mode configuration:
| snapshot.mode | Behavior when offsets are missing |
| --- | --- |
| initial (default) | Takes a full snapshot, then streams from the current WAL position |
| always | Takes a full snapshot on every connector start |
| never | Skips the snapshot and streams from the current WAL position; may miss data |
| initial_only | Takes the snapshot, then stops without streaming changes |
| exported | Snapshots from the point at which the replication slot was created (PostgreSQL; deprecated in newer versions) |
| custom | Delegates to a user-supplied snapshotter implementation |
The most common consequence of lost offsets with snapshot.mode=initial: your Kafka topic receives all historical rows again, causing duplicates in downstream systems that are not idempotent.
Common Failure Scenarios
Scenario 1: The connect-offsets topic is deleted, or is misconfigured so offset records are removed (for example, cleanup.policy set to delete instead of compact).
The connector restarts, finds no committed offset, and starts a new snapshot. Downstream consumers receive duplicate rows for every historical record.
Scenario 2: Connector recreated with a different name field.
Debezium uses the connector name as part of the offset key. If you recreate a connector with a new name (e.g., orders-connector-v2 instead of orders-connector), it has no prior offset and will snapshot from scratch.
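You can see this directly in the stored keys. A quick check, assuming the default topic name and a topic.prefix of dbserver1:
kafka-console-consumer.sh --bootstrap-server localhost:9092 \
  --topic connect-offsets --from-beginning \
  --property print.key=true --property print.value=false
# ["orders-connector",{"server":"dbserver1"}]   <- the key embeds the connector name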
Scenario 3: PostgreSQL replication slot is dropped while the connector is down.
If the connector is offline long enough that the slot is manually dropped (or the database is restored from backup), the LSN stored in the offset is now invalid. The connector fails to resume:
ERROR Caught exception while fetching sequence of changes.
org.postgresql.util.PSQLException: ERROR: replication slot "debezium" does not exist
Recovery requires dropping and recreating the connector, which triggers a new snapshot.
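Before recreating anything, it is worth confirming whether the slot is actually gone. A quick check, assuming the default slot name debezium:
-- Does the slot still exist, and is anything connected to it?
SELECT slot_name, active, restart_lsn FROM pg_replication_slots WHERE slot_name = 'debezium';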
Scenario 4: Binlog rotation causes MySQL offset to become invalid.
If expire_logs_days is set aggressively and the connector was offline, the binlog file referenced in the stored offset may have been rotated away. The connector fails with:
FATAL Connector requires binlog file 'mysql-bin.000123', but it is no longer available.
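You can check what the server still has on disk and how long it keeps binlogs; binlog_expire_logs_seconds is the MySQL 8.0 replacement for expire_logs_days:
-- Binlog files still available on the server
SHOW BINARY LOGS;
-- Retention settings (seconds in MySQL 8.0, days in older versions)
SHOW VARIABLES LIKE 'binlog_expire_logs_seconds';
SHOW VARIABLES LIKE 'expire_logs_days';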
Resetting a Connector to a Specific Point in Time
You cannot directly set a Debezium connector to replay from an arbitrary timestamp without manipulating the offset store. However, you can achieve point-in-time replay with these steps:
For PostgreSQL (using LSN):
# PostgreSQL has no built-in timestamp-to-LSN lookup, so start from the
# positions the server and the Debezium slot already know about
psql -c "SELECT pg_current_wal_lsn();"
psql -c "SELECT slot_name, restart_lsn, confirmed_flush_lsn FROM pg_replication_slots;"
# To map a timestamp to an LSN, inspect commit records in the WAL with pg_waldump
# and pick the LSN of the last commit before your target time
Then manually update the offset in the connect-offsets topic using a Kafka producer, setting lsn to the target value. This is advanced and risky — test in staging first.
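A minimal sketch with the console producer, assuming the connector is named orders-connector, topic.prefix is dbserver1, and the target LSN is the example value from earlier. Stop the connector first, and make sure the key matches the stored record byte-for-byte:
kafka-console-producer.sh --bootstrap-server localhost:9092 --topic connect-offsets \
  --property parse.key=true --property 'key.separator=|'
# Then enter the record on a single line:
["orders-connector",{"server":"dbserver1"}]|{"lsn":25565696,"txId":491,"ts_usec":1711929600000000}
Restart the connector afterwards so it reads the new position.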
The safer approach: delete the offset and use snapshot.mode=initial to replay from the beginning, then fast-forward by reprocessing downstream.
# Delete connector (removes task state but not offset)
curl -X DELETE http://localhost:8083/connectors/orders-connector
# Manually delete the offset entry from connect-offsets topic
# This requires producing a tombstone record with the connector's partition key
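# A sketch of the tombstone with kcat (assumed installed), assuming topic.prefix is "dbserver1";
# the key must match the stored record byte-for-byte, so copy it from a consumer first
echo '["orders-connector",{"server":"dbserver1"}]|' | \
  kcat -P -b localhost:9092 -t connect-offsets -K '|' -Z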
# Recreate with snapshot.mode=initial
curl -X POST http://localhost:8083/connectors \
-H "Content-Type: application/json" \
-d '{
"name": "orders-connector",
"config": {
"connector.class": "io.debezium.connector.postgresql.PostgresConnector",
"snapshot.mode": "initial",
...
}
}'
Protecting Your Offsets in Production
Replicate the offset topic. Use replication.factor=3 for connect-offsets. A single-replica offset topic is a single point of failure.
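A quick way to verify the topic's actual replication factor, assuming a local broker:
kafka-topics.sh --bootstrap-server localhost:9092 --describe --topic connect-offsets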
Never delete the replication slot while the connector is active. For PostgreSQL:
-- Check existing replication slots and their lag
SELECT slot_name, active, pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS lag
FROM pg_replication_slots;
Monitor offset lag, not just connector status. The connector can show RUNNING while the stored offset is hours behind the live WAL position.
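One way to watch that gap on the database side is to compare the slot's confirmed position against the live WAL head. A sketch, assuming the slot is named debezium:
-- How far the live WAL head is ahead of what Debezium has confirmed processing
SELECT slot_name,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn)) AS confirmed_lag
FROM pg_replication_slots
WHERE slot_name = 'debezium';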
How RisingWave Manages CDC Position
RisingWave uses the Debezium Embedded Engine internally, but WAL position management is integrated into RisingWave's own checkpointing system — not a separate offset store.
When RisingWave takes a checkpoint, it atomically persists:
- The current WAL position (LSN for PostgreSQL, binlog position for MySQL)
- The materialized view state consistent with that position
On restart, RisingWave reads its checkpoint, restores the WAL position, and resumes CDC from exactly where it stopped. There is no Kafka offset topic to lose, no replication slot to manage separately.
-- Check RisingWave CDC source status
SELECT * FROM rw_sources WHERE name = 'orders_source';
-- Check materialized view freshness
SELECT * FROM rw_materialized_views WHERE name = 'orders_summary';
The tradeoff: RisingWave's WAL position is coupled to its checkpoint interval (default 10 seconds). Recovery always replays from the last checkpoint, not from an arbitrary LSN.
FAQ
Can I copy the offset store from one Kafka cluster to another?
Yes. Mirror the compacted connect-offsets topic itself (for example with MirrorMaker 2), or consume the latest offset records and re-produce them to the target cluster with the same keys. The offset format is Debezium-specific, and it is the connect-offsets topic that carries the connector's position — copying the downstream change event topics is not enough.
What is the difference between Debezium offsets and Kafka consumer group offsets?
Debezium offsets track position in the database's WAL or binlog. Kafka consumer group offsets track how far downstream consumers have read from Kafka topics. They are independent. A connector can be fully caught up (Debezium offset = current WAL) while downstream consumers are significantly lagged.
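The downstream side is visible with Kafka's own tooling. A quick check, assuming a consumer group named orders-sink:
kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
  --describe --group orders-sink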
How do I prevent offset loss in embedded mode?
Store the offset file on a durable volume (not ephemeral container storage). In Kubernetes, mount a PersistentVolumeClaim for /var/lib/debezium/. Consider switching to Kafka-based offset storage even in embedded deployments if durability matters.
If my connector is down for a week, what happens to the PostgreSQL replication slot?
The slot holds onto WAL segments from the moment the connector stopped. Because those segments cannot be recycled, the PostgreSQL disk will eventually fill up. Monitor slot lag with pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) against pg_replication_slots, and set a maximum slot lag policy. Many teams drop idle slots after 24-48 hours, accepting that a full resnapshot is needed on reconnect.
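PostgreSQL 13 and later can enforce such a cap directly. A sketch, assuming a 50 GB WAL budget per slot:
-- Invalidate any slot that pins more than 50 GB of WAL (PostgreSQL 13+)
ALTER SYSTEM SET max_slot_wal_keep_size = '50GB';
SELECT pg_reload_conf();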
Can I run two Debezium connectors against the same table for redundancy?
No. Each connector needs its own replication slot. Two connectors reading the same slot would conflict. Active-passive failover with a shared offset store is the supported pattern for redundancy.

