Postgres CDC to Iceberg: Lessons from Real-World Data Pipelines


Over the past year, Postgres and Apache Iceberg have been at the center of attention in the data infrastructure space. The acquisitions of Neon by Databricks and CrunchyData by Snowflake have only reinforced this trend. Postgres has long established itself as the de facto standard for OLTP, while Iceberg, with its open table format and rapidly growing ecosystem, is becoming the default foundation for modern data lakehouses, replacing parts of traditional cloud data warehouses.

However, connecting these two systems in real-time is far from trivial. To continuously replicate Postgres changes into Iceberg, you need a reliable CDC (Change Data Capture) pipeline. Tools like Debezium and Kafka Connect have been around for years, and at first glance, they seem to have solved this problem. I used to believe the same.

But after diving into real-world production cases and discussing the technical details with dozens of engineering teams, I realized that streaming data from Postgres CDC to Iceberg is far from a “solved problem.” Both upstream capture and downstream writes are full of subtle but critical challenges. These challenges span end-to-end exactly-once guarantees, fault recovery, snapshot performance, schema evolution, transactional consistency in Iceberg, and the operational complexity of multi-component architectures.

In this post, I’ll break down these challenges one by one and share lessons learned from real-world scenarios.

1. Exactly-Once Guarantees and Fault Recovery

The biggest challenge in building a reliable CDC pipeline is achieving true end-to-end exactly-once semantics. Most tools only provide at-least-once delivery, which sounds acceptable on paper but quickly falls apart in real-world production incidents. Duplicate events, data loss, and manual replays often surface as painful issues.

Upstream Fault Recovery

While Postgres replication slots are designed to be robust, they’re not bulletproof. If Postgres crashes, the network glitches, or a replication slot is unexpectedly cleaned up due to WAL retention policies, most CDC tools can’t seamlessly recover. A common issue is LSN (Log Sequence Number) misalignment between the CDC reader (e.g., Debezium) and downstream systems. When this happens, the pipeline may either miss events or reprocess them, leading to data inconsistencies.
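
A cheap way to catch this class of failure early is to watch replication slot lag directly in Postgres. The sketch below is a minimal example using psycopg2; the connection string and alert threshold are placeholders, and you would normally feed this into whatever monitoring stack you already run:

```python
# Minimal slot-lag check; connection string and threshold are placeholders.
import psycopg2

LAG_ALERT_BYTES = 5 * 1024**3  # alert if a slot trails by more than ~5 GB (arbitrary)

conn = psycopg2.connect("dbname=mydb user=replicator host=localhost")
with conn, conn.cursor() as cur:
    cur.execute("""
        SELECT slot_name,
               active,
               pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn) AS lag_bytes
        FROM pg_replication_slots
        WHERE slot_type = 'logical'
    """)
    for slot_name, active, lag_bytes in cur.fetchall():
        status = "OK" if (active and (lag_bytes or 0) < LAG_ALERT_BYTES) else "ATTENTION"
        print(f"{status}: slot={slot_name} active={active} lag_bytes={lag_bytes}")
conn.close()
```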

I’ve seen multiple teams forced to re-snapshot the entire database after replication slot failures. For datasets in the hundreds of gigabytes or even terabytes, this is slow, costly, and disruptive—especially if table locks are involved.

One e-commerce team I worked with had Debezium connected to Postgres. A brief database outage wiped out the replication slot. They ended up re-running snapshots on all tables, which took almost 10 hours, consumed massive I/O, and left the analytics team waiting while business operations stalled.

Downstream Isolation

Downstream systems like Iceberg or Snowflake can fail just as easily—write failures or temporary outages are not uncommon. Without built-in buffering or backpressure control, many CDC pipelines grind to a halt. Once Kafka topics start piling up and exceed retention limits, you’re faced with a nasty “data loss vs. duplicate replay” dilemma, usually followed by manual backfills.

We encountered a case where an AWS S3 outage caused an Iceberg sink to fail continuously. Debezium and Kafka messages started to back up, and the backpressure propagated all the way upstream to Postgres. Because Debezium wasn’t advancing the replication slot’s confirmed LSN in this state, WAL segments kept piling up. By the time the issue was detected, the WAL had nearly filled up the disk, almost taking down the primary database. This kind of “chain reaction” is a nightmare scenario—and it’s a hidden risk many teams underestimate when relying on Debezium.
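
If you are on Postgres 13 or later, one way to bound the blast radius of this chain reaction is the max_slot_wal_keep_size setting, which caps how much WAL a lagging slot can pin (at the cost of invalidating the slot if it falls too far behind). A hedged sketch, assuming superuser access and psycopg2, with an illustrative 50GB cap:

```python
# Sketch: cap WAL retained for slots and check which slots are at risk.
# Assumes Postgres 13+; the 50GB cap is an illustrative value, not a recommendation.
import psycopg2

conn = psycopg2.connect("dbname=mydb user=postgres host=localhost")
conn.autocommit = True  # ALTER SYSTEM cannot run inside a transaction block
with conn.cursor() as cur:
    # Bound how much WAL a lagging slot may pin; a slot that falls further behind
    # is invalidated instead of filling the disk (trade-off: it must be recreated
    # and the downstream re-synced).
    cur.execute("ALTER SYSTEM SET max_slot_wal_keep_size = '50GB'")
    cur.execute("SELECT pg_reload_conf()")

    # wal_status reports 'reserved', 'extended', 'unreserved', or 'lost' per slot.
    cur.execute("SELECT slot_name, wal_status, safe_wal_size FROM pg_replication_slots")
    for row in cur.fetchall():
        print(row)
conn.close()
```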

The End-to-End Exactly-Once Problem

Even though tools like Debezium, Kafka, and Flink all implement checkpoints and retries, stitching them together into a true end-to-end exactly-once pipeline is notoriously difficult. The handoffs between components often create windows for duplicates or dropped data, particularly during backfill operations.

One key challenge is aligning snapshots (initial full data dumps) with WAL (Write-Ahead Log) streams.

The simplest approach is to lock the upstream tables during the snapshot to block new writes, and only switch to WAL consumption once the snapshot completes. This ensures consistency, but it blocks live writes (and even DDL such as schema changes) for the duration of the snapshot, which is unacceptable for most production workloads.

If you don’t lock tables, you face the tricky problem of ensuring that data written during the snapshot window isn’t missed or double-processed.

Systems like RisingWave and Flink CDC handle this by consuming WAL in parallel with the snapshot and using downstream state tables to align the snapshot with WAL events. This avoids both data loss and upstream blocking.
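
Conceptually, the alignment works on low/high watermarks: record the WAL position before the snapshot, read the snapshot without locks, then replay the WAL captured during that window on top of the snapshot, keyed by primary key. The sketch below is a simplified illustration of that general technique, not RisingWave's or Flink CDC's actual code; current_wal_lsn, read_snapshot_rows, and read_wal_events are hypothetical stand-ins:

```python
# Simplified sketch of lock-free snapshot/WAL alignment. The helpers
# (current_wal_lsn, read_snapshot_rows, read_wal_events) are hypothetical stand-ins.

def backfill_with_alignment(table, state):
    low_watermark = current_wal_lsn()        # mark WAL position before the snapshot
    snapshot = {row.primary_key: row for row in read_snapshot_rows(table)}  # no table lock
    high_watermark = current_wal_lsn()       # mark WAL position after the snapshot

    state.update(snapshot)
    # Changes that happened while the snapshot was running are replayed on top;
    # keyed upserts make the replay idempotent, so nothing is lost or double-counted.
    for event in read_wal_events(start=low_watermark, end=high_watermark):
        if event.op == "delete":
            state.pop(event.primary_key, None)
        else:
            state[event.primary_key] = event.new_row

    return high_watermark                    # continuous WAL streaming resumes from here
```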

Furthermore, in recovery or restart scenarios, if checkpoints and Kafka offsets are not perfectly aligned, you can easily end up with over-reads or under-reads. Precise WAL offset management is critical for maintaining exactly-once semantics, but this responsibility often falls to the CDC engine itself.

Iceberg introduces another set of challenges: it does not natively support primary-key-level idempotent updates. If you simply append or batch-write data without carefully using equality deletes or merge-on-read modes, duplicate records will creep in. Once written, these duplicates are expensive to clean up.
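
To make the difference concrete, here is a hedged sketch with pyiceberg, assuming a release recent enough to expose Table.upsert (roughly 0.9+; exact signatures may vary). Catalog, namespace, and column names are placeholders. A blind append would re-insert every replayed change event, while collapsing the batch per key and upserting keeps the table idempotent:

```python
# Sketch: making Iceberg writes idempotent by key with pyiceberg.
# Assumes pyiceberg >= 0.9 (which adds Table.upsert); catalog, namespace, and
# column names are placeholders, and exact signatures may vary across versions.
import pyarrow as pa
from pyiceberg.catalog import load_catalog

catalog = load_catalog("my_catalog")
table = catalog.load_table("analytics.orders")

# A CDC micro-batch; order_id 101 appears twice because an event was replayed.
events = pa.table({
    "order_id": [101, 102, 101],
    "status":   ["paid", "shipped", "paid"],
    "lsn":      [10, 11, 10],
})

# Step 1: collapse the batch to one row per key (keep the latest change by LSN).
latest = {}
for row in events.to_pylist():
    prev = latest.get(row["order_id"])
    if prev is None or row["lsn"] >= prev["lsn"]:
        latest[row["order_id"]] = row
deduped = pa.Table.from_pylist(list(latest.values()), schema=events.schema)

# Step 2: upsert keyed on the primary key instead of a blind append.
# table.append(deduped) would duplicate rows that already exist in the table;
# an upsert rewrites matching keys and inserts the rest.
table.upsert(deduped.select(["order_id", "status"]), join_cols=["order_id"])
```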

For industries like finance or compliance, this is unacceptable. I once helped a trading company debug their CDC pipeline. Misalignment between the snapshot and WAL stream, combined with non-idempotent updates to Iceberg, led to incorrect calculations for their positions. They ended up implementing a complicated deduplication process downstream, adding both latency and operational complexity.

2. Performance and Resource Overhead

CDC pipelines that look fine in PoC environments often fall apart at production scale. These pipelines must handle both initial full snapshots and continuous incremental changes, each with its own set of problems.

Replication Slot Resource Pressure

Replication slots in Postgres are a limited resource. In most production setups, one database maps to one replication slot, not one per table.

In RisingWave’s design, for example, a single shared source maps to a Postgres instance and uses one slot. While Postgres can create multiple slots, each active slot corresponds to a downstream subscriber, so the slot count grows with the number of consumers.

However, when tools like Debezium are deployed in multi-tenant or multi-connector scenarios, misconfigurations can lead to an excessive number of replication slots. This can slow down WAL cleanup, increase vacuum pressure, and impact database I/O performance.

I worked with a customer who only had 2 slots configured, yet Debezium’s slow consumption caused WAL logs to pile up by hundreds of gigabytes. Archiving couldn’t keep up, and the database started to struggle. They had to manually tweak Debezium’s throughput and purge WAL logs just to stabilize the system.

Snapshot and Backfill Bottlenecks

Initial snapshots are notoriously painful. Traditional CDC tools often use single-threaded sequential scans. On tables with hundreds of millions of rows, this process can take hours or even days. If something fails mid-way, you start all over again—wasting time and compute.

An even bigger problem is aligning snapshots with incremental WAL data. Some implementations simply lock the upstream tables during snapshotting to prevent new writes, switching to WAL consumption only after the snapshot completes. This is simple but blocks production writes.

RisingWave (and Flink CDC) take a better approach: consuming WAL events in parallel while snapshotting, aligning snapshot and WAL data via downstream state tables to guarantee data completeness without blocking upstream workloads.
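
As a rough illustration of the parallel idea (a simplified sketch, not any specific tool's implementation), the table is split into primary-key ranges that are scanned concurrently, so a mid-way failure only restarts the affected chunk. The table name, connection string, and the emit_downstream call are placeholders:

```python
# Sketch: chunked, parallel snapshot by primary-key range.
# Table name, connection string, and emit_downstream are placeholders.
from concurrent.futures import ThreadPoolExecutor
import psycopg2

DSN = "dbname=mydb user=replicator host=localhost"
CHUNK_SIZE = 100_000

def scan_chunk(lower, upper):
    """Read one key range; if it fails, only this chunk is retried, not the whole table."""
    conn = psycopg2.connect(DSN)
    with conn, conn.cursor(name=f"snap_{lower}") as cur:   # server-side cursor streams rows
        cur.execute(
            "SELECT * FROM orders WHERE id >= %s AND id < %s ORDER BY id",
            (lower, upper),
        )
        for row in cur:
            emit_downstream(row)            # hypothetical sink call
    conn.close()

def chunk_boundaries(max_id):
    return [(lo, lo + CHUNK_SIZE) for lo in range(0, max_id + 1, CHUNK_SIZE)]

# max_id would be fetched once, e.g. via SELECT max(id) FROM orders
with ThreadPoolExecutor(max_workers=4) as pool:
    list(pool.map(lambda bounds: scan_chunk(*bounds), chunk_boundaries(max_id=1_000_000)))
```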

I’ve seen a SaaS company using Debezium run into this exact problem. Snapshotting a large table crushed their OLTP performance. They ended up running snapshots only at night, pausing CDC during the day. What should have been a 2-day migration stretched into a week.

Large Fields and Memory Issues

Postgres’s TOAST fields (e.g., large JSON objects) are another challenge. Debezium often loads these large fields into memory during processing, and when multiple large transactions occur, this can lead to OOM errors. Teams often resort to manual workarounds like field splitting or compression, but these are band-aid solutions.
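
It helps to know up front which tables carry heavy TOAST data before turning on CDC. A small diagnostic sketch (connection details are placeholders):

```python
# Sketch: find the tables with the largest TOAST storage (heavily TOASTed columns
# are the ones most likely to blow up CDC memory usage). Connection is a placeholder.
import psycopg2

conn = psycopg2.connect("dbname=mydb user=postgres host=localhost")
with conn, conn.cursor() as cur:
    cur.execute("""
        SELECT c.relname AS table_name,
               pg_size_pretty(pg_relation_size(c.reltoastrelid)) AS toast_size
        FROM pg_class c
        WHERE c.reltoastrelid <> 0
          AND c.relkind = 'r'
        ORDER BY pg_relation_size(c.reltoastrelid) DESC
        LIMIT 10
    """)
    for table_name, toast_size in cur.fetchall():
        print(table_name, toast_size)
conn.close()
```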


3. Data Types and Schema Evolution

Schema evolution is one of the most underestimated challenges in CDC pipelines. Modern Postgres schemas are dynamic—adding columns, changing types, and rebuilding indexes are all common. These changes can easily break CDC pipelines or create downstream inconsistencies if not properly handled.

Partitioned Tables and TOAST

Debezium’s support for partitioned tables and TOAST is limited. TOAST fields require enabling full replica identity, which significantly increases WAL size. New partitions in partitioned tables often require manual connector restarts, introducing risk and downtime.
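
For reference, enabling full replica identity is a single DDL statement per table, but it makes Postgres log the complete old row on every UPDATE and DELETE, so apply it selectively. A minimal sketch, with a placeholder table name:

```python
# Sketch: enable full replica identity so change events carry complete row images,
# including TOASTed columns. Trade-off: every UPDATE/DELETE now logs the full old row,
# inflating WAL volume. Table name and connection are placeholders.
import psycopg2

conn = psycopg2.connect("dbname=mydb user=postgres host=localhost")
with conn, conn.cursor() as cur:
    cur.execute("ALTER TABLE public.documents REPLICA IDENTITY FULL")
    # Verify: 'f' means FULL, 'd' means default (primary key only).
    cur.execute("SELECT relreplident FROM pg_class WHERE oid = 'public.documents'::regclass")
    print(cur.fetchone())
conn.close()
```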

Schema Evolution in Multi-Sink Environments

Things get even messier when syncing to multiple downstream systems (e.g., Iceberg, Snowflake, Redshift). A single column addition can trigger a cascade of updates:

  • Adjusting Debezium configs,

  • Updating Kafka Schema Registry,

  • Rewriting Flink jobs,

  • Altering Iceberg table structures.

Many teams spend considerable effort building automation for schema propagation, but type mismatches and missing defaults frequently cause breakages.
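
As one small piece of that automation, the Iceberg side of a column addition can be scripted. A hedged sketch with pyiceberg; the catalog, table, and column names are placeholders and API details may differ slightly across versions:

```python
# Sketch: propagate an upstream "ADD COLUMN" to an Iceberg table with pyiceberg.
# Catalog/table/column names are placeholders; assumes a recent pyiceberg release.
from pyiceberg.catalog import load_catalog
from pyiceberg.types import StringType

catalog = load_catalog("my_catalog")
table = catalog.load_table("analytics.orders")

existing = {field.name for field in table.schema().fields}
if "coupon_code" not in existing:
    # Iceberg schema evolution is metadata-only: no data files are rewritten.
    with table.update_schema() as update:
        update.add_column("coupon_code", StringType())
```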

4. Iceberg Writes and Consistency

Iceberg is batch-oriented, while CDC produces continuous row-level updates. Bridging this gap requires careful handling of transaction boundaries and file layout.

Transaction Boundaries and Exactly-Once

CDC events must be batched before being committed to Iceberg. If a commit fails mid-way, Iceberg lacks automatic rollback, which can leave the table in an inconsistent state. Replaying failed batches without idempotency handling easily causes duplicates, since Iceberg doesn’t natively support primary-key deduplication. RisingWave addresses this with transactional batch commits and exactly-once semantics, ensuring each batch is either fully committed or skipped, even under failure or restarts.
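
A common way to get this commit-or-skip behavior, outside of RisingWave's internal mechanism, is to stamp each batch with a monotonically increasing epoch recorded in the Iceberg snapshot metadata and to skip any epoch that has already been committed. The sketch below uses pyiceberg snapshot properties; the property name cdc.epoch and the helper shapes are illustrative, and the summary accessor may differ by version:

```python
# Sketch: idempotent "commit or skip" batches using a snapshot property as a watermark.
# Assumes pyiceberg; the property name "cdc.epoch" and helper shapes are illustrative.
import pyarrow as pa
from pyiceberg.catalog import load_catalog

catalog = load_catalog("my_catalog")
table = catalog.load_table("analytics.orders")

def last_committed_epoch(tbl) -> int:
    snap = tbl.current_snapshot()
    if snap is None:
        return -1
    # Snapshot summaries carry string properties written at commit time;
    # the exact accessor may differ slightly across pyiceberg versions.
    return int(dict(snap.summary or {}).get("cdc.epoch", -1))

def commit_batch(tbl, epoch: int, batch: pa.Table) -> None:
    if epoch <= last_committed_epoch(tbl):
        return  # batch already landed before a crash/restart: skip, don't duplicate
    tbl.append(batch, snapshot_properties={"cdc.epoch": str(epoch)})
```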

Small File Problem and Compaction

One of the biggest challenges when streaming CDC data into Iceberg is the small files problem. Because CDC events are row-level and continuous, naïvely writing each batch creates hundreds or thousands of small Parquet files. These small files explode the size of Iceberg’s metadata (manifest files) and slow down query planning since the query engine needs to scan and filter an enormous file list. Traditionally, teams schedule manual compaction jobs—for example, using Spark or Flink—merging small files into larger ones at regular intervals. But this introduces additional latency, cost, and operational overhead.
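
For teams on that traditional path, the usual tool is Iceberg's rewrite_data_files maintenance procedure scheduled from Spark. A minimal PySpark sketch, assuming a SparkSession already configured with an Iceberg catalog (here named lake); table names and the target file size are placeholders:

```python
# Sketch: scheduled compaction of small files with Iceberg's Spark procedure.
# Assumes a SparkSession already configured with an Iceberg catalog named "lake";
# catalog/table names and the target file size are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("iceberg-compaction").getOrCreate()

spark.sql("""
    CALL lake.system.rewrite_data_files(
        table => 'analytics.orders',
        options => map('target-file-size-bytes', '536870912')  -- ~512 MB target files
    )
""")

# Optionally also prune old snapshots/metadata that compaction leaves behind:
spark.sql("CALL lake.system.expire_snapshots(table => 'analytics.orders')")
```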

RisingWave automatically performs file compaction during ingestion. It intelligently groups CDC events into optimal file sizes and periodically merges small files in-place without requiring external jobs. This reduces the query latency on downstream engines (like Trino, Flink, or Spark) and keeps Iceberg’s metadata lean.

5. Operational Complexity

A typical Postgres-to-Iceberg pipeline involves Debezium for capture, Kafka for buffering, Flink for transformations, and a custom Iceberg sink. Every component needs separate monitoring, tuning, and recovery procedures.

I’ve seen teams that essentially dedicate an SRE squad to babysit this setup. Debugging requires combing through multiple logs and offset checkpoints. Even a simple connector restart can trigger data replays or gaps if offsets aren’t managed correctly.

RisingWave’s Approach

RisingWave was built to simplify this mess. It unifies CDC capture, stream processing, and Iceberg writes into a single system, eliminating multi-component overhead and reducing failure points.

For CDC ingestion, RisingWave embeds Debezium’s engine under the hood but has made extensive optimizations:

  • Reworked WAL parsing and memory management for large transactions and high concurrency.

  • Improved offset and LSN persistence, ensuring precise recovery without full re-snapshots.

  • Enhanced schema parsing to prevent slowdowns during complex schema changes.

  • Integrated Debezium tightly with RisingWave’s log store and state tables, enabling precise alignment of snapshots and WAL streams.

Key design principles include:

  • Robust fault recovery: LSN tracking ensures smooth recovery without full replays.

  • Downstream isolation: A persistent log store decouples sinks from sources, preventing downstream failures from blocking upstream CDC.

  • Efficient snapshots: Parallel, lock-free snapshots significantly shorten initial load times without impacting OLTP workloads.

  • Native Iceberg support: Transactional batch writes, equality deletes, and built-in auto compaction that continuously merges small files during ingestion—removing the need for external Spark or Flink compaction jobs and keeping Iceberg metadata lean.

  • Schema auto-evolution: Automatic detection and propagation of upstream schema changes, minimizing manual adjustments.

  • Ease of use and observability: A single integrated system replaces the Debezium-Kafka-Flink chain with built-in monitoring and simplified failure recovery.
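
To give a sense of what this looks like in practice, here is a rough outline of wiring a Postgres CDC source into an Iceberg sink in RisingWave, driven over its Postgres-compatible wire protocol with psycopg2. Hosts, credentials, and bucket paths are placeholders, and connector parameter names vary across RisingWave versions, so treat this as a sketch rather than copy-paste configuration:

```python
# Rough outline: Postgres CDC source -> table -> Iceberg sink in RisingWave.
# All hosts, credentials, and bucket paths are placeholders; exact connector
# options vary by RisingWave version, so check the docs for your release.
import psycopg2

conn = psycopg2.connect(host="risingwave-host", port=4566, user="root", dbname="dev")
conn.autocommit = True
cur = conn.cursor()

cur.execute("""
    CREATE SOURCE IF NOT EXISTS pg_source WITH (
        connector = 'postgres-cdc',
        hostname = 'postgres-host',
        port = '5432',
        username = 'replicator',
        password = '...',
        database.name = 'shop'
    )
""")

cur.execute("""
    CREATE TABLE IF NOT EXISTS orders (
        order_id INT PRIMARY KEY,
        status VARCHAR
    ) FROM pg_source TABLE 'public.orders'
""")

cur.execute("""
    CREATE SINK IF NOT EXISTS orders_iceberg FROM orders WITH (
        connector = 'iceberg',
        type = 'upsert',
        primary_key = 'order_id',
        warehouse.path = 's3://my-bucket/warehouse',
        database.name = 'analytics',
        table.name = 'orders'
        -- plus catalog and S3 credential options for your setup
    )
""")
```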

Success Story: Siemens

A real-world example of this simplified architecture is Siemens. Before adopting RisingWave, their CDC pipeline looked like: SQL Server → Debezium → Kafka → RisingWave. This multi-component setup required maintaining numerous Debezium parameters and carefully managing Kafka offsets, making the pipeline fragile and difficult to scale.

With RisingWave’s commercial edition, Siemens can now connect directly to SQL Server—no Debezium or Kafka configuration is needed. For Siemens’ engineers, Debezium and Kafka have effectively disappeared from their workflow. CDC has turned into a plug-and-play black box, reducing operational overhead and configuration complexity dramatically.

Conclusion

On the surface, Postgres CDC to Iceberg may look like a solved problem. In reality, the complexity of upstream replication, snapshot and backfill performance, schema evolution, Iceberg’s transactional semantics, and the operational burden of multi-component pipelines all prove otherwise.

RisingWave’s mission is to make this simple.

By unifying CDC capture, real-time processing, and Iceberg ingestion in a single system, RisingWave provides high reliability, low latency, end-to-end exactly-once guarantees, automatic schema evolution, and built-in auto compaction, effectively eliminating the manual overhead that plagues traditional pipelines.
