Open Table Formats: Why Open Data Formats Are Winning

Open table formats (Apache Iceberg, Delta Lake, Apache Hudi) are replacing proprietary data warehouse storage formats. They store data as files in an open format, typically Parquet, on object storage such as Amazon S3, and add a metadata layer that provides ACID transactions, schema evolution, and time travel. Because any engine can read the same files, this often costs much less than keeping the data in a proprietary warehouse.
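
To make the role of the metadata layer concrete, here is a minimal sketch in Python. It is a toy model, not the on-disk layout of Iceberg, Delta Lake, or Hudi: the `Snapshot` and `ToyTable` names, the file names, and the column names are all hypothetical. The idea it illustrates is that data files are never modified in place. Each commit publishes a new snapshot listing the files and schema that make up the table, which is what allows atomic commits, additive schema evolution, and reads of older versions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """One committed table version: a schema plus the data files it references."""
    snapshot_id: int
    schema: tuple
    data_files: tuple


class ToyTable:
    """Hypothetical stand-in for a table format's metadata layer."""

    def __init__(self, columns):
        # Snapshot 0 is the empty table with its initial schema.
        self._snapshots = [Snapshot(0, tuple(columns), ())]

    def current(self):
        return self._snapshots[-1]

    def commit(self, add_files=(), add_columns=()):
        """Publish a new snapshot; earlier snapshots and files stay untouched."""
        base = self.current()
        snap = Snapshot(
            snapshot_id=base.snapshot_id + 1,
            schema=base.schema + tuple(add_columns),        # schema evolution
            data_files=base.data_files + tuple(add_files),  # append-only data
        )
        # Readers see either the old snapshot or the new one, never a mix.
        self._snapshots.append(snap)
        return snap

    def as_of(self, snapshot_id):
        """Time travel: snapshot ids are sequential here, so they double as indexes."""
        return self._snapshots[snapshot_id]


table = ToyTable(["payment_id", "amount"])
table.commit(add_files=["part-0001.parquet"])
table.commit(add_files=["part-0002.parquet"], add_columns=["currency"])

old = table.as_of(1)    # the table as it looked after the first commit
new = table.current()   # the latest version, including the new column
```

Real formats add much more (manifests, partition statistics, conflict detection between concurrent writers), but the snapshot-per-commit idea is the common core.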

Why Open Formats Are Winning

Aspect              | Proprietary (Snowflake, Redshift) | Open (Iceberg on S3)
Storage cost        | $23+/TB/month                     | ~$23/TB/month at S3 Standard list price (lower on colder tiers)
Vendor lock-in      | High                              | None
Multi-engine        | No                                | Yes (Spark, Trino, DuckDB, Flink)
Data portability    | Difficult                         | Open format, portable
Streaming ingestion | Limited                           | Native (Flink, RisingWave)

The Convergence

Even proprietary vendors are adopting open formats:

  • Snowflake: Iceberg Tables support
  • Databricks: Delta Lake (now with Iceberg compatibility via UniForm)
  • BigQuery: BigLake with Iceberg support
  • AWS: S3 Tables with native Iceberg

RisingWave + Open Table Formats

RisingWave writes directly to Iceberg and Delta Lake, creating a streaming pipeline to open storage:

Sources → RisingWave (process with SQL) → Iceberg/Delta (open storage) → Any query engine
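
The shape of that pipeline can be sketched in plain Python. This is only an illustration of the data flow, not RisingWave's API or its sink configuration: the event fields, the `batch_size` parameter, and all function names are assumptions made for the example. A generator stands in for the source, a running aggregation stands in for the SQL processing step, and a list of committed batches stands in for the open table that several engines then read.

```python
from collections import defaultdict


def source():
    """Hypothetical payment events standing in for a message queue or CDC feed."""
    yield from [
        {"merchant": "m1", "amount": 10.0},
        {"merchant": "m2", "amount": 5.0},
        {"merchant": "m1", "amount": 2.5},
        {"merchant": "m2", "amount": 1.0},
    ]


def process(events, batch_size):
    """Sum amounts per merchant and emit one result batch every `batch_size` events."""
    totals = defaultdict(float)
    for count, event in enumerate(events, start=1):
        totals[event["merchant"]] += event["amount"]
        if count % batch_size == 0:
            yield dict(totals)
            totals.clear()
    if totals:  # flush a final partial batch
        yield dict(totals)


def sink(batches):
    """Append each batch to the shared table as one committed unit."""
    table = []
    for batch in batches:
        table.append(batch)
    return table


def engine_grand_total(table):
    """First reader: a single overall total."""
    return sum(sum(batch.values()) for batch in table)


def engine_by_merchant(table):
    """Second reader: totals per merchant, computed from the same committed data."""
    result = defaultdict(float)
    for batch in table:
        for merchant, amount in batch.items():
            result[merchant] += amount
    return dict(result)


table = sink(process(source(), batch_size=2))
grand_total = engine_grand_total(table)
per_merchant = engine_by_merchant(table)
```

The point of the two reader functions is that neither depends on the writer: once batches are committed to open storage, any engine that understands the format can query them.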

Frequently Asked Questions

Which open table format should I choose?

Choose Apache Iceberg for the broadest multi-engine support and vendor neutrality, Delta Lake if you are already on Databricks, and Apache Hudi for streaming, CDC-heavy workloads with frequent upserts.
