Open Table Formats: Why Open Data Formats Are Winning

Open table formats (Apache Iceberg, Delta Lake, Apache Hudi) are increasingly replacing proprietary data warehouse storage formats. They store data as files, typically Parquet, on object storage such as Amazon S3, and add a metadata layer that provides ACID transactions, schema evolution, and time travel. Because the data stays in an open format that many engines can read, this is often cheaper overall than keeping it in a proprietary warehouse.
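To make the role of the metadata layer concrete, here is a minimal sketch in plain Python. It is not the API of Iceberg, Delta Lake, or Hudi: `ToyTable`, `Snapshot`, and the `s3://example-bucket/...` paths are hypothetical names for illustration. The sketch assumes each commit publishes a new immutable snapshot listing the data files and schema, which is the general idea behind atomic commits, metadata-only schema changes, and time travel.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Snapshot:
    """One committed table version: a schema plus a list of immutable data files."""
    snapshot_id: int
    schema: Tuple[str, ...]
    data_files: Tuple[str, ...]


class ToyTable:
    """Hypothetical stand-in for a table format's metadata layer."""

    def __init__(self, schema: Sequence[str]) -> None:
        # Snapshot 0 is the empty table.
        self._snapshots: List[Snapshot] = [Snapshot(0, tuple(schema), ())]

    def current(self) -> Snapshot:
        return self._snapshots[-1]

    def _commit(self, base_id: int, schema: Tuple[str, ...],
                data_files: Tuple[str, ...]) -> Snapshot:
        # Optimistic concurrency: the commit only succeeds if nobody else
        # has committed since the writer read snapshot `base_id`.
        head = self.current()
        if base_id != head.snapshot_id:
            raise RuntimeError(f"commit conflict: table moved past snapshot {base_id}")
        snap = Snapshot(head.snapshot_id + 1, schema, data_files)
        self._snapshots.append(snap)  # publishing the snapshot is the atomic step
        return snap

    def append_files(self, base_id: int, new_files: Sequence[str]) -> Snapshot:
        head = self.current()
        return self._commit(base_id, head.schema, head.data_files + tuple(new_files))

    def add_column(self, base_id: int, name: str) -> Snapshot:
        # Schema evolution is a metadata-only change; existing files are untouched.
        head = self.current()
        return self._commit(base_id, head.schema + (name,), head.data_files)

    def as_of(self, snapshot_id: int) -> Snapshot:
        # Time travel: older snapshots remain readable.
        return self._snapshots[snapshot_id]


table = ToyTable(["payment_id", "amount"])
s1 = table.append_files(0, ["s3://example-bucket/data/f1.parquet"])
s2 = table.add_column(s1.snapshot_id, "currency")
s3 = table.append_files(s2.snapshot_id, ["s3://example-bucket/data/f2.parquet"])

print(table.current())   # latest version: three columns, two files
print(table.as_of(1))    # earlier version: two columns, one file
```

Real formats are more forgiving than this sketch: a writer whose base snapshot is stale can usually retry, and non-conflicting commits can often be reconciled instead of rejected.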

Why Open Formats Are Winning

Aspect	Proprietary (Snowflake, Redshift)	Open (Iceberg on S3)
Storage cost	$23+/TB/month	~$23/TB/month (S3 Standard list price); lower on colder tiers
Vendor lock-in	High	None
Multi-engine	No	Yes (Spark, Trino, DuckDB, Flink)
Data portability	Difficult	Open format, portable
Streaming ingestion	Limited	Native (Flink, RisingWave)

The Convergence

Even proprietary vendors are adopting open formats:

Snowflake: Iceberg Tables support
Databricks: Delta Lake (now with Iceberg compatibility via UniForm)
BigQuery: BigLake with Iceberg support
AWS: S3 Tables with native Iceberg

RisingWave + Open Table Formats

RisingWave can sink query results directly to Iceberg and Delta Lake tables, creating a streaming pipeline into open storage:

Sources → RisingWave (process with SQL) → Iceberg/Delta (open storage) → Any query engine
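The shape of that pipeline can be sketched in plain Python. This is not RisingWave's interface (RisingWave pipelines are defined in SQL), and `run_pipeline`, `checkpoint_every`, and the sample events are hypothetical. The sketch assumes the stream processor keeps running aggregates and, at each checkpoint, commits only the rows that changed as one batch to the open table.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Event = Tuple[str, float]  # (key, amount): hypothetical input events


def run_pipeline(events: Iterable[Event], checkpoint_every: int) -> List[Dict[str, float]]:
    """Aggregate events incrementally and emit one upsert batch per checkpoint."""
    totals: Dict[str, float] = defaultdict(float)  # running state (sum per key)
    dirty = set()                                  # keys changed since the last commit
    commits: List[Dict[str, float]] = []           # stands in for table commits

    for i, (key, amount) in enumerate(events, start=1):
        totals[key] += amount
        dirty.add(key)
        if i % checkpoint_every == 0:
            # Checkpoint: publish the changed rows as one batch.
            commits.append({k: totals[k] for k in sorted(dirty)})
            dirty.clear()

    if dirty:
        # Flush whatever changed after the last full checkpoint.
        commits.append({k: totals[k] for k in sorted(dirty)})
    return commits


events = [("a", 10.0), ("b", 5.0), ("a", 2.5), ("c", 1.0), ("b", 4.0)]
commits = run_pipeline(events, checkpoint_every=2)
for batch in commits:
    print(batch)
```

Each batch here corresponds to one commit on the table, so downstream engines see results advance in consistent steps and never observe a half-written batch.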

Frequently Asked Questions

Which open table format should I choose?

Choose Apache Iceberg for the broadest multi-engine support and vendor neutrality, Delta Lake if you are already on Databricks, and Apache Hudi for streaming, CDC-heavy workloads with frequent upserts.

Open Table Formats: Why Open Data Formats Are Winning