Streaming Data in FinTech: From Payments to Risk Management
Open table formats (Apache Iceberg, Delta Lake, Apache Hudi) are increasingly replacing proprietary data warehouse storage formats. They store data as files (typically Parquet) on object storage such as S3, with a metadata layer that provides ACID transactions, schema evolution, and time travel, usually at a lower total cost than a proprietary warehouse.
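To make the "metadata layer over immutable files" idea concrete, here is a toy sketch in plain Python. It is not how Iceberg, Delta Lake, or Hudi are implemented; `ToyTable`, `Snapshot`, and the file paths are hypothetical names chosen for illustration. It only shows why an append-only log of snapshots gives you atomic commits, schema evolution, and time travel without rewriting data files.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Snapshot:
    """One committed table version: the data files and schema visible to readers."""
    snapshot_id: int
    data_files: Tuple[str, ...]
    schema: Tuple[str, ...]


@dataclass
class ToyTable:
    """Toy metadata layer: an append-only list of snapshots over immutable files."""
    snapshots: List[Snapshot] = field(default_factory=list)

    def commit(self, added_files: Sequence[str],
               schema: Optional[Sequence[str]] = None) -> Snapshot:
        """Publish a new snapshot in one step; earlier snapshots are never modified."""
        prev = self.snapshots[-1] if self.snapshots else None
        files = (prev.data_files if prev else ()) + tuple(added_files)
        if schema is not None:
            new_schema = tuple(schema)          # schema evolution: metadata-only change
        else:
            new_schema = prev.schema if prev else ()
        snap = Snapshot(len(self.snapshots) + 1, files, new_schema)
        self.snapshots.append(snap)             # the single step that makes the commit visible
        return snap

    def as_of(self, snapshot_id: Optional[int] = None) -> Snapshot:
        """Time travel: return the table state at a snapshot (default: latest)."""
        if not self.snapshots:
            raise LookupError("table has no snapshots yet")
        if snapshot_id is None:
            return self.snapshots[-1]
        if not 1 <= snapshot_id <= len(self.snapshots):
            raise LookupError(f"unknown snapshot {snapshot_id}")
        return self.snapshots[snapshot_id - 1]


# Hypothetical usage: two commits, the second one adds a column.
payments = ToyTable()
payments.commit(["s3://example-bucket/payments/part-0.parquet"], schema=["id", "amount"])
payments.commit(["s3://example-bucket/payments/part-1.parquet"],
                schema=["id", "amount", "currency"])

print(payments.as_of(1).data_files)   # only the first file
print(payments.as_of().schema)        # latest schema, including "currency"
```

Real formats add much more (manifest files, partition statistics, optimistic concurrency on the commit pointer), but the reader-facing contract is the same: a query always sees exactly one snapshot.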
Why Open Formats Are Winning
| Aspect | Proprietary (Snowflake, Redshift) | Open (Iceberg on S3) |
| --- | --- | --- |
| Storage cost | $23+/TB/month | ~$23/TB/month (S3 Standard list price; lower on infrequent-access tiers) |
| Vendor lock-in | High | None |
| Multi-engine | No | Yes (Spark, Trino, DuckDB, Flink) |
| Data portability | Difficult | Open format, portable |
| Streaming ingestion | Limited | Native (Flink, RisingWave) |
The Convergence
Even proprietary vendors are adopting open formats:
- Snowflake: Iceberg Tables support
- Databricks: Delta Lake (now with Iceberg compatibility via UniForm)
- BigQuery: BigLake with Iceberg support
- AWS: S3 Tables with native Iceberg
RisingWave + Open Table Formats
RisingWave writes directly to Iceberg and Delta Lake, creating a streaming pipeline to open storage:
Sources → RisingWave (process with SQL) → Iceberg/Delta (open storage) → Any query engine
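As a rough sketch of the "RisingWave → Iceberg" hop, the Python helper below renders a `CREATE SINK` statement that you could run through any PostgreSQL-compatible client connected to RisingWave. The helper name `iceberg_sink_sql`, the view name, and the bucket path are hypothetical. The option names (`connector`, `type`, `catalog.type`, `warehouse.path`, `database.name`, `table.name`) follow RisingWave's Iceberg sink parameters as understood here, so treat them as assumptions and check them against the documentation for your version. S3 credential, endpoint, and region options are deliberately omitted.

```python
def iceberg_sink_sql(sink_name: str, source_view: str, warehouse_path: str,
                     database: str, table: str,
                     sink_type: str = "append-only") -> str:
    """Render a RisingWave CREATE SINK statement targeting an Iceberg table.

    Option names are assumptions based on RisingWave's Iceberg sink docs;
    verify them before use. Inputs are interpolated as-is, so pass only
    trusted identifiers and paths.
    """
    options = {
        "connector": "iceberg",
        "type": sink_type,                 # 'append-only' or 'upsert'
        "catalog.type": "storage",         # assumed: file-system style catalog
        "warehouse.path": warehouse_path,
        "database.name": database,
        "table.name": table,
    }
    body = ",\n".join(f"    {key} = '{value}'" for key, value in options.items())
    return f"CREATE SINK {sink_name} FROM {source_view}\nWITH (\n{body}\n);"


# Hypothetical names: a materialized view of processed payments written to Iceberg.
print(iceberg_sink_sql(
    sink_name="payments_sink",
    source_view="payments_mv",
    warehouse_path="s3://example-bucket/warehouse",
    database="fintech",
    table="payments",
))
```

Once the sink is running, the resulting Iceberg table can be read by whichever engine suits the query, which is the point of the last arrow in the pipeline above.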
Frequently Asked Questions
Which open table format should I choose?
Choose Apache Iceberg for the broadest multi-engine support and vendor neutrality, Delta Lake if you are already on Databricks, and Apache Hudi for CDC-heavy streaming workloads with frequent upserts.

