Apache Iceberg is an open table format for large analytic datasets. It brings ACID transactions, schema evolution, time travel, and partition evolution to data lakes stored on object storage like S3, GCS, or ADLS — enabling reliable, high-performance analytics without vendor lock-in.
What Is Apache Iceberg?
Apache Iceberg was originally developed at Netflix to solve reliability and performance problems at petabyte scale. Donated to the Apache Software Foundation in 2018, it has since become one of the most widely adopted open table formats, used by companies such as Apple, LinkedIn, Adobe, and many others.
Unlike traditional Hive-style partitioning that relies on directory structures, Iceberg tracks data at the file level through a tree of metadata. This design solves long-standing problems in data lake architectures: unsafe concurrent writes, slow partition discovery, and the inability to change table schemas without full rewrites.
Core Architecture: How Iceberg Stores Data
Understanding Iceberg requires understanding its three-layer metadata architecture:
Catalog Layer — The entry point. A catalog (REST, Hive Metastore, AWS Glue, Nessie) stores pointers to the current metadata file for each table.
Metadata Layer — A chain of JSON metadata files that track the current schema, partition spec, sort order, and a list of snapshots.
Data Layer — The actual Parquet, ORC, or Avro data files. Each snapshot points to a manifest list, which references manifest files, which in turn reference the data files.
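This pointer chain, from catalog to snapshot to manifests to data files, can be sketched with a few toy Python structures (a simplified model for illustration, not the real Iceberg library's classes):

```python
from dataclasses import dataclass

# Hypothetical, simplified model of Iceberg's metadata tree. The names and
# shapes are illustrative only -- the real format stores these as Avro/JSON.

@dataclass
class ManifestFile:
    data_files: list        # paths to Parquet/ORC/Avro files

@dataclass
class Snapshot:
    snapshot_id: int
    manifest_list: list     # the "manifest list" level: a list of manifests

@dataclass
class TableMetadata:
    current_snapshot_id: int
    snapshots: dict         # snapshot_id -> Snapshot

def live_data_files(metadata: TableMetadata) -> list:
    """Walk metadata -> current snapshot -> manifests -> data files."""
    snap = metadata.snapshots[metadata.current_snapshot_id]
    return [f for manifest in snap.manifest_list for f in manifest.data_files]

meta = TableMetadata(
    current_snapshot_id=2,
    snapshots={
        1: Snapshot(1, [ManifestFile(["s3://lake/a.parquet"])]),
        2: Snapshot(2, [ManifestFile(["s3://lake/a.parquet",
                                      "s3://lake/b.parquet"])]),
    },
)
print(live_data_files(meta))  # files visible in the current snapshot
```

Because a reader resolves everything from one snapshot pointer, it never sees a half-committed mix of old and new files.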
This design enables several powerful capabilities:
- Snapshot isolation — Each write creates a new snapshot. Readers always see a consistent view.
- Incremental reads — Consumers can read only files added since a known snapshot.
- Hidden partitioning — Partition transforms (bucket, truncate, year, month, day, hour) are tracked in metadata, not in the file path.
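The partition transforms behind hidden partitioning can be illustrated with simplified Python stand-ins. Note the assumptions: the real Iceberg spec pins `bucket` to a 32-bit Murmur3 hash, whereas Python's built-in `hash()` is used below purely for illustration.

```python
from datetime import datetime

# Toy stand-ins for Iceberg partition transforms -- for illustration only.

def bucket(n: int, value) -> int:
    # Stand-in: the spec mandates a 32-bit Murmur3 hash, not Python's hash().
    return hash(value) % n

def truncate(width: int, value: str) -> str:
    # Truncate a string to a fixed prefix width.
    return value[:width]

def day(ts: datetime) -> str:
    # Derive a day partition value from an event timestamp.
    return ts.strftime("%Y-%m-%d")

ts = datetime(2024, 5, 17, 13, 45)
print(day(ts))                 # '2024-05-17'
print(truncate(3, "iceberg"))  # 'ice'
```

Because the table metadata records "partition by `day(event_time)`" rather than a directory layout, queries filter on `event_time` directly and Iceberg prunes partitions automatically.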
Key Features Every Data Engineer Should Know
ACID Transactions
Iceberg provides serializable isolation for writes and snapshot isolation for reads. Multiple writers can operate on the same table concurrently without corrupting data. Optimistic concurrency control detects conflicts and retries as needed.
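The commit protocol can be sketched as a compare-and-swap on the catalog's metadata pointer. This is a minimal toy model, not a real catalog API: each writer prepares its changes against a base version and only wins the commit if no one else committed first.

```python
import threading

class CatalogPointer:
    """Toy catalog: an atomic pointer to the current metadata version."""
    def __init__(self):
        self._lock = threading.Lock()
        self.version = 0

    def compare_and_swap(self, expected: int, new: int) -> bool:
        with self._lock:
            if self.version != expected:
                return False        # another writer committed first
            self.version = new
            return True

def commit_with_retry(catalog: CatalogPointer, max_retries: int = 5) -> int:
    for _ in range(max_retries):
        base = catalog.version      # read the current table state
        new = base + 1              # writing the new metadata file is elided
        if catalog.compare_and_swap(base, new):
            return new              # commit succeeded
        # conflict detected: re-read the table state and retry
    raise RuntimeError("too many concurrent commits")
```

In the real protocol, the retry step also re-validates that the conflicting commit didn't touch the same files before reapplying the change.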
Schema Evolution
Add, drop, rename, or reorder columns without rewriting data. Iceberg tracks column IDs internally, so existing files remain readable even after schema changes. This is a fundamental improvement over Hive tables, where schema changes often required rewriting the entire dataset.
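The column-ID mechanism can be shown with a toy example: data files store values keyed by immutable IDs, and the schema is just a mapping from current names to those IDs, so a rename touches only the mapping.

```python
# Toy illustration of ID-based column resolution -- not real Iceberg code.
# Data files store values keyed by immutable column IDs; the schema maps
# the *current* column names to those IDs.

data_file = {1: 42, 2: "click"}             # column-ID -> value, written long ago

schema_v1 = {"user_id": 1, "event": 2}       # original schema
schema_v2 = {"user_id": 1, "event_type": 2}  # column 2 renamed; file untouched

def read_column(file_row: dict, schema: dict, name: str):
    # Resolve name -> column ID -> stored value.
    return file_row[schema[name]]

print(read_column(data_file, schema_v1, "event"))       # 'click'
print(read_column(data_file, schema_v2, "event_type"))  # 'click', same file
```

Hive-style tables resolve columns by name or position, which is why a rename there can silently break or orphan existing data.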
Time Travel and Rollback
Query historical snapshots using AS OF syntax. Roll back a table to any previous snapshot in case of bad writes or accidental deletes.
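Resolving an AS OF timestamp amounts to picking the latest snapshot committed at or before the requested time. A toy sketch of that lookup (snapshot IDs and times are made up):

```python
from bisect import bisect_right

# (snapshot_id, commit_time_ms) pairs, sorted by commit time -- made-up data.
snapshots = [(101, 1000), (102, 2000), (103, 3000)]

def snapshot_as_of(ts_ms: int) -> int:
    """Latest snapshot committed at or before ts_ms (toy AS OF resolution)."""
    times = [t for _, t in snapshots]
    i = bisect_right(times, ts_ms)
    if i == 0:
        raise ValueError("no snapshot exists at or before that time")
    return snapshots[i - 1][0]

print(snapshot_as_of(2500))  # 102
```

Rollback is the same idea in reverse: the catalog pointer is moved back to a chosen snapshot, and later snapshots simply stop being current.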
Partition Evolution
Change the partitioning strategy of a table without rewriting existing data. Old files remain queryable under the old partition spec, while new files use the updated spec — completely transparent to query engines.
Row-Level Deletes and Updates
Iceberg supports both copy-on-write (COW) and merge-on-read (MOR) strategies for row-level operations, making it suitable for CDC workloads where individual rows must be updated or deleted.
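The trade-off between the two strategies can be contrasted with a toy model where a "file" is just a list of rows: COW pays at write time by rewriting the file, MOR pays at read time by merging in a delete file.

```python
# Toy contrast of COW vs. MOR delete strategies -- illustration only.

def cow_delete(data_file: list, doomed: set) -> list:
    """Copy-on-write: rewrite the whole file without the deleted rows."""
    return [row for row in data_file if row not in doomed]

def mor_delete(delete_file: set, doomed: set) -> set:
    """Merge-on-read: just append to a delete file; cheap at write time."""
    return delete_file | doomed

def mor_read(data_file: list, delete_file: set) -> list:
    """MOR readers merge data and delete files on the fly."""
    return [row for row in data_file if row not in delete_file]

rows = [1, 2, 3, 4]
deletes = mor_delete(set(), {2, 4})
# Same visible result, but the cost lands on different sides of the pipeline.
assert cow_delete(rows, {2, 4}) == mor_read(rows, deletes) == [1, 3]
```

This is why MOR tables need periodic compaction: accumulated delete files keep shifting work onto every read until they are merged away.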
Comparison: Iceberg vs. Hive Tables vs. Delta Lake
| Feature | Apache Iceberg | Hive Tables | Delta Lake |
| --- | --- | --- | --- |
| ACID Transactions | Yes (serializable) | Limited | Yes (serializable) |
| Schema Evolution | Full (add/drop/rename/reorder) | Limited (add/rename) | Full |
| Partition Evolution | Yes (without rewrite) | No | Partial |
| Time Travel | Yes | No | Yes |
| Hidden Partitioning | Yes | No | No |
| Multi-Engine Support | Excellent (open spec) | Yes | Good (primarily Spark) |
| Incremental Reads | Yes (snapshot diff) | No | Yes (via change feed) |
| Catalog Options | REST, Hive, Glue, Nessie, JDBC | Hive Metastore | Delta catalog |
Integrating Iceberg with RisingWave
RisingWave is a cloud-native streaming SQL database that natively integrates with Apache Iceberg. You can stream data into Iceberg tables from Kafka, Postgres CDC, MySQL CDC, and other sources — and in RisingWave v2.8+, you can also query Iceberg tables directly using SQL.
Streaming Data into Iceberg
First, create a source to ingest events:
```sql
CREATE SOURCE user_events (
    user_id BIGINT,
    event_type VARCHAR,
    payload JSONB,
    event_time TIMESTAMPTZ
)
WITH (
    connector = 'kafka',
    topic = 'user-events',
    properties.bootstrap.server = 'kafka:9092',
    scan.startup.mode = 'earliest'
) FORMAT PLAIN ENCODE JSON;
```
Then build a materialized view to aggregate or transform the data:
```sql
CREATE MATERIALIZED VIEW hourly_event_counts AS
SELECT
    user_id,
    event_type,
    window_start,
    window_end,
    COUNT(*) AS event_count
FROM TUMBLE(user_events, event_time, INTERVAL '1 HOUR')
GROUP BY user_id, event_type, window_start, window_end;
```
Finally, sink the results into an Iceberg table:
```sql
CREATE SINK events_to_iceberg AS
SELECT * FROM hourly_event_counts
WITH (
    connector = 'iceberg',
    type = 'upsert',
    -- upsert sinks require a primary key to match rows for update/delete
    primary_key = 'user_id,event_type,window_start',
    catalog.type = 'rest',
    catalog.uri = 'http://iceberg-catalog:8181',
    warehouse.path = 's3://my-data-lake/warehouse',
    s3.region = 'us-east-1',
    database.name = 'analytics',
    table.name = 'hourly_event_counts'
);
```
Querying Iceberg Tables Directly (v2.8+)
In RisingWave v2.8, you can register Iceberg tables as sources and query them with standard SQL, enabling lakehouse-style analytics alongside streaming workloads:
```sql
-- Register an existing Iceberg table as a source
CREATE SOURCE iceberg_orders
WITH (
    connector = 'iceberg',
    catalog.type = 'rest',
    catalog.uri = 'http://iceberg-catalog:8181',
    warehouse.path = 's3://my-data-lake/warehouse',
    database.name = 'commerce',
    table.name = 'orders'
);

-- Join streaming data with historical Iceberg data
SELECT
    s.user_id,
    s.event_count AS realtime_events,
    h.total_orders AS historical_orders
FROM hourly_event_counts s
JOIN iceberg_orders h ON s.user_id = h.user_id;
```
Iceberg Catalogs: Which One Should You Use?
| Catalog | Best For | Notes |
| --- | --- | --- |
| REST Catalog | Cloud-native deployments | Vendor-neutral; works with Tabular, Nessie, Polaris |
| AWS Glue | AWS deployments | Fully managed, integrates with Athena, EMR |
| Hive Metastore | Legacy Hadoop environments | Widely supported but requires HMS infrastructure |
| Nessie | Multi-table transactions, Git-like branching | Open-source; excellent for data versioning |
| JDBC | Simple setups | Good for testing; not recommended for production |
Best Practices for Data Engineers
Compaction — Iceberg tables accumulate small files over time, especially with streaming writes. Run periodic compaction jobs (using Spark or Flink) to merge small files into larger ones, improving query performance.
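The planning side of compaction is essentially bin-packing small files toward a target file size. A toy sketch of that grouping logic (the target of 128 MB here is an assumption, chosen only for the example):

```python
def plan_compaction(file_sizes: list, target: int) -> list:
    """Greedy bin-pack of small files into rewrite groups of ~`target` bytes.
    A toy planner -- real engines also consider partitions and delete files."""
    groups, current, total = [], [], 0
    for size in sorted(file_sizes):
        if current and total + size > target:
            groups.append(current)      # close the current rewrite group
            current, total = [], 0
        current.append(size)
        total += size
    if current:
        groups.append(current)
    return groups

# e.g. eight 16 MB streaming files fit one ~128 MB rewrite group
print(plan_compaction([16] * 8, target=128))
```

Each group would then be rewritten into a single larger file and committed as a new snapshot, so readers never observe a half-compacted table.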
Snapshot expiration — Old snapshots retain references to deleted data files. Run expireSnapshots on a schedule to reclaim storage.
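The retention logic can be sketched in a few lines: drop snapshots older than a cutoff, but always keep the most recent few, loosely mirroring the expire-older-than and retain-last knobs of Iceberg's expireSnapshots action (the dict shape below is a made-up stand-in):

```python
def expire_snapshots(snapshots: list, older_than: int, retain_last: int = 1) -> list:
    """Toy snapshot expiration: keep snapshots with ts >= older_than, but
    always retain the newest `retain_last` regardless of age."""
    by_time = sorted(snapshots, key=lambda s: s["ts"])
    keep = {s["id"] for s in by_time[-retain_last:]}          # newest N
    keep |= {s["id"] for s in by_time if s["ts"] >= older_than}
    return [s for s in by_time if s["id"] in keep]
```

Once a snapshot expires, data files referenced only by that snapshot become unreachable and can be deleted to reclaim storage.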
Partitioning strategy — Choose partition transforms that match your query patterns. For time-series data, partitioning by day or hour is common. Avoid high-cardinality partitions.
Primary key design — For upsert workloads (like CDC), define a clear primary key on your Iceberg table. This allows RisingWave's iceberg sink to efficiently handle updates and deletes.
FAQ
Q: Can I use Apache Iceberg without Spark? Yes. Iceberg is a fully open specification. You can read and write Iceberg tables using Flink, Trino, Presto, DuckDB, RisingWave, StarRocks, Snowflake, and many other engines. Spark is not required.
Q: Does Iceberg support streaming writes? Yes. Iceberg supports incremental appends, upserts, and deletes through streaming systems like RisingWave and Apache Flink. Each micro-batch commit creates a new snapshot, enabling downstream consumers to read incremental changes.
Q: What's the difference between copy-on-write and merge-on-read? Copy-on-write (COW) rewrites entire data files when rows are updated or deleted, producing clean files but at higher write cost. Merge-on-read (MOR) appends delete files alongside data files, reducing write amplification but requiring more work at read time. For streaming CDC workloads, MOR is usually preferred.
Q: How does Iceberg handle late-arriving data?
Iceberg itself imposes no ordering requirements on incoming data, so you can append late-arriving records at any time. Query engines can filter by event time using predicate pushdown. For time-windowed aggregations, use RisingWave's TUMBLE() or HOP() windows, which handle late arrivals within configurable bounds.
Q: Is Apache Iceberg free to use? Apache Iceberg is fully open-source under the Apache 2.0 license. There is no licensing cost. You pay only for the storage and compute you use.
Getting Started
Apache Iceberg combined with RisingWave gives you a complete, open-source streaming lakehouse: real-time ingestion, continuous SQL transformations, and durable storage in a vendor-neutral open format.
Ready to build your first Iceberg pipeline? Start with the RisingWave documentation for a hands-on quickstart, or join the community on Slack to ask questions and share what you're building.

