The Iceberg REST Catalog specification provides a vendor-neutral HTTP API for managing Iceberg table metadata. RisingWave integrates with any compliant REST catalog — including the open-source Iceberg REST server, Project Nessie, Tabular, AWS Glue (via adapter), and Polaris — using the catalog.type = 'rest' parameter in CREATE SINK.
What Is the Iceberg REST Catalog?
Before REST catalogs, each Iceberg engine implemented its own catalog drivers: a Hive metastore driver, a Glue driver, a JDBC driver. Each had slightly different behaviors, and adding a new engine meant writing a new driver for every catalog type.
The Iceberg REST Catalog specification (part of the Apache Iceberg project) defines a standardized HTTP API for catalog operations:
- `GET /v1/namespaces` — list namespaces
- `POST /v1/namespaces/{namespace}/tables` — create a table
- `GET /v1/namespaces/{namespace}/tables/{table}` — load table metadata
- `POST /v1/namespaces/{namespace}/tables/{table}/metrics` — report metrics
- `POST /v1/transactions/commit` — atomic multi-table commits
Any engine that implements the REST client (RisingWave, Trino, Spark, Flink) can work with any compliant server. This is the future of Iceberg catalog interoperability.
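The operation-to-endpoint mapping above is the whole contract. As a sketch (the base URL and names are illustrative, and real clients also handle a server-assigned URL prefix from `/v1/config`), an engine-side client is essentially a router from logical catalog operations to these HTTP routes:

```python
from urllib.parse import quote

def route(op: str, base: str, namespace: str = "", table: str = "") -> tuple[str, str]:
    """Return the (HTTP method, URL) pair for a catalog operation per the REST spec."""
    ns = quote(namespace)
    tbl = quote(table)
    routes = {
        "list_namespaces": ("GET", f"{base}/v1/namespaces"),
        "create_table":    ("POST", f"{base}/v1/namespaces/{ns}/tables"),
        "load_table":      ("GET", f"{base}/v1/namespaces/{ns}/tables/{tbl}"),
        "report_metrics":  ("POST", f"{base}/v1/namespaces/{ns}/tables/{tbl}/metrics"),
    }
    return routes[op]

method, url = route("load_table", "http://localhost:8181", "iot", "hourly_temperature")
```

Because the routing logic is this uniform, adding REST support to a new engine is one client implementation instead of one driver per catalog backend.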
Catalog Options Comparison
| Catalog | Open Source | Multi-Engine | Cloud Managed | Auth Support |
|---|---|---|---|---|
| Iceberg REST server | Yes | Yes | No | OAuth2, basic |
| Project Nessie | Yes | Yes | No | Bearer token |
| AWS Glue (via REST adapter) | Partial | Yes | Yes | IAM/SigV4 |
| Tabular | No | Yes | Yes | OAuth2 |
| Polaris (Snowflake) | Yes | Yes | Yes | OAuth2 |
| Gravitino (Apache) | Yes | Yes | No | OAuth2, basic |
Deploying the Open-Source REST Catalog
For local development and self-hosted production, use the official Iceberg REST catalog Docker image:
# docker-compose.yml
services:
  iceberg-catalog:
    image: tabulario/iceberg-rest:latest
    environment:
      AWS_ACCESS_KEY_ID: ${AWS_ACCESS_KEY_ID}
      AWS_SECRET_ACCESS_KEY: ${AWS_SECRET_ACCESS_KEY}
      CATALOG_WAREHOUSE: s3://my-bucket/warehouse
      CATALOG_IO__IMPL: org.apache.iceberg.aws.s3.S3FileIO
      CATALOG_S3_ENDPOINT: https://s3.us-east-1.amazonaws.com
    ports:
      - "8181:8181"
Once running, verify with:
curl http://localhost:8181/v1/config
# Returns: {"defaults":{},"overrides":{}}
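The `/v1/config` response is more than a health check: per the spec, clients start from the server's `defaults`, layer their own configured properties on top, and finally apply the server's `overrides`, which always win. A minimal sketch of that merge order (function name is mine, not from any client library):

```python
# Merge order defined by the REST spec's /v1/config response:
# server defaults < client-configured properties < server overrides.
def effective_config(client_props: dict, config_response: dict) -> dict:
    merged = dict(config_response.get("defaults", {}))
    merged.update(client_props)
    merged.update(config_response.get("overrides", {}))
    return merged

resp = {"defaults": {"warehouse": "s3://default-wh"}, "overrides": {"clients": "4"}}
cfg = effective_config({"warehouse": "s3://my-wh", "clients": "8"}, resp)
```

Here the client's warehouse choice survives, but the server's `clients` override replaces the client's value.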
Configuring RisingWave Sinks
The most common RisingWave integration pattern: create a materialized view, then sink to Iceberg via REST catalog.
-- Aggregate IoT data from Kafka
CREATE SOURCE temperature_events (
    device_id VARCHAR,
    location VARCHAR,
    celsius DOUBLE PRECISION,
    recorded_at TIMESTAMPTZ
)
WITH (
    connector = 'kafka',
    topic = 'iot.temperature',
    properties.bootstrap.server = 'kafka:9092',
    scan.startup.mode = 'earliest'
)
FORMAT PLAIN ENCODE JSON;
CREATE MATERIALIZED VIEW hourly_temperature AS
SELECT
    device_id,
    location,
    window_start,
    window_end,
    AVG(celsius) AS avg_celsius,
    MIN(celsius) AS min_celsius,
    MAX(celsius) AS max_celsius,
    COUNT(*) AS sample_count
FROM TUMBLE(temperature_events, recorded_at, INTERVAL '1 HOUR')
GROUP BY device_id, location, window_start, window_end;
-- Sink to Iceberg via REST catalog
CREATE SINK temperature_sink AS
SELECT * FROM hourly_temperature
WITH (
    connector = 'iceberg',
    type = 'append-only',
    -- the aggregated MV emits updates, so append-only output must be forced
    force_append_only = 'true',
    catalog.type = 'rest',
    catalog.uri = 'http://iceberg-catalog:8181',
    warehouse.path = 's3://my-lakehouse/warehouse',
    s3.region = 'us-east-1',
    database.name = 'iot',
    table.name = 'hourly_temperature'
);
RisingWave calls the catalog's POST /v1/namespaces/iot/tables endpoint to create the table if it doesn't exist (the table name goes in the request body, not the URL), then uses the catalog to commit each snapshot.
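For a concrete picture of that create-table call, here is a hedged sketch of the JSON body an engine POSTs to `/v1/namespaces/iot/tables`. The field IDs, the trimmed field list, and the property value are illustrative, not what RisingWave literally sends:

```python
import json

# Illustrative create-table request body for the REST catalog spec.
# Field IDs and properties are example values only.
create_request = {
    "name": "hourly_temperature",
    "schema": {
        "type": "struct",
        "schema-id": 0,
        "fields": [
            {"id": 1, "name": "device_id", "required": False, "type": "string"},
            {"id": 2, "name": "avg_celsius", "required": False, "type": "double"},
        ],
    },
    "properties": {"write.format.default": "parquet"},
}

body = json.dumps(create_request)
```

The server responds with the full table metadata (including the metadata file location), which the client caches for subsequent commits.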
Authentication Configuration
Most production catalogs require authentication. RisingWave supports OAuth2 bearer tokens and basic authentication for REST catalog connections:
-- With OAuth2 bearer token (e.g., Tabular, Polaris)
CREATE SINK secure_sink AS
SELECT * FROM my_mv
WITH (
    connector = 'iceberg',
    type = 'upsert',
    primary_key = 'id',
    catalog.type = 'rest',
    catalog.uri = 'https://catalog.tabular.io/ws/my-workspace',
    catalog.credential = 'my-oauth2-token',
    warehouse.path = 's3://my-bucket/warehouse',
    s3.region = 'us-east-1',
    database.name = 'production',
    table.name = 'events'
);
For AWS Glue via the REST catalog adapter, configure IAM-based authentication through environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY) rather than inline credentials.
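Under the hood, OAuth2-based catalogs typically exchange a configured credential of the form `<client-id>:<client-secret>` for a bearer token via a client-credentials grant at the spec's `/v1/oauth2/tokens` endpoint. A sketch of the form body that exchange sends (the `catalog` scope is an assumption; servers define their own scopes):

```python
from urllib.parse import urlencode

# Build the form-encoded body for the REST spec's client_credentials token
# exchange. The "catalog" scope is illustrative; check your server's docs.
def oauth_token_form(credential: str, scope: str = "catalog") -> str:
    client_id, _, client_secret = credential.partition(":")
    return urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": scope,
    })

form = oauth_token_form("my-client-id:my-client-secret")
```

The returned bearer token is then attached as an `Authorization` header on every subsequent catalog request.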
Namespace Management
Iceberg REST catalogs organize tables into namespaces (equivalent to databases/schemas). Create and manage namespaces via the REST API before creating tables:
# Create a namespace
curl -X POST http://iceberg-catalog:8181/v1/namespaces \
  -H "Content-Type: application/json" \
  -d '{"namespace": ["production"], "properties": {"owner": "data-team"}}'

# List tables in a namespace
curl http://iceberg-catalog:8181/v1/namespaces/production/tables
RisingWave's database.name maps to the top-level namespace. Multi-level namespaces (e.g., production.analytics) require the catalog to support nested namespaces.
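When a catalog does support nested namespaces, the REST spec addresses them in URLs by joining the levels with the 0x1F "unit separator" byte and percent-encoding the result. A small sketch of that encoding (helper name is mine):

```python
from urllib.parse import quote

# The REST spec joins multi-level namespace parts with the 0x1F unit
# separator before percent-encoding them into the URL path.
NAMESPACE_SEPARATOR = "\x1f"

def namespace_path(levels: list[str]) -> str:
    """Build the URL path segment for a (possibly nested) namespace."""
    return quote(NAMESPACE_SEPARATOR.join(levels), safe="")

namespace_path(["production", "analytics"])
```

So `production.analytics` becomes the path segment `production%1Fanalytics`, which is why a server must explicitly support nesting for such requests to resolve.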
Using Nessie for Git-like Catalog Branching
Project Nessie adds Git-like branching to the Iceberg catalog — you can create a feature branch of your catalog, test schema changes in isolation, and merge back to main. RisingWave works with Nessie via its REST-compatible API:
-- Write to Nessie's Iceberg REST endpoint (default branch)
CREATE SINK nessie_sink AS
SELECT * FROM my_mv
WITH (
    connector = 'iceberg',
    type = 'append-only',
    catalog.type = 'rest',
    catalog.uri = 'http://nessie:19120/iceberg',
    warehouse.path = 's3://my-bucket/warehouse',
    s3.region = 'us-east-1',
    database.name = 'analytics',
    table.name = 'events'
);
Nessie's branching model is particularly valuable for testing schema migrations: run ALTER TABLE on a branch, validate with Trino, then merge to main without affecting the production stream.
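To target a branch other than the default, Nessie lets the reference be scoped in the catalog URI path itself (verify the exact convention against your Nessie version's docs; the port and path here mirror the sink example above):

```python
# Build a branch-scoped base URI for Nessie's Iceberg REST endpoint.
# Assumption: the reference name is appended to the /iceberg path; confirm
# against your Nessie release before relying on this.
def nessie_catalog_uri(host: str, branch: str = "main") -> str:
    return f"http://{host}:19120/iceberg/{branch}"

uri = nessie_catalog_uri("nessie", "schema-migration-test")
```

Pointing a test sink's `catalog.uri` at such a branch-scoped URI is what keeps experimental writes off `main` until you merge.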
Catalog Health and Observability
Monitor catalog operations with RisingWave's system tables:
-- List Iceberg sinks and inspect their definitions
SELECT name, connector, sink_type, definition
FROM rw_catalog.rw_sinks
WHERE connector = 'iceberg';
On the catalog side, monitor the metrics endpoint:
# Nessie metrics (Prometheus compatible)
curl http://nessie:19120/q/metrics | grep iceberg_catalog
# Note: the spec's per-table /metrics endpoint accepts reports POSTed by
# engines; managed catalogs such as Tabular surface the collected metrics
# in their own dashboards rather than via a GET on that endpoint.
FAQ
Q: What is the difference between catalog.type = 'rest' and catalog.type = 'storage' in RisingWave?
A: The REST catalog uses an external HTTP service for metadata management and supports multi-writer concurrency control. The storage catalog stores metadata files alongside data files on S3 and is simpler but does not support multi-writer safety.
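The multi-writer safety mentioned here rests on optimistic concurrency: each commit ships a list of "requirements" that the server validates atomically before applying the update. `assert-ref-snapshot-id` is a real requirement type from the spec; the server-side check below is a deliberate simplification:

```python
# Simplified server-side validation of commit requirements.
# If the branch has moved since the client read it, the commit is rejected
# and the client must refresh metadata and retry.
def check_requirements(current_snapshot_id: int, requirements: list[dict]) -> bool:
    for req in requirements:
        if req["type"] == "assert-ref-snapshot-id":
            if req["snapshot-id"] != current_snapshot_id:
                raise RuntimeError("commit conflict: ref moved, client must retry")
    return True

check_requirements(42, [{"type": "assert-ref-snapshot-id", "snapshot-id": 42}])
```

A storage catalog has no central place to run this check, which is exactly why it cannot offer the same multi-writer guarantees.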
Q: Can I use the same REST catalog for both RisingWave and Spark?
A: Yes. This is the primary value proposition of the REST catalog spec. Both engines implement the same client-side catalog protocol; the server handles concurrency.
Q: How do I configure TLS for the REST catalog connection?
A: Specify an HTTPS catalog.uri. RisingWave respects the system trust store for certificate validation. For self-signed certificates in development, configure the JVM trust store or use a reverse proxy with a valid certificate.
Q: Does the REST catalog support table-level access control?
A: It depends on the server implementation. Polaris and Tabular offer fine-grained RBAC at the table level via OAuth2 scopes. The open-source reference implementation has basic namespace-level access control.
Q: What happens to the REST catalog if it goes down while RisingWave is writing?
A: RisingWave buffers writes and retries catalog commits with exponential backoff. Data files are already on S3 when the commit is attempted, so no data is lost. Catalog downtime increases write latency but does not lose data.
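The retry behavior described in that last answer can be sketched as a backoff loop. This is an illustration of the pattern, not RisingWave's actual implementation; note that only the lightweight catalog commit is retried, since the data files are already durable on S3:

```python
import random
import time

# Illustrative commit retry with exponential backoff and jitter.
def commit_with_backoff(commit, max_attempts: int = 5, base_delay: float = 0.5):
    for attempt in range(max_attempts):
        try:
            return commit()  # the cheap metadata commit, not a data re-upload
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise  # catalog still unreachable after all attempts
            # Double the wait each attempt; jitter avoids thundering herds.
            time.sleep(base_delay * 2 ** attempt + random.uniform(0, 0.1))
```

Because failed attempts only delay the snapshot becoming visible, catalog downtime shows up as commit latency, never as missing rows.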
Get Started
Connect RisingWave to your Iceberg REST catalog today: deploy a compliant catalog server, point a sink at it with catalog.type = 'rest', and your streaming results become Iceberg tables that any engine can query.