Querying Apache Iceberg with Trino, DuckDB, and Spark

Apache Iceberg tables can be queried by multiple engines at the same time: Trino for interactive SQL, DuckDB for local analytics, and Spark for large-scale processing. Because Iceberg is an open table format, every engine reads the same data files and metadata, which is its key advantage over proprietary warehouse formats.

Engine Comparison

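| Engine    | Best For                    | Latency                 | Scale            | Setup   |
|-----------|-----------------------------|-------------------------|------------------|---------|
| Trino     | Interactive SQL, dashboards | Seconds                 | Large clusters   | Medium  |
| DuckDB    | Local analytics, notebooks  | Sub-second (small data) | Single machine   | Easy    |
| Spark     | Large-scale ETL, ML         | Minutes                 | Massive clusters | Complex |
| Snowflake | Managed analytics           | Seconds                 | Elastic          | Easy    |
| BigQuery  | GCP-native analytics        | Seconds                 | Serverless       | Easy    |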

Trino + Iceberg

-- The Iceberg catalog is configured in a Trino properties file, not in SQL.
-- trino/catalog/iceberg.properties:
--   connector.name=iceberg
--   iceberg.catalog.type=rest
--   iceberg.rest-catalog.uri=http://catalog:8181

-- Daily event count and average duration for events after 2026-03-01
SELECT event_date, COUNT(*) AS events, AVG(duration) AS avg_duration
FROM iceberg.analytics.events
WHERE event_date > DATE '2026-03-01'
GROUP BY event_date
ORDER BY event_date;
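
The catalog settings listed in the comments above belong in a properties file rather than in a SQL session. As a minimal sketch, the Python helper below renders that file's text from the three properties shown. The function name `render_catalog_properties` is hypothetical and only produces a string; where the file must be placed depends on your Trino deployment.

```python
def render_catalog_properties(rest_uri: str) -> str:
    """Return the text of a Trino catalog properties file for an Iceberg REST catalog."""
    # The three keys mirror the settings shown in the comments of the Trino example.
    properties = {
        "connector.name": "iceberg",
        "iceberg.catalog.type": "rest",
        "iceberg.rest-catalog.uri": rest_uri,
    }
    # Java-style properties files use one key=value pair per line.
    return "\n".join(f"{key}={value}" for key, value in properties.items()) + "\n"


if __name__ == "__main__":
    # Prints the same three lines that appear in the comments above.
    print(render_catalog_properties("http://catalog:8181"), end="")
```

Once the file is in place, the catalog name used in queries (`iceberg` in `iceberg.analytics.events`) comes from the file name, `iceberg.properties`.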

DuckDB + Iceberg

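-- DuckDB reads Iceberg directly from S3 through the iceberg extension.
-- Reading s3:// paths also requires the httpfs extension and S3 credentials.
INSTALL iceberg; LOAD iceberg;
INSTALL httpfs; LOAD httpfs;
SELECT * FROM iceberg_scan('s3://lakehouse/warehouse/analytics/events');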
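
For ad-hoc work it is often preferable to read a few columns and a bounded number of rows rather than `SELECT *`. The sketch below only builds the query text for `iceberg_scan`; it does not connect to DuckDB. The helper name `build_iceberg_scan_query` is hypothetical, and the column names are assumed to match those used in the Trino example.

```python
from typing import Sequence


def build_iceberg_scan_query(table_path: str, columns: Sequence[str], limit: int) -> str:
    """Build a DuckDB query that reads only the given columns from an Iceberg table."""
    if not columns:
        raise ValueError("at least one column is required")
    # Double any embedded quotes so identifiers and the path literal stay well-formed.
    column_list = ", ".join('"' + c.replace('"', '""') + '"' for c in columns)
    path_literal = "'" + table_path.replace("'", "''") + "'"
    return f"SELECT {column_list} FROM iceberg_scan({path_literal}) LIMIT {int(limit)};"


if __name__ == "__main__":
    print(
        build_iceberg_scan_query(
            "s3://lakehouse/warehouse/analytics/events",
            ["event_date", "duration"],  # assumed columns, taken from the Trino query
            10,
        )
    )
```

The resulting string can be pasted into a DuckDB session after the extensions are loaded.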

The Multi-Engine Pattern

RisingWave ──→ Iceberg ←── Trino (dashboards)
                       ←── DuckDB (ad-hoc analysis)
                       ←── Spark (ML training)
                       ←── Snowflake (BI reporting)

Write once (via RisingWave), read from any engine.

Frequently Asked Questions

Which query engine should I use with Iceberg?

It depends on the workload: Trino for interactive SQL and dashboards, DuckDB for local or notebook analysis, and Spark for large-scale ETL and ML. Snowflake or BigQuery make sense if you are already on those platforms.

Can I query the same Iceberg table from multiple engines?

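Yes. That is Iceberg's core value. Multiple engines can read the same table at the same time with snapshot isolation: each query reads one committed snapshot of the table, so it sees a consistent view even while a writer commits new data. No data copying or format conversion is needed.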
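
To make the snapshot-isolation idea concrete, here is a toy model in Python. It is not Iceberg's API or metadata layout: `ToyTable` and the file names are hypothetical, and the model only illustrates that a commit publishes a new immutable snapshot while readers keep using the one they pinned.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ToyTable:
    """Toy stand-in for an Iceberg table: an append-only list of immutable snapshots."""

    # Each snapshot is the full tuple of data-file names visible at that point.
    snapshots: List[Tuple[str, ...]] = field(default_factory=lambda: [()])

    def current_snapshot_id(self) -> int:
        return len(self.snapshots) - 1

    def commit_append(self, new_files: List[str]) -> int:
        """Writer: publish a new snapshot; earlier snapshots are never modified."""
        latest = self.snapshots[-1]
        self.snapshots.append(latest + tuple(new_files))
        return self.current_snapshot_id()

    def scan(self, snapshot_id: int) -> Tuple[str, ...]:
        """Reader: list the files visible in the snapshot it pinned."""
        return self.snapshots[snapshot_id]


table = ToyTable()
table.commit_append(["events-001.parquet"])

# Two hypothetical engines pin the snapshot that is current when their query starts.
trino_pin = table.current_snapshot_id()
duckdb_pin = table.current_snapshot_id()

# A writer commits more data while those queries are still running.
table.commit_append(["events-002.parquet"])

# The running queries still see only the snapshot they pinned.
print(table.scan(trino_pin))
# A query started after the commit sees both files.
print(table.scan(table.current_snapshot_id()))
```

Because old snapshots are never rewritten in this model, readers need no locks and no private copy of the data, which is the property the answer above relies on.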
