Querying Apache Iceberg with Trino, DuckDB, and Spark
Apache Iceberg tables can be queried by multiple engines at the same time: Trino for interactive SQL, DuckDB for local analytics, and Spark for large-scale processing. Because the table format is an open specification rather than something owned by one engine, this multi-engine support is Iceberg's key advantage over proprietary warehouse formats.
Engine Comparison
| Engine | Best For | Latency | Scale | Setup |
|---|---|---|---|---|
| Trino | Interactive SQL, dashboards | Seconds | Large clusters | Medium |
| DuckDB | Local analytics, notebooks | Sub-second (small data) | Single machine | Easy |
| Spark | Large-scale ETL, ML | Minutes | Massive clusters | Complex |
| Snowflake | Managed analytics | Seconds | Elastic | Easy |
| BigQuery | GCP-native analytics | Seconds | Serverless | Easy |
Trino + Iceberg
-- Configure Iceberg catalog in Trino
-- trino/catalog/iceberg.properties:
-- connector.name=iceberg
-- iceberg.catalog.type=rest
-- iceberg.rest-catalog.uri=http://catalog:8181
SELECT event_date, COUNT(*) as events, AVG(duration) as avg_duration
FROM iceberg.analytics.events
WHERE event_date > DATE '2026-03-01'
GROUP BY event_date ORDER BY event_date;
DuckDB + Iceberg
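-- DuckDB reads Iceberg tables directly from S3
INSTALL iceberg; LOAD iceberg;
-- httpfs provides s3:// access; S3 credentials must be configured separately (e.g. via CREATE SECRET)
INSTALL httpfs; LOAD httpfs;
SELECT * FROM iceberg_scan('s3://lakehouse/warehouse/analytics/events');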
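For anything beyond a quick look, it usually helps to filter and aggregate rather than select every row. The sketch below repeats the Trino aggregation in DuckDB against the same table path. It assumes the `event_date` and `duration` columns from the Trino example exist in this table, and that the `iceberg` extension and S3 access are already set up. Depending on the DuckDB version and how the table was written, `iceberg_scan` may need the path to a specific metadata JSON file instead of the table root.

```sql
-- Same daily aggregation as the Trino query, run locally in DuckDB.
-- Assumes event_date and duration columns, as in the Trino example.
SELECT event_date, COUNT(*) AS events, AVG(duration) AS avg_duration
FROM iceberg_scan('s3://lakehouse/warehouse/analytics/events')
WHERE event_date > DATE '2026-03-01'
GROUP BY event_date
ORDER BY event_date;
```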
The Multi-Engine Pattern
RisingWave ──→ Iceberg ←── Trino (dashboards)
                       ←── DuckDB (ad-hoc analysis)
                       ←── Spark (ML training)
                       ←── Snowflake (BI reporting)
Write once (via RisingWave), read from any engine.
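The Spark side of the diagram can be sketched the same way as the Trino example. This assumes the Iceberg Spark runtime is on the classpath and that Spark points at the same REST catalog. The catalog name `lakehouse` is a placeholder chosen for illustration, not something defined elsewhere in this article.

```sql
-- Spark session configuration (e.g. in spark-defaults.conf), shown as comments.
-- "lakehouse" is a hypothetical catalog name.
-- spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions
-- spark.sql.catalog.lakehouse=org.apache.iceberg.spark.SparkCatalog
-- spark.sql.catalog.lakehouse.type=rest
-- spark.sql.catalog.lakehouse.uri=http://catalog:8181

-- Read the same table that the Trino and DuckDB examples query
SELECT event_date, COUNT(*) AS events, AVG(duration) AS avg_duration
FROM lakehouse.analytics.events
WHERE event_date > DATE '2026-03-01'
GROUP BY event_date
ORDER BY event_date;
```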
Frequently Asked Questions
Which query engine should I use with Iceberg?
Use Trino for interactive SQL and dashboards, DuckDB for local or notebook analysis, and Spark for large-scale ETL and ML. Snowflake or BigQuery make sense if you are already on those platforms.
Can I query the same Iceberg table from multiple engines?
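Yes. That's Iceberg's core value. Multiple engines can read the same table at the same time, and snapshot isolation means each query sees one consistent, committed snapshot even while a writer is adding new data. No data copying or format conversion is needed.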
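To make snapshot isolation concrete, here is a minimal sketch in Trino. The `$snapshots` metadata table lists the table's committed snapshots, and a time-travel query pins a read to the table state at a given moment. The timestamp is an arbitrary example value and must not be earlier than the table's first snapshot.

```sql
-- List committed snapshots of the table (Trino Iceberg connector metadata table)
SELECT snapshot_id, committed_at, operation
FROM iceberg.analytics."events$snapshots"
ORDER BY committed_at DESC;

-- Read the table as of a point in time; commits made after it are not visible here.
-- The timestamp is illustrative only.
SELECT COUNT(*) AS events
FROM iceberg.analytics.events
FOR TIMESTAMP AS OF TIMESTAMP '2026-03-15 00:00:00 UTC';
```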

