Open Table Formats: Why Open Data Formats Are Winning
Stream processing in 2026 is defined by four trends: SQL-first interfaces, disaggregated state on S3, AI agent integration, and lakehouse convergence. These trends are reshaping how organizations build and operate real-time data infrastructure.
Trend 1: SQL-First Streaming
SQL is replacing Java as the primary interface for stream processing. RisingWave, Flink SQL, and Materialize are gaining adoption over Kafka Streams and the Flink DataStream API. Most streaming logic (filters, joins, windowed aggregations) is expressible in SQL; custom Java operators are needed only for edge cases.
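To make the contrast concrete, here is a minimal Python sketch of the kind of hand-written logic that a single windowed GROUP BY in streaming SQL would replace. The event shape, the 60-second window, and the function name are assumptions for illustration; exact SQL window syntax varies by engine and is not shown.

```python
from collections import defaultdict

WINDOW_SECONDS = 60  # assumed window size, for illustration only


def tumbling_counts(events, window_seconds=WINDOW_SECONDS):
    """Count events per (window_start, key) over fixed, non-overlapping windows.

    `events` is an iterable of (timestamp_seconds, key) pairs (hypothetical shape).
    """
    counts = defaultdict(int)
    for ts, key in events:
        # Align each timestamp to the start of the window that contains it.
        window_start = ts - (ts % window_seconds)
        counts[(window_start, key)] += 1
    return dict(counts)


events = [
    (0, "checkout"),
    (15, "checkout"),
    (59, "search"),
    (60, "checkout"),
    (125, "search"),
]
result = tumbling_counts(events)
```

Even this toy version ignores late events, state cleanup, and fault tolerance, which is part of why a declarative interface that handles them is attractive.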
Trend 2: Disaggregated State
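Storing streaming state in object storage such as S3, instead of in local RocksDB, is becoming the standard. RisingWave (Hummock) and Flink 2.0 (ForSt) both adopt this pattern. Because the durable copy of the state lives outside the compute nodes, a replacement node can attach to existing state rather than rebuild it locally, which enables faster recovery (seconds vs minutes), elastic scaling, and lower storage costs.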
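The recovery argument can be illustrated with a toy model. The sketch below is not how Hummock or ForSt work internally; `ObjectStore` and `Worker` are hypothetical names, and an in-memory dictionary stands in for S3. It only shows the core idea: local memory is a cache, so a fresh worker recovers by reading from the shared store.

```python
class ObjectStore:
    """In-memory stand-in for a durable object store such as S3 (hypothetical)."""

    def __init__(self):
        self._objects = {}

    def put(self, key, value):
        self._objects[key] = value

    def get(self, key, default=None):
        return self._objects.get(key, default)


class Worker:
    """Compute node whose local memory is only a cache over the remote store."""

    def __init__(self, store):
        self.store = store
        self.cache = {}

    def read(self, key):
        if key not in self.cache:
            # Lazy load on a cache miss; nothing is restored up front.
            self.cache[key] = self.store.get(key, 0)
        return self.cache[key]

    def increment(self, key):
        value = self.read(key) + 1
        self.cache[key] = value
        # Write-through for simplicity; a real system would batch its writes.
        self.store.put(key, value)
        return value


store = ObjectStore()
first = Worker(store)
for _ in range(3):
    first.increment("clicks")

# Simulate a crash: the first worker and its cache are discarded.
replacement = Worker(store)
recovered = replacement.read("clicks")
```

The replacement worker starts with an empty cache and fetches only the keys it touches, which is the intuition behind recovery times that do not grow with total state size.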
Trend 3: AI Agent Integration
IBM's $11B acquisition of Confluent signals that real-time data is becoming the engine of enterprise AI. Streaming databases provide the context layer for AI agents: materialized views that serve always-current data over the PostgreSQL protocol.
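As a rough illustration of "always-current" context, the sketch below models a materialized view that is updated incrementally as each event arrives, so a lookup never has to rescan history. `OrderTotalsView` and the event fields are hypothetical; a real agent would instead issue a SELECT against the streaming database over the PostgreSQL protocol.

```python
class OrderTotalsView:
    """Toy materialized view: total spend per customer, updated per event."""

    def __init__(self):
        self._totals = {}

    def apply(self, event):
        # Incremental maintenance: adjust one row instead of recomputing all.
        customer = event["customer"]
        self._totals[customer] = self._totals.get(customer, 0) + event["amount"]

    def lookup(self, customer):
        # What an agent would read as context just before acting.
        return self._totals.get(customer, 0)


view = OrderTotalsView()
view.apply({"customer": "acme", "amount": 120})
view.apply({"customer": "acme", "amount": 30})
before = view.lookup("acme")

# A new order arrives; the next lookup reflects it immediately.
view.apply({"customer": "acme", "amount": 50})
after = view.lookup("acme")
```

The cost of each lookup stays constant no matter how many events have been processed, which is what makes this pattern usable inside an agent's request loop.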
Trend 4: Lakehouse Convergence
Streaming and batch are converging on the lakehouse. Streaming databases sink to Iceberg for historical analytics; batch tools read from Iceberg for reporting and ML. One open format, multiple engines.
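The "one format, multiple engines" idea rests on readers seeing consistent snapshots while writers keep committing. The sketch below is a simplified model of that behavior, not Iceberg's actual metadata layout or API; `SnapshotTable` and its methods are hypothetical names.

```python
class SnapshotTable:
    """Toy snapshot-based table: each commit creates a new immutable snapshot."""

    def __init__(self):
        self._snapshots = [()]  # snapshot 0 is the empty table

    def commit(self, rows):
        """Append rows as a new snapshot and return its id."""
        self._snapshots.append(self._snapshots[-1] + tuple(rows))
        return len(self._snapshots) - 1

    def current_snapshot_id(self):
        return len(self._snapshots) - 1

    def scan(self, snapshot_id=None):
        """Read the latest snapshot, or a pinned one if an id is given."""
        if snapshot_id is None:
            snapshot_id = self.current_snapshot_id()
        return list(self._snapshots[snapshot_id])


table = SnapshotTable()

# A streaming sink commits a first micro-batch.
first_id = table.commit([("acme", 120), ("globex", 75)])

# A batch job pins the snapshot it started with.
pinned = first_id

# The streaming sink keeps writing while the batch job runs.
table.commit([("acme", 30)])

batch_rows = table.scan(pinned)  # stable view for the batch job
latest_rows = table.scan()       # fresh view for any other engine
```

Because snapshots are immutable, the batch job's result does not change under it, while other engines reading the same table see the newest data.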
Predictions for 2027
- Streaming SQL becomes the default for new data engineering projects
- Disaggregated state replaces local RocksDB entirely in cloud deployments
- Every AI agent framework includes streaming database connectors
- Apache Iceberg becomes the universal table format
Frequently Asked Questions
Is stream processing replacing batch processing?
Not entirely, but the boundary is shifting. Streaming is becoming the default for new data pipelines, with batch reserved for historical analysis, ML training, and ad-hoc queries. The lakehouse pattern (streaming + Iceberg) enables both from a single architecture.

