Highlights of RisingWave v2.6

We’re excited to introduce RisingWave v2.6, our latest release packed with enhancements across performance, reliability, and integration. In this article, we’ll highlight some of the key updates, including native vector data support, broader Apache Iceberg integration, improved PostgreSQL CDC handling, new sink connectors with automatic schema updates, and a premium memory-only mode for ultra-low-latency workloads.

These are just a few of the many enhancements included in this release. For the complete list of v2.6 updates, see the full release notes.

Store and query vector data for similarity-based applications

RisingWave now supports storing and querying high-dimensional vector embeddings with the new vector(n) data type, enabling you to build real-time similarity search applications like recommendation engines or semantic search directly within the database. You can perform these searches using new vector functions and the <-> distance operator. For example, you can find the five items most similar to a given vector with a query like this:

SELECT * FROM items ORDER BY embedding <-> '[3,1,2]' LIMIT 5;

To accelerate these queries, we have also introduced experimental support for HNSW (Hierarchical Navigable Small World) indexes on vector columns. The vector(n) type ensures data integrity by enforcing a fixed length and disallowing invalid values. For users familiar with PostgreSQL, this feature set is designed for compatibility with the popular pgvector extension, simplifying integration and migration.
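As a slightly fuller sketch of the query above, the following creates the table, loads a few rows, and returns each row's distance to the query vector. The `id` column and the sample values are illustrative assumptions, not part of the release. `FLUSH` is included only to make sure the inserted rows are visible before the query runs.

```sql
-- Hypothetical table: `id` and the sample rows are for illustration only.
CREATE TABLE items (
    id INT PRIMARY KEY,
    embedding vector(3)   -- fixed length of 3, enforced by the type
);

INSERT INTO items VALUES
    (1, '[3,1,2]'),
    (2, '[1,2,3]'),
    (3, '[4,5,6]');

-- Make the inserted rows visible to the following query.
FLUSH;

-- Return the nearest rows together with their distance to the query vector.
SELECT id, embedding <-> '[3,1,2]' AS distance
FROM items
ORDER BY distance
LIMIT 5;
```

If `<->` follows pgvector's convention of Euclidean (L2) distance, row 1 would come first with a distance of 0, followed by row 2 and then row 3.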

To explore this feature in more depth, see the following guides:

Vector Data Type — Learn how to define and manage vector columns in RisingWave.
Vector Indexes — Understand how to create and use vector indexes for efficient similarity search.
CREATE INDEX Command — Review the syntax and options for creating indexes, including vector-specific parameters.

Enhance Iceberg integration and features

This release brings several major enhancements to our Iceberg integration, covering table management, performance optimization, and ecosystem connectivity:

- Copy-on-Write (CoW) mode: RisingWave now supports Copy-on-Write (CoW) mode for Apache Iceberg sinks, a more robust way to handle high-frequency data updates. RisingWave maintains separate branches for data ingestion and for the compacted main view, so external applications always read a stable, clean view of the data, free from intermediate delete files.
- VACUUM for Iceberg tables and sinks: We’ve introduced the VACUUM SQL command to manually manage Iceberg storage. VACUUM expires outdated snapshots and reclaims storage space, while VACUUM FULL also compacts data files before expiring snapshots for optimal performance.
- Broader ecosystem integration: RisingWave now makes it easier to connect with Iceberg across multiple platforms:
  - Sink to Iceberg tables and query with Databricks
  - Sink to Iceberg tables and query with Snowflake
- Iceberg REST catalog with Lakekeeper: RisingWave can now work seamlessly with Lakekeeper, a self-hosted Iceberg REST catalog. This setup simplifies setting up a REST catalog and ingesting data into it. Lakekeeper can be deployed via Docker for quick local testing or via Kubernetes (Helm) for production.
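
The two VACUUM variants described above might be run as follows. This is a minimal sketch: `my_iceberg_table` is a hypothetical object name, and the exact statement form should be checked against the VACUUM reference for your version.

```sql
-- `my_iceberg_table` is a hypothetical Iceberg table (or sink) name.

-- Expire outdated snapshots and reclaim storage space.
VACUUM my_iceberg_table;

-- Additionally compact data files, then expire snapshots.
VACUUM FULL my_iceberg_table;
```

A plain VACUUM is the lighter operation. VACUUM FULL is the one to reach for when many small data files have built up and read performance matters.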

For more information, see Integrate Apache Iceberg with RisingWave.

TOAST columns and parallel backfilling for PG CDC

This update improves the reliability of PostgreSQL CDC (Change Data Capture) by correctly handling TOAST columns during data updates. PostgreSQL does not resend the values of large, unchanged TOAST columns in its logical replication stream, so a CDC consumer can end up with placeholders in their place. Now, when you update a row in PostgreSQL, RisingWave retains the values of those unchanged columns, ensuring your data remains consistent.

For more information, see Support for PostgreSQL TOAST.

New sink connector and automatic schema changes for sinks

Starting with this release, you can stream data from RisingWave into Amazon Redshift.

Sinks for Snowflake, Redshift, and Elasticsearch now support end-to-end automatic schema changes for newly added columns. When you set auto.schema.change = 'true' while creating a sink, the sink automatically reflects these new columns without manual intervention. Note that sink decoupling is not supported when this feature is enabled.
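
To show where the option goes, here is a hedged sketch using an Elasticsearch sink. Only `auto.schema.change = 'true'` comes from this release. The table, the sink name, and the connection settings (`index`, `url`, `primary_key`) are placeholder assumptions, so consult the connector guide for the parameters your setup actually requires.

```sql
-- Hypothetical upstream table.
CREATE TABLE orders (
    id INT PRIMARY KEY,
    amount DOUBLE PRECISION
);

-- Connection settings are placeholders (assumptions);
-- auto.schema.change is the option introduced in this release.
CREATE SINK orders_es_sink FROM orders
WITH (
    connector = 'elasticsearch',
    index = 'orders',
    url = 'http://localhost:9200',
    primary_key = 'id',
    auto.schema.change = 'true'
);

-- A column added afterwards should be picked up by the sink
-- without dropping and recreating it.
ALTER TABLE orders ADD COLUMN note VARCHAR;
```

Because sink decoupling is not supported with this option, a sink created this way runs without decoupling.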

For more information, see Auto schema change for Elasticsearch Sink, Snowflake Sink, and Redshift Sink.

Enable memory-only mode for ultra-low-latency workloads

RisingWave v2.6 introduces memory-only mode, a premium feature designed to maximize performance for latency-sensitive workloads. In this mode, operator states can be fully loaded and kept in memory, eliminating cache misses as long as the available memory exceeds the size of the required intermediate states. By removing the usual serialization and deserialization overhead, memory-only mode delivers significantly lower query latency.

You can also configure memory-only mode at the operator level, choosing which operators keep their state in memory and which continue using the standard state store. This flexibility allows fine-grained control to balance ultra-low latency against memory consumption, especially when memory resources are limited compared to the total state size.

This feature marks a major step forward for our RisingWave Ultra offering.

For more information, see State table memory preload.

Conclusion

These are some of the highlights of v2.6. For the entire list of updates, including changes to source and sink connectors, please refer to the full release notes.

Stay tuned for next month’s updates as we continue to enhance RisingWave with new features. Visit the RisingWave GitHub repository to explore the latest developments and planned releases.

Sign up for our monthly newsletter if you’d like to keep up to date on all the happenings with RisingWave. Follow us on Twitter and LinkedIn, and join our Slack community to talk to our engineers and hundreds of streaming enthusiasts worldwide.