When Stream Processing Meets Vector Search – The Real-World Use Cases

The rise of AI-native products is reshaping what people expect from databases. It’s no longer just about processing structured data - users now want databases that can handle both structured data and unstructured data (at least their embeddings), while keeping everything fresh in real time if needed. As a result, vector search is gradually becoming a native capability of streaming databases, and the use cases for vector search and stream processing are increasingly overlapping.

Why is this happening? Is this just another marketing stunt by database vendors trying to ride the AI hype? Not really. Let me walk you through a few real-world use cases to show why these scenarios are actually very interesting - and why they make sense.

Real-Time RAG: Where Streaming Meets Embeddings

The clearest overlap between stream processing and vector search is real-time RAG (Retrieval-Augmented Generation), where the retrieval index has to reflect data that arrived only seconds ago. This is not a marketing gimmick - there are real, high-value use cases.

Take Kaito AI, which you can think of as Bloomberg for crypto. In the world of crypto, news moves markets. Tweets, on-chain transactions, and breaking news can affect prices in seconds. Kaito aggregates real-time data from various sources - Twitter feeds, news articles, and market data streams - and performs sentiment analysis to provide actionable insights.

Previously, this required a complex pipeline:

  1. Ingest and transform the incoming streams in RisingWave.

  2. Push the processed data into Elasticsearch to build vector indexes and enable semantic search.

This approach worked but came with significant overhead - two different systems to manage, more data pipelines to maintain, and extra latency introduced by the data transfer.

Figure: RisingWave offers built-in vector search.

With vector indexing now integrated into RisingWave, Kaito AI can process, index, and query all in one place. It’s not just simpler to operate; developers get a unified SQL interface for both real-time analytics and vector queries, and the infrastructure cost drops because there’s no longer a need for a separate search engine.
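To make the "process, index, and query in one place" idea concrete, here is a minimal in-memory sketch in Python. It is not RisingWave's API and not Kaito AI's implementation; the names `Doc`, `StreamingVectorStore`, `ingest`, and `search` are hypothetical, the embeddings are hand-written toy vectors, and the search is a brute-force cosine scan rather than a real vector index. The only point it illustrates is that a document becomes searchable the moment it is ingested, with no second system to sync.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Doc:
    """One incoming item (tweet, article, ...) with a precomputed embedding."""
    doc_id: str
    text: str
    embedding: List[float]


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class StreamingVectorStore:
    """Hypothetical toy store: ingestion and similarity search share one state."""

    def __init__(self) -> None:
        self._docs: Dict[str, Doc] = {}

    def ingest(self, doc: Doc) -> None:
        # Upsert: a newer version of the same doc_id replaces the old one.
        self._docs[doc.doc_id] = doc

    def search(self, query: List[float], k: int = 3) -> List[Tuple[str, float]]:
        # Brute-force scan; a real system would use an approximate index.
        scored = [
            (d.doc_id, cosine_similarity(query, d.embedding))
            for d in self._docs.values()
        ]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:k]


if __name__ == "__main__":
    store = StreamingVectorStore()
    store.ingest(Doc("tweet-1", "Protocol upgrade ships", [0.9, 0.1, 0.0]))
    store.ingest(Doc("news-1", "Exchange reports outage", [0.0, 0.2, 0.9]))
    print(store.search([1.0, 0.0, 0.0], k=1))

    # A document that arrives later is visible to the very next query.
    store.ingest(Doc("tweet-2", "Upgrade confirmed on-chain", [1.0, 0.0, 0.0]))
    print(store.search([1.0, 0.0, 0.0], k=1))
```

In the two-system pipeline described earlier, the second `ingest` would additionally have to be shipped to and indexed by the search engine before a query could see it.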

Not Everything Needs Real-Time RAG

Real-time RAG isn’t for everyone. Many customers don’t actually need millisecond-level retrieval of embeddings. What they really need is a unified database where vector indexes sit side by side with tables and ordinary indexes.

Take one of our long-term partners, a global retail chain that owns some of the world’s most iconic brands. Their primary challenge is real-time inventory management across multiple stores and thousands of SKUs. This is why they chose RisingWave early on - to process streams from Kafka and keep inventory tables up to date with latency on the order of seconds.

But with the AI wave, their requirements have evolved. They want to build a “smart store assistant.” Instead of manually checking stock or expiration dates, employees simply use a mobile app to scan shelves or take photos. The system automatically recognizes products, calculates stock levels, and highlights items close to expiry. Operations managers can instantly see which stores need restocking and where inventory is piling up.

Figure: Combining real-time structured data with vector search in a single database.

This used to be a slow, manual task. Now, by combining real-time structured data (stock levels, expiration dates) with vector search (product descriptions, image embeddings) in a single database, they can achieve an integrated and automated workflow.

Could they achieve this with a standalone vector database? Sure. But that would mean managing two separate backends, syncing updates, and writing queries that span both systems. With RisingWave’s native vector indexing, everything happens in one place - streaming, structured analytics, and semantic search - dramatically simplifying their architecture.
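As a rough illustration of what "everything in one place" buys the smart store assistant, the Python sketch below matches a shelf photo to a catalog product by embedding similarity and then reads stock and expiry for that product from the same in-process data. Everything here is an assumption made for illustration: the `Product` and `InventoryRow` shapes, the `shelf_report` function, the thresholds, and the two-dimensional toy embeddings. It is not the retailer's system and not RisingWave's query interface.

```python
import math
from dataclasses import dataclass
from datetime import date
from typing import Dict, List


@dataclass
class Product:
    sku: str
    name: str
    embedding: List[float]  # hypothetical image/description embedding


@dataclass
class InventoryRow:
    sku: str
    store_id: str
    stock: int
    expires_on: date


def cosine_similarity(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def identify_product(photo_embedding: List[float], catalog: List[Product]) -> Product:
    """Semantic step: pick the catalog product closest to the photo embedding."""
    return max(catalog, key=lambda p: cosine_similarity(photo_embedding, p.embedding))


def shelf_report(
    photo_embedding: List[float],
    catalog: List[Product],
    inventory: List[InventoryRow],
    store_id: str,
    today: date,
    low_stock: int = 5,      # assumed restock threshold
    expiry_days: int = 3,    # assumed "close to expiry" window
) -> Dict[str, object]:
    """Structured step: join the matched product with its inventory rows."""
    product = identify_product(photo_embedding, catalog)
    rows = [r for r in inventory if r.sku == product.sku and r.store_id == store_id]
    stock = sum(r.stock for r in rows)
    expiring = sum(r.stock for r in rows if (r.expires_on - today).days <= expiry_days)
    return {
        "sku": product.sku,
        "name": product.name,
        "stock": stock,
        "needs_restock": stock < low_stock,
        "expiring_soon": expiring,
    }
```

With a standalone vector database, `identify_product` and the inventory lookup would run against two different backends, and the application would be responsible for keeping them in sync.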

Industry Trend & Future of Databases

The database industry is shifting fast. Vector search is no longer a feature that belongs only to standalone vector databases. It’s becoming as fundamental as B-tree indexes or JSON support in modern data platforms. PostgreSQL has gained vector indexing through the pgvector extension. MongoDB has added vector search to Atlas. And now RisingWave is pushing this further by combining streaming capabilities with native vector search.

Why does this matter? Because modern applications increasingly need to blend structured and unstructured data. It’s no longer just about querying tables with stock levels or transaction logs - developers want to join these structured fields with embeddings and semantic matches in a single query.

For vertical scenarios like financial analytics (real-time news and sentiment) or retail operations (real-time inventory plus product embeddings), a unified database drastically reduces complexity and cost. Instead of maintaining multiple pipelines and services, you get one system with one query interface.

Looking ahead, this convergence is not just a trend - it’s the new baseline. In the future, all databases will evolve into AI-native databases, not because they chase AI hype, but because vector indexes and streaming have become essential parts of the modern data stack.
