You Don't Need a Separate Vector Database


Introduction

The standard advice for building a RAG system is to add a vector database. Your documents live in PostgreSQL. Your embeddings go to Pinecone. Your application queries Pinecone for retrieval, then passes the results to the LLM.

This architecture works. It is also two systems, two bills, two operational domains, and one synchronization problem that will quietly degrade your retrieval quality over time.

Most teams accept this complexity as a given. The vector database is a specialized tool, and specialized tools deserve a dedicated deployment. But this assumption deserves closer examination. What does a vector database actually do? And is it doing anything that a streaming database with native vector support cannot do better, with fewer moving parts?

For most RAG and AI agent use cases, the answer is no. This article explains why, shows the SQL to prove it, and acknowledges the cases where a dedicated vector database is genuinely the right choice.

The Standard RAG Stack and Its Hidden Problem

The typical production RAG pipeline looks like this:

  1. Documents live in PostgreSQL (or MySQL, or MongoDB)
  2. A batch job runs on a schedule: extract documents, generate embeddings, upsert to Pinecone
  3. At query time, the application embeds the user's question and queries Pinecone for similar documents
  4. The top results are passed as context to an LLM

The hidden problem is step 2: the batch job. Between runs, the source database and the vector database are out of sync. Documents that were updated, deleted, or added in PostgreSQL are invisible to Pinecone until the next job completes.

For a knowledge base that changes slowly, this lag is tolerable. For anything where freshness matters (product catalogs, support policies, internal wikis, regulatory documents), users are retrieving stale embeddings. The LLM's answer is only as fresh as the last sync.

The sync problem is not solved by running the batch job more frequently. At some point, you are running it every few minutes, at significant API cost, and you still have a staleness window. The root cause is architectural: two separate systems, drifting apart continuously.

What a Vector Database Actually Does

Before deciding whether you need a dedicated vector database, it helps to understand what it actually does. The two core functions are:

1. Embedding storage: A vector database stores float arrays (embeddings) alongside metadata such as document IDs, text, and filters. This is a storage problem with no fundamental magic. The embeddings are just numbers in a table.

2. Approximate nearest-neighbor (ANN) search: Given a query embedding, return the N stored embeddings most similar to it. The standard algorithm is HNSW (Hierarchical Navigable Small World), which builds a graph structure for efficient search. HNSW is why vector search is fast: it avoids comparing the query against every stored embedding.

That is it. Embedding storage and ANN search. These are well-understood capabilities. HNSW indexing is available in PostgreSQL through the pgvector extension, natively in RisingWave, and via extensions for DuckDB, SQLite, and others.
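For a concrete sense of how ordinary these capabilities are, here is the equivalent setup in plain PostgreSQL with pgvector. This is a minimal sketch, not production schema: it assumes pgvector 0.5.0 or later (which added HNSW support) and uses a toy 3-dimension vector so the literals stay readable.

-- Minimal vector store in PostgreSQL via the pgvector extension.
-- Sketch only: 3 dimensions keep the literals readable; real
-- embeddings would use vector(1536) or similar.
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE items (
    id        BIGINT PRIMARY KEY,
    body      TEXT,
    embedding vector(3)
);

-- HNSW index using cosine distance
CREATE INDEX ON items USING hnsw (embedding vector_cosine_ops);

-- Top-5 nearest neighbors; <=> is the cosine distance operator
SELECT id, body
FROM items
ORDER BY embedding <=> '[0.1, 0.2, 0.3]'::vector(3)
LIMIT 5;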

Dedicated vector databases do these two things very well, often with additional features like multi-tenancy, access control, and hybrid search (combining dense vector search with keyword search). But they do not do anything that cannot be done inside a general-purpose database for the vast majority of workloads.

RisingWave: Native Vector Support in a Streaming Database

RisingWave is a PostgreSQL-compatible streaming database with native vector support. That combination eliminates the need for a separate vector store for most RAG architectures.

The relevant capabilities:

  • vector(n) type: Store embeddings as fixed-dimension float arrays natively. No extension required.
  • HNSW index: Build an approximate nearest-neighbor index directly on a vector column.
  • openai_embedding(api_key, model, input): Built-in function that calls the OpenAI embeddings API and returns a REAL[] value. Works inside materialized views.
  • Native CDC from PostgreSQL and MySQL: Read changes from your source database in real time, no Kafka or Debezium required.
  • Materialized views: SQL queries whose results update incrementally as source data changes.

These capabilities compose naturally. You define a materialized view that reads documents from the source database via CDC, calls openai_embedding() on each one, and stores the result. RisingWave maintains that view continuously. When a document changes, its embedding updates automatically.

Comparison: Separate Vector DB vs RisingWave as Vector Store

| Dimension | Separate Vector DB (e.g., Pinecone) | RisingWave as Vector Store |
| --- | --- | --- |
| Embedding freshness | Lag of batch sync interval (minutes to hours) | Seconds: materialized view updates on source change |
| Sync pipeline required | Yes: batch job or CDC pipeline to vector DB | No: RisingWave reads source DB directly |
| Operational complexity | Two systems to monitor and maintain | One system |
| Cost model | Source DB + vector DB (storage + query costs) | RisingWave handles both |
| SQL query capability | Vector search only; no joins, aggregations, or filters beyond metadata | Full SQL: join vectors with other tables, filter, aggregate |
| Freshness on document delete | Requires explicit delete call in application or sync job | Automatic: materialized view removes the row when source deletes it |
| Setup | Deploy vector DB, configure sync, manage API keys in two places | One CREATE MATERIALIZED VIEW with openai_embedding() |

The freshness and query capability rows are where the difference is most significant. A dedicated vector database stores embeddings and returns similar ones. RisingWave can join the similarity results against a live inventory table, filter by a real-time computed attribute, or aggregate over a window, all in a single query.

SQL: Building a Vector Store Inside RisingWave

Here is the complete setup for a knowledge base RAG system where embeddings are stored and searched in RisingWave.

Create the source table with CDC

-- Tested against RisingWave v2.8.0

-- CDC source: read from PostgreSQL knowledge base
CREATE SOURCE pg_source WITH (
    connector = 'postgres-cdc',
    hostname = 'postgres.internal',
    port = '5432',
    username = 'rw_user',
    password = 'your_password',
    database.name = 'knowledge_base',
    slot.name = 'rw_vector_slot',
    publication.name = 'rw_publication'
);

-- Bind the documents table to the CDC source
CREATE TABLE documents (
    doc_id      BIGINT PRIMARY KEY,
    title       VARCHAR,
    body        TEXT,
    category    VARCHAR,
    status      VARCHAR,
    updated_at  TIMESTAMPTZ
) FROM pg_source TABLE 'public.documents';
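The CDC source assumes the PostgreSQL side is already prepared for logical replication. Here is a minimal sketch of those prerequisites; exact privileges vary by environment, and the user and publication names simply match the WITH clause above.

-- Run on the PostgreSQL source before creating the CDC source.
-- Logical replication must be enabled (changing wal_level requires
-- a server restart):
ALTER SYSTEM SET wal_level = 'logical';

-- The publication the CDC source subscribes to:
CREATE PUBLICATION rw_publication FOR TABLE public.documents;

-- The connection user needs replication and read privileges:
CREATE USER rw_user WITH REPLICATION PASSWORD 'your_password';
GRANT SELECT ON public.documents TO rw_user;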

Store embeddings in a materialized view

-- This materialized view stores embeddings alongside the source text.
-- When a document changes in PostgreSQL, RisingWave re-embeds it automatically.
CREATE MATERIALIZED VIEW doc_vectors AS
SELECT
    doc_id,
    title,
    category,
    body,
    status,
    updated_at,
    openai_embedding(
        'sk-your-openai-api-key',
        'text-embedding-3-small',
        title || '. ' || body
    )::vector(1536) AS embedding
FROM documents
WHERE status = 'active';

-- HNSW index on the embedding column using cosine similarity
CREATE INDEX doc_vectors_hnsw
    ON doc_vectors
    USING hnsw (embedding vector_cosine_ops);
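To watch the freshness guarantee work, change a row on the PostgreSQL side and query the view again. This is a sketch; doc_id 42 and the new body text are hypothetical examples.

-- On the PostgreSQL source:
UPDATE public.documents
SET body = 'Refunds are processed within 5 business days.',
    updated_at = now()
WHERE doc_id = 42;

-- Moments later, in RisingWave: the row reflects the new text, and
-- its embedding has been recomputed by the materialized view.
-- No sync job ran.
SELECT doc_id, updated_at
FROM doc_vectors
WHERE doc_id = 42;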

Run a similarity search query

-- Find the 5 documents most similar to a query embedding
-- Replace the vector literal with the embedding from your application
SELECT
    doc_id,
    title,
    category,
    body,
    cosine_distance(
        embedding,
        '[0.012, -0.031, 0.058, ...]'::vector(1536)
    ) AS distance
FROM doc_vectors
ORDER BY embedding <=> '[0.012, -0.031, 0.058, ...]'::vector(1536)
LIMIT 5;
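A variation worth knowing: the query embedding can plausibly be computed inside RisingWave as well, so the application only ships the raw question text. This sketch assumes openai_embedding() can be called in an ad-hoc query, not just inside a materialized view, and the question string is a hypothetical example.

-- Embed the user's question inside the query itself (sketch;
-- assumes openai_embedding() is callable in ad-hoc queries)
SELECT
    doc_id,
    title,
    body
FROM doc_vectors
ORDER BY embedding <=> openai_embedding(
    'sk-your-openai-api-key',
    'text-embedding-3-small',
    'How do I request a refund?'
)::vector(1536)
LIMIT 5;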

A more powerful query: vector search with a live join

This is something a dedicated vector database cannot do in a single query. Here, the similarity search is joined against a real-time view of product availability:

-- Find similar product documents, but only for products currently in stock
-- product_inventory is a separate materialized view updated via CDC
SELECT
    dv.doc_id,
    dv.title,
    dv.category,
    pi.stock_status,
    pi.current_price,
    cosine_distance(dv.embedding, '[0.012, -0.031, 0.058, ...]'::vector(1536)) AS distance
FROM doc_vectors dv
JOIN product_inventory pi ON dv.doc_id = pi.product_id
WHERE pi.stock_status IN ('in_stock', 'low_stock')
ORDER BY dv.embedding <=> '[0.012, -0.031, 0.058, ...]'::vector(1536)
LIMIT 5;

product_inventory is another RisingWave materialized view, updated via CDC from the same source database. The similarity search and the stock filter both operate on continuously maintained data. No application-side join logic needed.
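Aggregation composes the same way. The sketch below groups the 50 nearest documents by category to show where the relevant content clusters, something a vector-only API would push back into application code. It reuses the doc_vectors view and the same placeholder vector literal as the queries above.

-- Aggregate over similarity results: which categories dominate
-- the 50 nearest documents?
SELECT
    category,
    count(*) AS matches,
    min(distance) AS best_distance
FROM (
    SELECT
        category,
        cosine_distance(embedding, '[0.012, -0.031, 0.058, ...]'::vector(1536)) AS distance
    FROM doc_vectors
    ORDER BY embedding <=> '[0.012, -0.031, 0.058, ...]'::vector(1536)
    LIMIT 50
) nearest
GROUP BY category
ORDER BY matches DESC;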

When You Still Need a Dedicated Vector Database

This article argues that most teams do not need a dedicated vector database. That claim needs to be honest about its boundaries.

There are real cases where a dedicated vector database is the better choice:

Extreme scale with billions of vectors: Dedicated vector databases are engineered from the ground up for ANN search. At hundreds of millions to billions of vectors with high query throughput requirements, systems like Milvus offer specialized compression, sharding strategies, and hardware optimization that a general-purpose streaming database does not yet match. If your entire workload is high-throughput ANN search and nothing else, a purpose-built system will outperform.

Very high-dimensional embeddings: Some specialized models produce embeddings with 4096 or more dimensions. Dedicated vector databases often implement compression and quantization techniques (like product quantization) that significantly reduce storage and search cost at high dimensions. RisingWave's vector(n) type is best suited for the 768 to 3072 dimension range common in OpenAI and open-source embedding models.

Existing investment in a vector DB: If your team has already built a production system on Pinecone or Milvus and it is working well, migration cost is real. The operational savings of removing the separate system need to exceed the migration effort.

Hybrid search with BM25: Some dedicated vector databases offer first-class hybrid search combining dense vectors with BM25 keyword scoring. If your retrieval quality depends heavily on exact term matching in addition to semantic similarity, a system optimized for hybrid retrieval may serve you better.

For most teams, none of these conditions apply. The typical RAG system has tens of thousands to a few million documents, uses OpenAI's 1536-dimension text-embedding-3-small, and retrieves context for conversational AI applications. That workload fits comfortably inside RisingWave's vector capabilities.

FAQ

What does a vector database actually do?

A vector database does two things: it stores high-dimensional float arrays (embeddings) and it executes approximate nearest-neighbor (ANN) search over them efficiently. The storage model is straightforward; embeddings are stored alongside metadata like document IDs and text. The search is powered by an index algorithm, most commonly HNSW (Hierarchical Navigable Small World). These capabilities are not unique to dedicated vector databases. They are available as extensions or native types in several general-purpose databases, including RisingWave.

Can a streaming database replace a vector database?

For most RAG and AI agent use cases, yes. RisingWave has a native vector(n) type, HNSW indexing, and a built-in openai_embedding() function. You can store embeddings, build HNSW indexes, and run similarity searches entirely inside RisingWave. The main advantage is that embeddings stay in sync with your source data automatically via CDC and materialized views, eliminating the synchronization gap that dedicated vector databases create.

What is the synchronization problem with a separate vector database?

When you use a separate vector database, you have two systems: the source database where your documents live, and the vector database where embeddings live. These two systems drift apart whenever a document changes. Keeping them in sync requires a background job, a CDC pipeline, or an application-level hook, all of which introduce latency and failure modes. A streaming database like RisingWave stores both the source data and the embeddings in one place, so there is no synchronization gap.

When should you still use a dedicated vector database?

Dedicated vector databases are worth their operational cost in a few specific scenarios: very high-dimensional embeddings (above 4096 dimensions) where specialized compression matters, pure ANN workloads at extreme scale (billions of vectors) where the entire system is optimized for nothing but search throughput, and cases where your existing infrastructure already centers on a vector DB and migration cost exceeds the benefit. For most RAG pipelines with a few million documents or fewer, a streaming database with native vector support is sufficient.

How does RisingWave keep embeddings fresh without a separate sync pipeline?

RisingWave uses CDC (change data capture) to read changes from your PostgreSQL or MySQL source database and maintains a materialized view that re-generates embeddings for changed rows automatically using the built-in openai_embedding() function. When a document is updated, only that document's embedding is recomputed. The HNSW index updates incrementally. No separate sync job or pipeline is needed.

Conclusion

The separate vector database is not a requirement. It is a convention inherited from a time when the only way to get ANN search was to deploy a specialized system.

That assumption no longer holds. Native vector types, HNSW indexes, and embedding functions are available in streaming databases. For most RAG workloads, the right architecture is one system, not two.

Key takeaways:

  • A vector database does two things (embedding storage and ANN search), neither of which requires a dedicated system for typical workloads
  • The sync gap is the real problem: two separate systems drift apart continuously, degrading retrieval quality silently
  • RisingWave's vector(n) type, HNSW index, and openai_embedding() function provide the full vector store capability inside a streaming database
  • Materialized views with CDC eliminate the sync pipeline entirely; embeddings update automatically when source documents change
  • Full SQL means richer queries: join similarity results against live inventory, filter by real-time attributes, aggregate over windows, all in one query
  • Dedicated vector databases still win at extreme scale (billions of vectors), very high dimensions, or when existing investment justifies keeping them

If you are starting a new RAG project, try building it inside RisingWave before adding a separate vector database. The architecture is simpler, the embeddings are fresher, and you get full SQL query capability over your entire data model.


Ready to try this yourself? Try RisingWave Cloud free, no credit card required. Sign up here.

Join our Slack community to ask questions and connect with other developers building streaming AI applications.
