GUIDE

What Is a Streaming Database?

Q: Is a streaming database the same as a stream processor?

No. A stream processor like Apache Flink processes data in motion but does not itself store results for ad hoc querying; results are typically written to an external system. A streaming database like RisingWave both processes streams and stores materialized results that you can query with standard SQL at any time.

Q: Can a streaming database replace my traditional database?

A streaming database complements rather than replaces traditional databases. It handles the real-time layer — continuous ingestion, transformation, and serving — while your OLTP database continues to handle transactional workloads.

Q: What data sources can a streaming database connect to?

RisingWave connects to Kafka, Pulsar, Kinesis, and other message brokers for stream ingestion. It also supports CDC from PostgreSQL, MySQL, and MongoDB, plus direct S3 and file-based sources.

Q: How does a streaming database handle late or out-of-order data?

Streaming databases use watermarks and event-time processing to handle late arrivals. RisingWave automatically manages watermarks and recomputes affected materialized views when late data arrives, ensuring correctness without manual intervention.
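To make the watermark idea concrete, here is a minimal Python sketch of event-time tumbling windows with bounded lateness. It is a generic illustration, not RisingWave's implementation: the class `TumblingWindowCounter`, its parameters, and the rule "watermark = highest event time seen minus an allowed lateness" are all assumptions chosen for the example. Events that arrive out of order but ahead of the watermark still update their window; events behind the watermark are counted as dropped.

```python
from collections import defaultdict


class TumblingWindowCounter:
    """Counts events per event-time window, accepting bounded lateness."""

    def __init__(self, window_size, allowed_lateness):
        self.window_size = window_size
        self.allowed_lateness = allowed_lateness
        self.max_event_time = None      # highest event time seen so far
        self.counts = defaultdict(int)  # window start -> event count
        self.dropped = 0                # events that arrived behind the watermark

    @property
    def watermark(self):
        # Heuristic: no event older than this is expected any more.
        if self.max_event_time is None:
            return None
        return self.max_event_time - self.allowed_lateness

    def process(self, event_time):
        window_start = event_time - event_time % self.window_size
        window_end = window_start + self.window_size
        wm = self.watermark
        if wm is not None and window_end <= wm:
            # The window is already closed, so the event is too late.
            self.dropped += 1
            return False
        # In-order and tolerably late events both update the window result.
        self.counts[window_start] += 1
        if self.max_event_time is None or event_time > self.max_event_time:
            self.max_event_time = event_time
        return True

    def closed_windows(self):
        """Windows whose end is at or before the watermark (results are final)."""
        wm = self.watermark
        if wm is None:
            return {}
        return {start: n for start, n in self.counts.items()
                if start + self.window_size <= wm}


counter = TumblingWindowCounter(window_size=10, allowed_lateness=5)
for t in [1, 12, 8, 27, 3]:   # 8 is out of order but in time; 3 is too late
    counter.process(t)

final_counts = dict(counter.counts)
closed = counter.closed_windows()
```

In this run the event at time 8 arrives after the event at time 12 yet is still added to the first window, while the event at time 3 arrives after the watermark has passed that window and is dropped.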

A streaming database continuously ingests, processes, and serves data in real time using SQL. Learn how streaming databases differ from traditional databases, message queues, and stream processors, and when to use one.

SQL

Native Interface

Use PostgreSQL-compatible SQL to define streaming pipelines and query results — no custom code required

Sub-Second

Data Freshness

Materialized views update continuously, with sub-second latency, as new data arrives from sources

Exactly-Once

Processing Guarantee

Built-in checkpointing ensures each event is processed exactly once, even during failures

Unified

Ingest + Process + Serve

One system replaces the stream processor, state store, cache, and serving layer

Core Concept

How does a streaming database differ from a traditional database?

A traditional database stores data at rest and executes queries on demand. A streaming database inverts this model: it ingests data continuously, maintains always-up-to-date materialized views, and serves fresh results the moment they are requested. You define the query once, and the database keeps its results current as new data arrives.

Dimension	Traditional Database	Streaming Database
Data Model	Data at rest, queried on demand	Data in motion, processed continuously
Query Execution	Run query, get snapshot result	Define query once, results update automatically
Freshness	Result is a snapshot, stale until the query is re-run	Always up to date (sub-second)
Ingestion	Batch INSERT / bulk load	Continuous stream ingestion (Kafka, CDC)
Result Storage	Computed on each query	Pre-computed in materialized views
Latency	Query-time compute (seconds to minutes for heavy queries)	Pre-computed reads (milliseconds)

Traditional databases excel at transactional workloads where data is written and read in discrete operations. Streaming databases excel when you need continuous transformations — aggregations, joins, filters — applied to data the instant it arrives, with results always ready to serve.
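The difference between the two models can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not how any particular database is built: `query_on_demand` and `MaterializedSum` are hypothetical names, and the "view" here is just a per-key running total that is adjusted on every insert or delete instead of being recomputed from all rows.

```python
from collections import defaultdict


def query_on_demand(rows, key):
    """Traditional model: scan every stored row each time the query runs."""
    return sum(amount for k, amount in rows if k == key)


class MaterializedSum:
    """Streaming model: keep SUM(amount) per key current as changes arrive."""

    def __init__(self):
        self.totals = defaultdict(int)
        self.counts = defaultdict(int)

    def on_insert(self, key, amount):
        self.totals[key] += amount
        self.counts[key] += 1

    def on_delete(self, key, amount):
        self.totals[key] -= amount
        self.counts[key] -= 1
        if self.counts[key] == 0:
            # No rows left for this key, so the group disappears from the view.
            del self.totals[key]
            del self.counts[key]

    def read(self, key):
        # Reads are a lookup; no scan over the input is needed.
        return self.totals.get(key)


rows = []
view = MaterializedSum()
for key, amount in [("alice", 30), ("bob", 5), ("alice", 12)]:
    rows.append((key, amount))   # the table at rest
    view.on_insert(key, amount)  # the incrementally maintained view

alice_scan = query_on_demand(rows, "alice")
alice_view = view.read("alice")

view.on_delete("bob", 5)
bob_view = view.read("bob")
```

Both approaches return the same answer for `alice`; the difference is that the scan does work proportional to the table size at read time, while the maintained view does a small amount of work per change and answers reads with a lookup.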

Capabilities

What problems does a streaming database solve that message queues cannot?

Message queues like Kafka transport data reliably between systems but, on their own, do not transform, join, or aggregate that data. A streaming database adds a full SQL computation layer on top of streams, maintaining stateful results that downstream applications can query directly, without building custom consumers.

→Kafka stores and transports events but cannot compute aggregations or joins across topics
→Custom Kafka consumers require hand-written state management, serialization logic, and failure recovery
→A streaming database replaces the consumer + cache + API layer with a single SQL query that produces a materialized view
→Downstream applications read from the materialized view using any PostgreSQL-compatible driver
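To give a sense of the state management a hand-written consumer has to carry, here is a minimal Python sketch of a join across two event streams. The class `StreamingJoin` and the sample order and payment events are hypothetical, and the sketch ignores serialization, state expiry, and failure recovery, which a real consumer would also need. Each side buffers its rows by key so that a match can be emitted no matter which side arrives first.

```python
from collections import defaultdict


class StreamingJoin:
    """Inner join of two streams on a key, buffering both sides in memory."""

    def __init__(self):
        self.left = defaultdict(list)    # key -> rows seen on the left stream
        self.right = defaultdict(list)   # key -> rows seen on the right stream
        self.results = []                # joined rows, in emission order

    def on_left(self, key, row):
        self.left[key].append(row)
        # Match the new left row against everything buffered on the right.
        for other in self.right.get(key, []):
            self.results.append((key, row, other))

    def on_right(self, key, row):
        self.right[key].append(row)
        # Match the new right row against everything buffered on the left.
        for other in self.left.get(key, []):
            self.results.append((key, other, row))


join = StreamingJoin()
join.on_left("o1", "book")   # order o1 arrives before its payment
join.on_right("o2", 20)      # payment for o2 arrives before its order
join.on_right("o1", 15)
join.on_left("o2", "pen")

joined = join.results
```

In a streaming database the same logic would be expressed as a SQL JOIN inside a materialized view, with the buffering handled by the system.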

SQL Over Streams

Write JOINs, GROUP BY, window functions, and subqueries against live data streams — no custom consumer code required.

Stateful Computation

The database manages state internally. Aggregations, counts, and running totals are maintained automatically and survive failures.

Queryable Results

Materialized views store pre-computed results. Any PostgreSQL client can read them with millisecond-level latency.

Exactly-Once Semantics

Built-in checkpointing guarantees each event is processed exactly once, even during node failures or restarts.
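One common way to get exactly-once effects on state is to snapshot the state together with the input position it reflects, and after a failure to restore the snapshot and replay from that position. The Python sketch below illustrates that idea in a single process. It is a simplified assumption-laden model, not RisingWave's checkpointing mechanism: `CheckpointedCounter`, the in-memory `log`, and the simulated crash are all hypothetical.

```python
class CheckpointedCounter:
    """Counts keys from a replayable log, with snapshot-based recovery."""

    def __init__(self):
        self.counts = {}
        self.next_offset = 0          # position of the next unread log entry
        self._checkpoint = ({}, 0)    # (state snapshot, offset it corresponds to)

    def process(self, log, upto):
        while self.next_offset < upto:
            key = log[self.next_offset]
            self.counts[key] = self.counts.get(key, 0) + 1
            self.next_offset += 1

    def checkpoint(self):
        # State and offset are saved together so they can never disagree.
        self._checkpoint = (dict(self.counts), self.next_offset)

    def recover(self):
        # Discard everything since the last checkpoint, then resume from it.
        counts, offset = self._checkpoint
        self.counts = dict(counts)
        self.next_offset = offset


log = ["a", "b", "a", "c", "a"]

# Run with a failure after some uncheckpointed progress.
counter = CheckpointedCounter()
counter.process(log, 2)
counter.checkpoint()
counter.process(log, 4)      # progress that is lost in the crash
counter.recover()            # simulated restart
counter.process(log, len(log))

# Run without any failure, for comparison.
baseline = CheckpointedCounter()
baseline.process(log, len(log))
```

Entries 2 and 3 are read twice, but because the state is rolled back to the checkpoint before they are replayed, each entry contributes to the final counts exactly once.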

RisingWave

How does RisingWave implement the streaming database paradigm?

RisingWave is a cloud-native streaming database that uses PostgreSQL-compatible SQL for both defining streaming pipelines and querying results. It ingests from Kafka, Pulsar, Kinesis, and CDC sources, processes data through materialized views with exactly-once guarantees, and serves results via any PostgreSQL client.

→Define sources with CREATE SOURCE to connect to Kafka, Pulsar, Kinesis, or databases via CDC
→Create materialized views that continuously compute over those sources using standard SQL
→Query materialized views with SELECT for instant, always-fresh results
→Sink results to downstream systems like PostgreSQL, Kafka, Elasticsearch, or ClickHouse
