Data engineers comparing tools for real-time analytics often land on the same shortlist: RisingWave and ClickHouse. Both are fast. Both use SQL. Both appear in architectural diagrams labeled "real-time." But they are fundamentally different kinds of systems, and conflating them leads to architectures that are either over-engineered or wrong for the workload.
This article breaks down what each system actually does, where each excels, and when you need both working together.
What Each System Is Built For
The single most important thing to understand about this comparison is that RisingWave and ClickHouse solve different problems.
ClickHouse is a columnar OLAP (online analytical processing) database. It is engineered for extremely fast analytical queries over large volumes of historical data. You load data into ClickHouse, and then you query it at very high speed. Its columnar storage format, vectorized query execution, and aggressive compression make it one of the fastest systems available for aggregations and scans over billions of rows. ClickHouse is fundamentally a read-optimized store: you write data in, you query it.
RisingWave is a streaming database. It continuously ingests data from streams (Kafka, CDC, Kinesis, and more), processes that data incrementally using materialized views, and keeps the results always up to date. When a new event arrives in Kafka, RisingWave propagates the change through its dataflow graph within milliseconds. You query a materialized view and get the current answer without waiting for a batch job to complete.
The fundamental distinction: ClickHouse answers questions about data you have already collected. RisingWave maintains continuously updated answers as data arrives.
Architecture Overview
ClickHouse Architecture
ClickHouse stores data in a columnar format optimized for batch reads. Its MergeTree table engine family stores data in sorted parts on disk, merging them in the background to maintain efficient storage layout. This design enables exceptional scan throughput: ClickHouse can read and aggregate billions of rows in seconds by using all available CPU cores and SIMD instructions.
Data reaches ClickHouse through batch inserts. You typically write to ClickHouse from Kafka using the ClickHouse Kafka engine, or via periodic batch loads from a data lake. Either way, there is a delay between an event occurring and that event appearing in query results. For many analytical workloads, a delay of seconds to minutes is acceptable.
ClickHouse is not designed to maintain continuously updated derived datasets. If you want to precompute an aggregation, ClickHouse offers materialized views, but they update on insert, not incrementally across arbitrary updates. They work well for append-only streams but not for workloads involving updates or deletes.
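To make the ingestion pattern concrete, a common setup pairs the Kafka table engine with an insert-triggered materialized view that copies consumed rows into a MergeTree table. A minimal sketch (table, topic, and broker names here are illustrative, not from any specific deployment):

```sql
-- Hypothetical topic and broker names; adjust to your environment.
CREATE TABLE orders_queue (
    order_id   UInt64,
    amount     Float64,
    event_time DateTime
) ENGINE = Kafka
SETTINGS kafka_broker_list = 'kafka:9092',
         kafka_topic_list  = 'orders',
         kafka_group_name  = 'clickhouse_orders',
         kafka_format      = 'JSONEachRow';

-- Destination table that queries actually run against.
CREATE TABLE orders (
    order_id   UInt64,
    amount     Float64,
    event_time DateTime
) ENGINE = MergeTree
ORDER BY event_time;

-- Insert-triggered view: fires once per consumed block, append-only.
CREATE MATERIALIZED VIEW orders_consumer TO orders AS
SELECT order_id, amount, event_time FROM orders_queue;
```

The materialized view here acts as a pipe, not an incrementally maintained result: it runs its query over each new block of inserted rows, which is why the pattern suits append-only streams but not updates or deletes.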
RisingWave Architecture
RisingWave decouples compute, storage, and metadata into independent layers:
- Compute nodes run streaming operators and serve queries.
- Compactor nodes handle background storage compaction.
- Object storage (S3, GCS, Azure Blob) persists all state durably through Hummock, a purpose-built LSM-tree storage engine.
- Meta service coordinates checkpointing and scheduling.
When you create a materialized view in RisingWave, the system builds a persistent dataflow graph. Each new event triggers incremental computation: only the affected rows and aggregations are recomputed, not the entire dataset. This is why RisingWave can maintain accurate real-time results over high-velocity streams without expensive full scans.
Because each architectural layer scales independently, you can add compute without provisioning more storage, or vice versa. State lives in object storage, not on local disks, which means recovery is fast and there is no disk capacity planning.
RisingWave uses the PostgreSQL wire protocol, so you connect with any PostgreSQL client (psql, JDBC, Python psycopg2, any standard BI tool).
SQL Experience: Familiar Interface, Different Semantics
Both systems use SQL, but the semantics reflect their different designs.
ClickHouse SQL
ClickHouse supports a rich SQL dialect with analytical extensions: WINDOW functions, SAMPLE clauses for approximate queries, specialized aggregate combinators (-If, -Array, -State), and powerful grouping sets. For a data analyst running ad-hoc queries over historical data, ClickHouse SQL feels natural and expressive.
A typical ClickHouse analytical query:
```sql
-- ClickHouse: scan last 30 days of order history, group by category
SELECT
    category,
    uniq(user_id) AS unique_buyers,
    sum(amount) AS total_revenue,
    avg(amount) AS avg_order_value
FROM orders
WHERE event_time >= now() - INTERVAL 30 DAY
GROUP BY category
ORDER BY total_revenue DESC;
```
This query scans existing rows. It runs fast because ClickHouse is columnar and can skip irrelevant data using sparse indexes. But the result reflects data up to the last insert, not the current moment.
RisingWave SQL
RisingWave implements PostgreSQL-compatible SQL. You define streaming pipelines as materialized views, and the system keeps them current as data arrives. There is no need to write a batch job or schedule a refresh.
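The examples below assume a `ch_orders` stream backed by Kafka. A source for it might be declared roughly as follows (the schema, topic, and broker address are illustrative; see RisingWave's `CREATE SOURCE` documentation for the full option list):

```sql
-- Hypothetical schema and Kafka topic for the ch_orders stream.
CREATE SOURCE ch_orders (
    order_id   BIGINT,
    user_id    BIGINT,
    product_id BIGINT,
    amount     DECIMAL,
    status     VARCHAR,
    event_time TIMESTAMP
) WITH (
    connector = 'kafka',
    topic = 'orders',
    properties.bootstrap.server = 'kafka:9092'
) FORMAT PLAIN ENCODE JSON;
```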
The same concept in RisingWave uses a materialized view that updates continuously:
```sql
-- RisingWave: create a materialized view that is always current
CREATE MATERIALIZED VIEW ch_order_stats_by_status AS
SELECT
    status,
    COUNT(*) AS order_count,
    SUM(amount) AS total_revenue,
    AVG(amount) AS avg_order_value
FROM ch_orders
GROUP BY status;
```
When a new order row arrives, RisingWave updates only the affected aggregate in milliseconds. Querying this view:
```sql
SELECT * FROM ch_order_stats_by_status ORDER BY total_revenue DESC;
```
returns the current state instantly, because the work was already done incrementally. There is no scan at query time.
RisingWave also supports enriching streams through joins:
```sql
CREATE MATERIALIZED VIEW ch_enriched_orders AS
SELECT
    o.order_id,
    o.user_id,
    o.amount,
    o.status,
    o.event_time,
    p.product_name,
    p.category
FROM ch_orders o
JOIN ch_products p ON o.product_id = p.product_id;
```
And cascading materialized views, where one view builds on another:
```sql
CREATE MATERIALIZED VIEW ch_category_revenue AS
SELECT
    category,
    COUNT(*) AS order_count,
    SUM(amount) AS total_revenue
FROM ch_enriched_orders
GROUP BY category;
```
Both of these update continuously. No scheduled jobs. No delay.
Windowed Aggregations
For time-windowed analytics, RisingWave provides time window table functions: TUMBLE, HOP, and SESSION:
```sql
CREATE MATERIALIZED VIEW ch_revenue_last_5min AS
SELECT
    window_start,
    window_end,
    COUNT(*) AS order_count,
    SUM(amount) AS total_revenue
FROM TUMBLE(ch_orders, event_time, INTERVAL '5 MINUTES')
GROUP BY window_start, window_end;
```
This view maintains the order count and revenue for each 5-minute tumbling window. The results update as events arrive, not after a batch completes. For more on how RisingWave handles time windows, see the windowing documentation.
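For overlapping windows, HOP takes a slide interval followed by a window size. A sketch of the same aggregation over hopping windows (the one-minute slide is an illustrative choice, not from the original example):

```sql
-- Hypothetical: 5-minute windows advancing every minute,
-- so each event contributes to five overlapping windows.
CREATE MATERIALIZED VIEW ch_revenue_hopping AS
SELECT
    window_start,
    window_end,
    SUM(amount) AS total_revenue
FROM HOP(ch_orders, event_time, INTERVAL '1 MINUTE', INTERVAL '5 MINUTES')
GROUP BY window_start, window_end;
```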
ClickHouse can compute the same result on demand with a query, but the query scans raw data each time. For dashboards that refresh every few seconds, this can become expensive at scale.
Feature Comparison
| Dimension | RisingWave | ClickHouse |
|---|---|---|
| Primary use case | Continuous stream processing, incremental materialized views | Fast OLAP queries on historical data |
| Data model | Streaming sources + incrementally maintained views | Columnar tables optimized for batch reads |
| Materialized views | Incrementally maintained, always current | Insert-triggered, append-only friendly |
| Latency | Sub-100ms end-to-end stream processing | Milliseconds to seconds for analytical queries |
| Stream ingestion | Native (Kafka, CDC, Kinesis, Pulsar, S3, 50+ connectors) | Via Kafka table engine or batch insert |
| Handles updates/deletes | Yes (via CDC, upsert sources) | Limited (ReplacingMergeTree, CollapsingMergeTree) |
| Query on historical data | Supported but not the primary strength | Exceptional; built for this |
| SQL dialect | PostgreSQL-compatible | ClickHouse dialect (SQL with analytical extensions) |
| State management | Built-in, tiered (memory + object storage) | Columnar parts on local disk |
| Scaling model | Compute and storage scale independently | Horizontal sharding, replicas |
| BI tool compatibility | PostgreSQL wire protocol | Native ClickHouse driver, HTTP API |
| License | Apache 2.0 | Apache 2.0 |
| Managed cloud | RisingWave Cloud | ClickHouse Cloud |
Latency Profiles
Understanding latency in both systems requires distinguishing between two different questions.
End-to-end streaming latency is the time from an event entering the system (e.g., arriving in Kafka) to that event being reflected in a query result. RisingWave targets sub-100ms latency here in its default configuration. ClickHouse latency depends on the insert pipeline: if you are using the Kafka engine, rows are visible after the next insert batch, which can be configured from seconds to minutes.
Query execution latency is the time to run a SQL query against stored data. ClickHouse excels here for large analytical scans. A query scanning billions of rows can complete in seconds. RisingWave's materialized views pre-compute the answer, so point queries against materialized views are fast (10-20ms p99), but RisingWave is not a columnar scan engine for ad-hoc historical analysis.
The practical implication:
- For a dashboard showing "revenue in the last 5 minutes," RisingWave delivers lower end-to-end latency because the answer is maintained continuously.
- For a query scanning 3 years of historical orders to find seasonal trends, ClickHouse is faster because it is purpose-built for that type of columnar scan.
When to Use Each System
Use RisingWave When
You need sub-second freshness. Fraud detection, live leaderboards, real-time alerting, and operational dashboards require data to be current within milliseconds or seconds. RisingWave's incremental computation model is designed for this.
Your source data involves updates or deletes. ClickHouse's core table engines are optimized for append-only data. Handling updates (e.g., order status changes, user profile updates) requires workarounds like ReplacingMergeTree or CollapsingMergeTree with careful query logic. RisingWave handles CDC (Change Data Capture) natively, reflecting inserts, updates, and deletes accurately in materialized views. See RisingWave's CDC documentation for details.
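As a sketch of the CDC pattern, a RisingWave table can mirror a PostgreSQL table and absorb its inserts, updates, and deletes (the connection details below are placeholders; consult the CDC documentation for the required parameters):

```sql
-- Hypothetical Postgres connection details; replace with your own.
CREATE TABLE pg_orders (
    order_id INT PRIMARY KEY,
    status   VARCHAR,
    amount   DECIMAL
) WITH (
    connector = 'postgres-cdc',
    hostname = 'postgres.internal',
    port = '5432',
    username = 'replication_user',
    password = 'secret',        -- placeholder
    database.name = 'shop',
    schema.name = 'public',
    table.name = 'orders'
);
```

Materialized views defined over `pg_orders` then stay correct when an order's status changes upstream, with no ReplacingMergeTree-style query gymnastics.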
You want to join streams with tables. Enriching a high-velocity event stream with product catalog data or user profile lookups is a common pattern. RisingWave handles stream-table joins and stream-stream joins incrementally, keeping enriched results current as both the stream and the dimension table change.
You want to build multi-stage pipelines in SQL. Cascading materialized views let you define complex transformation chains (raw events -> sessionized events -> per-user metrics -> company-level rollups) entirely in SQL, without orchestration tools or external batch jobs.
Your team lives in PostgreSQL. RisingWave's PostgreSQL wire compatibility means existing tooling (psql, JDBC connectors, Metabase, Grafana, dbt) works without modification.
Use ClickHouse When
You need fast ad-hoc queries on large historical datasets. ClickHouse's columnar storage, vectorized execution, and aggressive compression make it exceptionally fast for scanning and aggregating billions of rows. For analytical workloads where the questions change frequently and data latency of minutes or more is acceptable, ClickHouse is hard to beat.
Your workload is primarily append-only. Log analytics, event analytics, and time-series data that never changes are ideal for ClickHouse. Its MergeTree family is optimized for this pattern.
You need complex analytical functions. ClickHouse's aggregate function library is extensive: approximate distinct counts (uniq, uniqHLL12), quantile estimation (quantile, quantileTDigest), histograms, and statistical functions. For exploratory data analysis over historical data, this library is a strength.
You have high-cardinality historical reports. Queries like "compare revenue by product, broken down by region, month, and acquisition channel for the past two years" are ClickHouse's sweet spot. This type of query benefits from full columnar scans with vectorized aggregation.
When to Use Both Together
Many production architectures use RisingWave and ClickHouse as complementary layers:
- RisingWave processes the real-time stream, maintaining current aggregates, detecting anomalies, and producing enriched events.
- RisingWave sinks processed data to ClickHouse, using RisingWave's ClickHouse sink connector.
- ClickHouse serves historical queries over the accumulated dataset, with all the performance benefits of columnar storage.
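A sink from a RisingWave materialized view into ClickHouse might look roughly like this (the URL, credentials, and table names are placeholders; see the ClickHouse sink connector documentation for the full option list):

```sql
-- Hypothetical connection options; 'type' must match the view's
-- change semantics (append-only vs. upsert).
CREATE SINK clickhouse_orders_sink
FROM ch_enriched_orders
WITH (
    connector = 'clickhouse',
    type = 'append-only',
    force_append_only = 'true',
    clickhouse.url = 'http://clickhouse:8123',
    clickhouse.user = 'default',
    clickhouse.password = '',
    clickhouse.database = 'default',
    clickhouse.table = 'enriched_orders'
);
```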
This pattern separates concerns cleanly: RisingWave handles the "what is happening now" question, while ClickHouse handles "what has happened over the past year." The BI layer queries both, routing requests to whichever system is appropriate for the time range and freshness requirement.
A concrete example: an e-commerce platform might use RisingWave to detect and alert on unusual order volume in real time (looking at the last 5 minutes), and ClickHouse to render a year-over-year sales comparison report (scanning two years of history). The two systems are not in competition; they are solving different subproblems in the same analytics stack.
Operational Considerations
Deployment and Management
ClickHouse is a mature, well-understood database with a large operator community. Deploying and managing a ClickHouse cluster requires standard database administration skills: provisioning storage, managing replication, tuning insert settings and merge behavior. ClickHouse Cloud reduces this burden significantly with a managed offering.
RisingWave follows a cloud-native model. Its compute-storage separation means storage scaling happens automatically via S3; you add or remove compute nodes to adjust processing capacity. The system deploys as a single binary for local development or as a Kubernetes Helm chart for production. RisingWave Cloud provides a fully managed option.
Monitoring and Observability
Both systems expose Prometheus-compatible metrics. ClickHouse provides detailed metrics on queries, merges, inserts, and replica health. RisingWave provides metrics on streaming pipeline throughput, checkpoint latency, backpressure, and materialized view lag.
For RisingWave, the most important operational metrics are:
- Checkpoint interval and duration (indicates pipeline health)
- Per-materialized-view processing lag (freshness indicator)
- Backpressure on source connectors (indicates whether the pipeline keeps up)
Cost
ClickHouse's costs depend primarily on storage (columnar, heavily compressed) and compute (CPU-heavy for large scans). For workloads that are mostly read-heavy with infrequent inserts, ClickHouse is storage-efficient.
RisingWave's costs center on compute (for stream processing) and object storage (for state). Because state lives in S3, RisingWave's storage costs scale with the amount of maintained state, not with historical data volume. For workloads that need to maintain only recent aggregates (e.g., last 7 days), RisingWave's state footprint can be quite small.
When running both together, the total cost includes both systems. The benefit is that each system is used only for what it does best, avoiding the overprovisioning that comes from trying to force one system to serve both streaming and historical workloads.
FAQ
Is RisingWave a replacement for ClickHouse?
No. They serve genuinely different purposes. RisingWave is a streaming database designed to maintain continuously updated results over high-velocity event streams. ClickHouse is a columnar OLAP database designed to scan and aggregate large volumes of historical data at high speed. If your primary need is fast historical analytics with acceptable data freshness of minutes or more, ClickHouse is the right tool. If your primary need is sub-second freshness and continuous stream processing, RisingWave is the right tool. Many teams use both.
Can ClickHouse do real-time streaming?
ClickHouse can ingest data from Kafka using its Kafka table engine, and new data becomes queryable after the next insert batch, which can be configured to run on short intervals. However, ClickHouse does not maintain incrementally updated materialized views the way RisingWave does: its materialized views are insert-triggered and append-only friendly, and aggregations over changing data are otherwise recomputed at query time from raw rows. For use cases that require sub-second freshness and continuous incremental computation, ClickHouse is not designed for that workload.
Does RisingWave support ClickHouse as a sink?
Yes. RisingWave has a native ClickHouse sink connector. You can stream processed results from RisingWave directly into ClickHouse tables, making it straightforward to build pipelines where RisingWave handles real-time stream processing and ClickHouse serves historical analytical queries. See the RisingWave ClickHouse sink documentation for configuration details.
How do materialized views differ between RisingWave and ClickHouse?
RisingWave materialized views are incrementally maintained: when a source row changes, only the affected aggregates or join results are recomputed, not the entire view. They are always current and update in milliseconds. ClickHouse materialized views trigger on insert and recompute their query for each new block of inserted data. They work well for append-only streams but do not handle updates or deletes to source data correctly without careful schema design. For workloads where source data changes after initial insert (order status updates, user profile changes), RisingWave materialized views handle this correctly; ClickHouse materialized views require workarounds.
Conclusion
RisingWave and ClickHouse are not competing for the same job. ClickHouse is one of the best columnar OLAP databases available for fast historical analytics. RisingWave is a streaming database that keeps query results continuously current as data arrives. The choice between them is not about performance benchmarks against each other; it is about what kind of problem you are solving.
Key takeaways:
- ClickHouse excels at fast ad-hoc analytical queries over large historical datasets, especially append-only workloads.
- RisingWave excels at continuous stream processing, incrementally maintained materialized views, and sub-second freshness for operational use cases.
- RisingWave handles updates and deletes via CDC naturally; ClickHouse requires careful schema design workarounds.
- Both work well together: RisingWave processes the real-time stream and sinks to ClickHouse, which serves historical analysis.
- SQL experience differs: RisingWave uses PostgreSQL-compatible SQL; ClickHouse uses its own dialect. Both are SQL, but the tooling ecosystem implications are different.
If you are building a real-time analytics stack and trying to decide where each system fits, start with the freshness requirement. If you need answers in milliseconds reflecting events that just happened, that is RisingWave's domain. If you need fast answers over years of historical data where a few minutes of delay is fine, that is ClickHouse's domain. Many production stacks need both.
Ready to try RisingWave? Get started in minutes at docs.risingwave.com/get-started.
Join the RisingWave Slack community to ask questions and connect with other stream processing engineers.