When engineering managers ask "what does real-time analytics actually cost?", they usually get one answer: the cloud infrastructure bill. That number is real, but it typically represents only about a third of what you will actually spend. The other two-thirds hides in engineering hours, on-call rotations, specialist hiring, and the operational complexity of stitching together four or five separate systems.
Teams evaluating the total cost of ownership of a real-time analytics stack in 2026 most commonly weigh three architectural patterns. The first is the traditional pipeline: Kafka as the message bus, Apache Flink for stream processing, and a dedicated OLAP store such as Apache Pinot or ClickHouse as the serving layer. The second is a consolidated approach: Kafka feeding into RisingWave, a streaming database that processes and serves results in a single system. The third is a fully managed path: Confluent Cloud handling Kafka and Flink together, plus a separate OLAP layer.
This article computes TCO across all three approaches for a concrete reference workload: 100,000 events per second, 20 streaming jobs, and a small data platform team of 3-5 engineers. All cost figures use AWS us-east-1 on-demand pricing as of April 2026 unless noted otherwise.
What Goes Into a Real-Time Analytics Stack
Before comparing costs, it helps to be precise about what "real-time analytics stack" means. Most production deployments need four capabilities:
- Ingestion: Accepting a continuous stream of events from applications, databases (via CDC), or IoT devices.
- Processing: Applying transformations, joins, aggregations, and filters to the raw stream.
- Serving: Storing the processed results and making them queryable with low latency.
- Observability: Monitoring lag, throughput, and freshness across the pipeline.
The traditional approach assigns a dedicated system to each of these functions. That specialization brings flexibility but compounds operational complexity. Each boundary between systems is a potential point of failure, a latency tax, and a new operational domain for your team to master.
The Reference Workload
The scenarios below use a consistent reference workload to make comparisons fair:
| Dimension | Value |
| --- | --- |
| Peak throughput | 100,000 events/sec |
| Event size (avg) | 1 KB |
| Total state size | 1 TB (across all jobs) |
| Streaming jobs | 20 concurrent |
| Query concurrency | 50 dashboard users |
| Team size | 3 engineers, 1 manager |
| SLA | p99 query latency under 200 ms |
This workload is representative of a mid-size e-commerce or fintech company running real-time dashboards, fraud detection, and customer-facing analytics simultaneously.
Stack 1: Kafka + Flink + OLAP (Pinot or ClickHouse)
This is the most commonly adopted pattern at companies with existing streaming infrastructure. The stack layers three independently managed systems, each with its own operational model.
Architecture Overview
```
Application Layer
        |
Apache Kafka (ingestion + buffering)
        |
Apache Flink (transformation + enrichment)
        |
Apache Pinot or ClickHouse (OLAP serving)
        |
Dashboard / BI Tools
```
Kafka buffers events and decouples producers from consumers. Flink reads from Kafka, applies business logic, and writes results to the OLAP store. Pinot or ClickHouse handles analytical queries from dashboards and APIs.
Infrastructure Costs
Kafka cluster (MSK on AWS)
At 100K events/sec averaging 1 KB each, you are moving roughly 100 MB/sec of data. AWS MSK with 6 broker nodes on kafka.m5.4xlarge instances provides the necessary throughput and replication:
| Component | Spec | Monthly Cost |
| --- | --- | --- |
| MSK brokers | 6x kafka.m5.4xlarge | $2,808 |
| MSK storage | 10 TB EBS gp3 | $800 |
| Data transfer (inter-AZ) | ~200 GB/day | $180 |
| Kafka subtotal | | $3,788/mo |
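The sizing above starts from simple arithmetic, and the derived daily volume is useful when deciding how much retention the provisioned broker storage actually buys you. A back-of-envelope sketch (workload numbers come from the reference table; the replication factor of 3 is an assumption, matching the common Kafka default):

```python
# Back-of-envelope throughput sizing for the reference workload.
# EVENTS_PER_SEC and EVENT_SIZE_KB come from the reference workload table;
# REPLICATION_FACTOR is an assumption (3 is the common Kafka default).
EVENTS_PER_SEC = 100_000
EVENT_SIZE_KB = 1
REPLICATION_FACTOR = 3

mb_per_sec = EVENTS_PER_SEC * EVENT_SIZE_KB / 1000        # 100 MB/s sustained
gb_per_day = mb_per_sec * 86_400 / 1000                   # 8,640 GB/day ingested
replicated_gb_per_day = gb_per_day * REPLICATION_FACTOR   # what brokers actually write

print(f"{mb_per_sec:.0f} MB/s, {gb_per_day / 1000:.1f} TB/day raw, "
      f"{replicated_gb_per_day / 1000:.1f} TB/day after replication")
```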
Flink cluster (self-managed on EC2)
Processing 100K events/sec across 20 jobs requires substantial compute. Each Flink TaskManager runs multiple parallel task slots. For high availability, you need JobManagers in active-standby configuration:
| Component | Spec | Monthly Cost |
| --- | --- | --- |
| JobManagers (HA pair) | 2x m6i.2xlarge | $560 |
| TaskManagers | 12x r6i.4xlarge (16 vCPU, 128 GB) | $10,584 |
| ZooKeeper ensemble | 3x m6i.large | $210 |
| Local SSDs for RocksDB | 1 TB gp3 per TM x 12 | $960 |
| S3 checkpoint storage | ~2 TB | $46 |
| Flink subtotal | | $12,360/mo |
Large r6i.4xlarge instances are needed because Flink's JVM heap and RocksDB state backend together consume significant memory. At 1 TB of total state, each TaskManager holds roughly 85 GB of state plus the JVM heap overhead.
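The per-TaskManager figure is simple division, but it is worth making explicit when sizing memory:

```python
# Each TaskManager's share of the 1 TB of total job state. The state
# itself lives in RocksDB on local SSD; the instance's 128 GB of RAM
# still has to cover the JVM heap plus RocksDB block caches on top.
TOTAL_STATE_GB = 1024   # 1 TB, in GiB
TASK_MANAGERS = 12

state_per_tm_gb = TOTAL_STATE_GB / TASK_MANAGERS
print(f"~{state_per_tm_gb:.0f} GB of state per TaskManager")
```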
OLAP layer (ClickHouse on EC2)
ClickHouse provides fast analytical query execution. A three-shard, two-replica cluster on memory-optimized instances handles the 50-user query concurrency requirement:
| Component | Spec | Monthly Cost |
| --- | --- | --- |
| ClickHouse servers | 6x r6i.4xlarge | $5,292 |
| ClickHouse storage | 5 TB gp3 (replicated) | $800 |
| Coordination (ZooKeeper or ClickHouse Keeper) | 3x m6i.large | $210 |
| Load balancer | ALB | $50 |
| ClickHouse subtotal | | $6,352/mo |
Stack 1 infrastructure total: $22,500/mo
Operational Costs
Operating three independent distributed systems requires significantly more engineering effort than any single cost line reveals.
Flink operations consume the most time. Checkpoint management alone is a persistent concern: tuning checkpoint intervals, sizing RocksDB block caches, managing savepoints for job upgrades, and debugging stuck checkpoints during traffic spikes. On-call incidents typically involve Flink, often at inconvenient hours. Plan for:
| Activity | Hours/Month | Cost at $150/hr |
| --- | --- | --- |
| Flink checkpoint tuning and debugging | 30 | $4,500 |
| Kafka consumer lag monitoring and rebalancing | 10 | $1,500 |
| ClickHouse query optimization and schema changes | 15 | $2,250 |
| Capacity planning and cluster scaling | 12 | $1,800 |
| Incident response (on-call) | 20 | $3,000 |
| Upgrade management (all three systems) | 15 | $2,250 |
| Ops subtotal | 102 hrs | $15,300/mo |
The $150/hr figure reflects fully loaded engineering cost (salary + benefits + overhead) for senior engineers who understand distributed systems. Junior engineers cost less but take longer and produce more incidents.
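Since the subtotal is just hours times the fully loaded rate, a small model makes it easy to re-run with your own rate (activity hours are copied from the table above):

```python
# Monthly operational cost model for Stack 1, using the fully loaded
# $150/hr rate discussed above. Swap in your own rate and hours.
HOURLY_RATE = 150

ops_hours = {
    "Flink checkpoint tuning and debugging": 30,
    "Kafka consumer lag monitoring and rebalancing": 10,
    "ClickHouse query optimization and schema changes": 15,
    "Capacity planning and cluster scaling": 12,
    "Incident response (on-call)": 20,
    "Upgrade management (all three systems)": 15,
}

total_hours = sum(ops_hours.values())
monthly_ops_cost = total_hours * HOURLY_RATE
print(f"{total_hours} hrs/month -> ${monthly_ops_cost:,}/month")
```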
Development Costs
Building and maintaining 20 streaming pipelines in this stack requires three distinct skill sets: Kafka administration, Flink Java or SQL, and ClickHouse SQL plus schema design.
Writing a new Flink job typically takes 2-4 weeks from design to production. The process involves connector configuration, serialization schema definitions, watermark strategy selection, state TTL decisions, unit testing with Flink's mini-cluster test harness, and deployment configuration. Schema changes at the ClickHouse layer must be coordinated with the Flink sink output format, creating a cross-system dependency that slows iteration.
| Development Activity | Annual Cost |
| --- | --- |
| Initial pipeline development (20 jobs) | $180,000 |
| Ongoing pipeline modifications and new features | $60,000 |
| Cross-system schema change coordination | $24,000 |
| Onboarding new engineers (3-month ramp) | $18,000 |
| Dev subtotal | $282,000/yr ($23,500/mo) |
Stack 1 Total TCO
| Cost Category | Monthly |
| --- | --- |
| Infrastructure (Kafka + Flink + ClickHouse) | $22,500 |
| Operations | $15,300 |
| Development | $23,500 |
| Total | $61,300/mo |
| Annual | $735,600/yr |
Time to Value
From the decision to build to first dashboard in production: 12-16 weeks. The critical path runs through Kafka cluster setup (1-2 weeks), Flink cluster provisioning and job development (6-8 weeks), ClickHouse schema design and data loading (3-4 weeks), and integration testing across all three systems (2-3 weeks), with some of these phases overlapping.
Stack 2: Kafka + RisingWave
This stack replaces both Flink and the OLAP layer with RisingWave, a streaming database that processes streaming data and serves the results over a PostgreSQL-compatible interface. Kafka remains as the ingestion layer because many organizations already have it or need its ecosystem of producers and connectors.
Architecture Overview
```
Application Layer
        |
Apache Kafka (ingestion + buffering)
        |
RisingWave (stream processing + serving)
        |
Dashboard / BI Tools (via PostgreSQL wire protocol)
```
RisingWave reads from Kafka topics, applies transformations via SQL, maintains incrementally updated materialized views, and serves queries directly. There is no separate OLAP store, no sink connector between processing and serving, and no schema synchronization to manage.
Infrastructure Costs
Kafka cluster (same as Stack 1)
The Kafka layer stays identical since both stacks use it for ingestion:
| Component | Monthly Cost |
| --- | --- |
| MSK cluster (6 brokers, as above) | $3,788 |
RisingWave cluster (self-managed on Kubernetes or EC2)
RisingWave's cloud-native architecture separates compute from storage. Compute nodes are stateless - all 1 TB of persistent state lives in S3 at object storage prices rather than on provisioned SSDs.
For 100K events/sec with 20 materialized views serving 50 concurrent queries:
| Component | Spec | Monthly Cost |
| --- | --- | --- |
| RisingWave compute nodes | 6x m6i.4xlarge (16 vCPU, 64 GB) | $3,888 |
| Meta node (coordinator) | 2x m6i.xlarge (HA) | $280 |
| Compactor nodes | 3x m6i.2xlarge | $840 |
| etcd (meta store) | 3x t3.medium | $91 |
| S3 state storage | ~1 TB | $23 |
| S3 request costs | | $15 |
| RisingWave subtotal | | $5,137/mo |
The compute node sizing is materially smaller than the Flink equivalent for two reasons. First, RisingWave is written in Rust with no JVM overhead, delivering better throughput per CPU core. Second, RisingWave's shared state architecture allows materialized views with common sub-expressions to share intermediate computation, reducing duplicated state.
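The storage line also deserves emphasis. At the list prices implied by the cost tables ($0.023/GB-month for S3, $0.08/GB-month for gp3), gp3 is roughly 3.5x the per-GB price, and provisioning a full 1 TB volume per TaskManager multiplies the gap again:

```python
# State storage cost: S3 (RisingWave) vs provisioned gp3 SSDs (Flink's
# RocksDB backend). Prices are the us-east-1 list prices implied by the
# cost tables above.
S3_PER_GB_MONTH = 0.023
GP3_PER_GB_MONTH = 0.08
STATE_GB = 1000          # 1 TB of state
TASK_MANAGERS = 12       # each Flink TM gets its own 1 TB gp3 volume

s3_monthly = STATE_GB * S3_PER_GB_MONTH                     # ~$23
gp3_monthly = STATE_GB * GP3_PER_GB_MONTH * TASK_MANAGERS   # $960

print(f"S3 state: ${s3_monthly:.0f}/mo vs local SSDs: ${gp3_monthly:.0f}/mo")
```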
Stack 2 infrastructure total: $8,925/mo
That is 60% less than Stack 1's infrastructure costs. The difference is structural: no ClickHouse cluster, no expensive local SSDs, and a smaller compute footprint due to Rust efficiency.
Operational Costs
Replacing Flink plus ClickHouse with RisingWave eliminates entire categories of operational work:
- No checkpoint tuning: RisingWave continuously persists state to S3 using its Hummock storage engine. Recovery from failures takes seconds, not minutes or hours.
- No state backend selection: No RocksDB vs heap state backend decisions, no compaction tuning.
- No sink connector management: Results are available for query in the same system that processes them.
- No cross-system schema synchronization: Schema changes apply at the SQL layer.
- Familiar tooling: RisingWave exposes the PostgreSQL wire protocol, so your existing PostgreSQL monitoring tools, connection poolers (pgBouncer, pgcat), and operational runbooks work unchanged.
| Activity | Hours/Month | Cost at $150/hr |
| --- | --- | --- |
| RisingWave monitoring and tuning | 12 | $1,800 |
| Kafka consumer lag monitoring | 8 | $1,200 |
| Incident response (on-call) | 8 | $1,200 |
| Upgrade management | 5 | $750 |
| Schema and pipeline management | 10 | $1,500 |
| Ops subtotal | 43 hrs | $6,450/mo |
The incident rate drops substantially because there are fewer failure modes. A stateless compute node failure in RisingWave recovers automatically in seconds - no engineer page required.
Development Costs
RisingWave uses PostgreSQL-compatible SQL for everything. Here is what the same revenue analytics pipeline looks like in SQL, verified against RisingWave 2.8.0:
```sql
-- Create materialized views - this is the entire pipeline definition
CREATE MATERIALIZED VIEW tco_revenue_by_region AS
SELECT
    region,
    COUNT(*) AS total_orders,
    SUM(amount) AS total_revenue,
    AVG(amount) AS avg_order_value
FROM tco_orders
WHERE status = 'completed'
GROUP BY region;
```
Query the result immediately from any PostgreSQL client:
```sql
SELECT * FROM tco_revenue_by_region ORDER BY total_revenue DESC;
```

```
 region  | total_orders | total_revenue | avg_order_value
---------+--------------+---------------+-----------------
 us-east |            3 |        598.98 |          199.66
 apac    |            1 |        149.99 |          149.99
 eu-west |            2 |        134.50 |           67.25
(3 rows)
```
The view updates incrementally as new rows arrive from Kafka. No separate write-to-ClickHouse step, no sink connector, no schema sync. The data is live and queryable the moment the CREATE MATERIALIZED VIEW statement completes.
For a multi-stage funnel analysis tracking views, purchases, and returns:
```sql
CREATE MATERIALIZED VIEW tco_event_funnel AS
SELECT
    region,
    COUNT(*) FILTER (WHERE event_type = 'view') AS views,
    COUNT(*) FILTER (WHERE event_type = 'purchase') AS purchases,
    COUNT(*) FILTER (WHERE event_type = 'return') AS returns,
    SUM(amount) FILTER (WHERE event_type = 'purchase') AS gross_revenue
FROM tco_events
GROUP BY region;

SELECT * FROM tco_event_funnel ORDER BY gross_revenue DESC NULLS LAST;
```

```
 region  | views | purchases | returns | gross_revenue
---------+-------+-----------+---------+---------------
 us-east |     1 |         3 |       0 |        598.98
 us-west |     1 |         1 |       0 |        299.00
 apac    |     0 |         1 |       1 |        149.99
 eu-west |     0 |         2 |       0 |        134.50
(4 rows)
```
And a session-level page activity report:
```sql
CREATE MATERIALIZED VIEW tco_page_activity AS
SELECT
    page,
    COUNT(DISTINCT user_id) AS unique_users,
    COUNT(*) AS total_views,
    COUNT(DISTINCT session_id) AS sessions
FROM tco_page_views
GROUP BY page;

SELECT * FROM tco_page_activity ORDER BY unique_users DESC;
```

```
 page         | unique_users | total_views | sessions
--------------+--------------+-------------+----------
 /pricing     |            2 |           2 |        2
 /get-started |            1 |           1 |        1
 /blog        |            1 |           1 |        1
 /docs        |            1 |           1 |        1
(4 rows)
```
Each of the examples above is a complete, working analytics pipeline in 5-10 lines of SQL. The Flink equivalent of each would be 200-400 lines of Java, a Maven build configuration, a separate sink connector job, and a ClickHouse DDL statement.
The talent pool for SQL-based development is dramatically broader than for Flink Java engineers. Any backend engineer or data analyst can write and maintain these pipelines. Onboarding takes days instead of months.
| Development Activity | Annual Cost |
| --- | --- |
| Initial pipeline development (20 jobs) | $48,000 |
| Ongoing modifications and new features | $24,000 |
| Schema and pipeline changes | $12,000 |
| Onboarding new engineers | $6,000 |
| Dev subtotal | $90,000/yr ($7,500/mo) |
Stack 2 Total TCO
| Cost Category | Monthly |
| --- | --- |
| Infrastructure (Kafka + RisingWave) | $8,925 |
| Operations | $6,450 |
| Development | $7,500 |
| Total | $22,875/mo |
| Annual | $274,500/yr |
Time to Value
From decision to first dashboard in production: 3-5 weeks. RisingWave connects to Kafka via a CREATE SOURCE statement. Materialized views define the entire processing and serving layer. Dashboard tools connect directly via the PostgreSQL wire protocol with no additional integration required.
Stack 3: Confluent Cloud (Managed Kafka + Flink + ClickHouse)
Confluent Cloud handles Kafka and Flink as a single managed service, eliminating cluster operations for those two layers. You still need a serving layer for analytical queries - ClickHouse Cloud is a natural complement.
Infrastructure Costs
Confluent Cloud (Kafka + Flink)
Confluent Cloud bills Kafka in CKUs (Confluent Units) and Flink in CFUs (Confluent Flink Units). At 100K events/sec:
| Component | Pricing | Monthly Cost |
| --- | --- | --- |
| Kafka Standard cluster | 4 CKUs at $0.75/CKU-hr | $2,190 |
| Kafka storage | 10 TB | $900 |
| Flink compute pool | 40 CFUs at $0.21/CFU-hr | $6,131 |
| Kafka networking (ingress/egress) | ~200 MB/s sustained | $1,200 |
| Connectors | Schema Registry + 5 connectors | $800 |
| Confluent subtotal | | $11,221/mo |
CFU consumption is notoriously difficult to predict. Complex joins and windowed aggregations over 1 TB of state consume more CFUs than simple projections. The 40-CFU estimate assumes moderately complex jobs. Teams frequently find their CFU consumption 1.3-2x higher than initial estimates after the first month of production traffic.
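Given that unpredictability, it is safer to budget a range than a point estimate. A quick sketch using the table's rate and a 730-hour billing month (the 1.3-2x band mirrors the overrun pattern described above):

```python
# Confluent Flink cost estimate with an overrun band. The 40-CFU base
# matches the table above; the 1.3-2x band reflects the common gap
# between estimated and observed CFU consumption.
CFU_RATE = 0.21   # $/CFU-hour
HOURS = 730       # billable hours in a month
BASE_CFUS = 40

base = BASE_CFUS * CFU_RATE * HOURS
low, high = base * 1.3, base * 2.0
print(f"base ${base:,.0f}/mo, likely range ${low:,.0f}-${high:,.0f}/mo")
```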
ClickHouse Cloud (analytical serving)
Confluent Flink writes results to ClickHouse Cloud, which serves dashboard queries:
| Component | Spec | Monthly Cost |
| --- | --- | --- |
| ClickHouse Cloud | 3 nodes, 8 vCPU each | $3,600 |
| ClickHouse Cloud storage | 5 TB | $250 |
| Data transfer | Flink to ClickHouse | $300 |
| ClickHouse Cloud subtotal | | $4,150/mo |
Stack 3 infrastructure total: $15,371/mo
Operational Costs
Confluent Cloud eliminates Kafka and Flink cluster administration. You do not manage broker rebalancing, ZooKeeper, checkpoints, or Flink job deployments. This is the genuine value proposition: your team focuses on Flink SQL logic, not infrastructure.
However, three operational areas remain:
Flink SQL debugging: Confluent's managed Flink surfaces a subset of Flink's internals. When a job behaves unexpectedly or throughput degrades, the visibility into execution plans and operator state is limited compared to self-managed Flink. Diagnosing a CFU spike requires reading Confluent's metrics, not direct JVM profiling.
ClickHouse operations: ClickHouse Cloud reduces operational burden significantly versus self-managed ClickHouse, but schema evolution, index tuning, and query optimization remain your responsibility.
Cross-system integration: The Flink-to-ClickHouse boundary still requires a sink connector, schema synchronization, and end-to-end monitoring. When something breaks, is it Flink, the connector, or ClickHouse?
| Activity | Hours/Month | Cost at $150/hr |
| --- | --- | --- |
| Flink SQL development and debugging | 20 | $3,000 |
| ClickHouse query optimization | 12 | $1,800 |
| Cross-system monitoring and alerting | 10 | $1,500 |
| Incident response | 12 | $1,800 |
| Connector and schema management | 8 | $1,200 |
| Ops subtotal | 62 hrs | $9,300/mo |
Development Costs
Confluent Flink uses Flink SQL, which is more accessible than the Java DataStream API but less expressive than PostgreSQL-compatible SQL. The developer experience is web-console-first, with limited local development tooling. Writing Flink SQL locally and deploying to Confluent Cloud requires careful testing because behavioral differences between local Flink and Confluent's managed version can surface in production.
The ClickHouse integration layer adds coordination overhead. Every schema change in the streaming layer requires a corresponding ClickHouse migration - a manual step that must be coordinated across teams.
| Development Activity | Annual Cost |
| --- | --- |
| Initial pipeline development (20 jobs) | $120,000 |
| Ongoing modifications and new features | $48,000 |
| Cross-system schema coordination | $24,000 |
| Onboarding new engineers | $12,000 |
| Dev subtotal | $204,000/yr ($17,000/mo) |
Stack 3 Total TCO
| Cost Category | Monthly |
| --- | --- |
| Infrastructure (Confluent Cloud + ClickHouse Cloud) | $15,371 |
| Operations | $9,300 |
| Development | $17,000 |
| Total | $41,671/mo |
| Annual | $500,052/yr |
Time to Value
From decision to first dashboard in production: 6-10 weeks. Confluent Cloud accelerates the Kafka and Flink setup significantly, but the ClickHouse integration and Flink SQL development still require meaningful effort. The managed services reduce infrastructure time but not pipeline logic time.
Full TCO Comparison
Here is the side-by-side view across all three stacks for the 100K events/sec, 20-job reference workload:
| Cost Category | Kafka + Flink + ClickHouse | Kafka + RisingWave | Confluent Cloud + ClickHouse |
| --- | --- | --- | --- |
| Infrastructure | $22,500/mo | $8,925/mo | $15,371/mo |
| Operations | $15,300/mo | $6,450/mo | $9,300/mo |
| Development | $23,500/mo | $7,500/mo | $17,000/mo |
| Monthly Total | $61,300 | $22,875 | $41,671 |
| Annual Total | $735,600 | $274,500 | $500,052 |
| Time to Value | 12-16 weeks | 3-5 weeks | 6-10 weeks |
Where the Money Actually Goes
The infrastructure-only comparison misses the dominant cost drivers:
| Stack | Infrastructure % | Operations % | Development % |
| --- | --- | --- | --- |
| Kafka + Flink + ClickHouse | 37% | 25% | 38% |
| Kafka + RisingWave | 39% | 28% | 33% |
| Confluent Cloud + ClickHouse | 37% | 22% | 41% |
Across all three stacks, infrastructure represents roughly one-third of TCO. Development costs are similarly sized to infrastructure. Operations is the hidden middle layer that compounds as complexity increases.
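The shares in the table follow directly from the monthly category totals; a few lines of Python reproduce them:

```python
# Cost-category shares per stack, reproducing the percentage table above
# from the monthly (infrastructure, operations, development) totals.
stacks = {
    "Kafka + Flink + ClickHouse": (22_500, 15_300, 23_500),
    "Kafka + RisingWave": (8_925, 6_450, 7_500),
    "Confluent Cloud + ClickHouse": (15_371, 9_300, 17_000),
}

shares = {}
for name, (infra, ops, dev) in stacks.items():
    total = infra + ops + dev
    shares[name] = tuple(round(100 * x / total) for x in (infra, ops, dev))
    print(f"{name}: infra {shares[name][0]}%, ops {shares[name][1]}%, "
          f"dev {shares[name][2]}%")
```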
The Kafka + RisingWave stack is 63% cheaper than the traditional Kafka + Flink + ClickHouse stack annually. Compared to Confluent Cloud, it is 45% cheaper. The savings are structural, not the result of cutting corners: fewer systems means less infrastructure, less operational surface area, and a faster, cheaper development loop.
Annual Savings by Switching
| Comparison | Annual Savings |
| --- | --- |
| Kafka + Flink + ClickHouse vs Kafka + RisingWave | $461,100/yr |
| Confluent Cloud + ClickHouse vs Kafka + RisingWave | $225,552/yr |
For a company running five such workloads (one per major product surface), the Kafka + RisingWave approach saves $2.3M per year versus the traditional stack.
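These savings figures follow mechanically from the annual totals, and parameterizing the workload count makes the fleet-level claim easy to check:

```python
# Annual savings from choosing Kafka + RisingWave, computed from the
# annual totals above.
ANNUAL_TRADITIONAL = 735_600   # Kafka + Flink + ClickHouse
ANNUAL_CONFLUENT = 500_052     # Confluent Cloud + ClickHouse
ANNUAL_RISINGWAVE = 274_500    # Kafka + RisingWave

savings_vs_traditional = ANNUAL_TRADITIONAL - ANNUAL_RISINGWAVE   # $461,100/yr
savings_vs_confluent = ANNUAL_CONFLUENT - ANNUAL_RISINGWAVE       # $225,552/yr

WORKLOADS = 5   # e.g. one per major product surface
fleet_savings = WORKLOADS * savings_vs_traditional                # ~$2.3M/yr

print(f"vs traditional: ${savings_vs_traditional:,}/yr, "
      f"vs Confluent: ${savings_vs_confluent:,}/yr, "
      f"{WORKLOADS} workloads: ${fleet_savings:,}/yr")
```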
Hidden Costs That Don't Appear in Spreadsheets
Specialist Hiring Premium
Flink engineers with production experience command significant salary premiums. A senior Flink engineer costs $180,000-$250,000 in total annual compensation in US tech markets as of 2026. The supply is constrained because Flink's complexity creates a steep learning curve and long time-to-competence.
SQL engineers with PostgreSQL experience are a commodity in comparison - not a criticism, but an economic reality. Hiring for a RisingWave-based stack draws from the database engineering talent pool, which is orders of magnitude larger and less expensive.
Lock-In and Switching Costs
Flink jobs written in Java or the DataStream API represent engineering capital that does not transfer to other platforms. An organization with 20 production Flink jobs has invested 20-40 engineer-months in implementation that is largely non-portable. This raises the switching cost for future migrations.
RisingWave uses standard PostgreSQL-compatible SQL. The query logic you write transfers to any SQL-compatible system. The intellectual investment is portable.
Confluent Cloud creates multi-layer lock-in: Confluent-specific Kafka extensions, the Confluent Schema Registry format, and Confluent's Flink SQL dialect. Migrating away means replacing both the message broker and the stream processor simultaneously.
Disaster Recovery Overhead
Flink's coupled compute-storage architecture requires DR replication at both layers. Replicating 1 TB of Flink state to a secondary region means running a parallel cluster to hold it, roughly doubling the Flink infrastructure cost for DR.
RisingWave's state lives in S3, which offers 11 nines of durability and native cross-region replication. DR adds only storage replication costs (roughly 2x on a component that represents less than 0.5% of total TCO) without requiring a parallel compute cluster.
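Under the cost-table assumptions, the DR gap can be made concrete. The sketch below ignores S3 cross-region transfer fees and any standby compute you might keep warm, so treat both numbers as rough:

```python
# Rough DR cost comparison under the cost-table assumptions above.
FLINK_INFRA_MONTHLY = 12_360   # Flink subtotal from Stack 1
RW_S3_STATE_MONTHLY = 23       # S3 state storage line from Stack 2
RW_TOTAL_MONTHLY = 22_875      # Stack 2 monthly TCO

flink_dr_added = FLINK_INFRA_MONTHLY    # parallel standby cluster to hold state
rw_dr_added = RW_S3_STATE_MONTHLY       # a second region's copy of S3 state

print(f"Flink DR adds ~${flink_dr_added:,}/mo; RisingWave DR adds "
      f"~${rw_dr_added}/mo ({100 * rw_dr_added / RW_TOTAL_MONTHLY:.1f}% of stack TCO)")
```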
Choosing the Right Stack for Your Team
The right answer depends on factors beyond cost alone.
Choose Kafka + Flink + ClickHouse if:
- You have existing Flink engineers with production expertise and established operational playbooks.
- Your streaming jobs require Flink's DataStream API for custom operators that cannot be expressed in SQL.
- You need Flink's ecosystem of exactly-once connectors for specialized sinks (Iceberg, Delta Lake, custom systems).
- You run workloads that benefit from Flink's advanced windowing capabilities or complex event processing (CEP) patterns.
Choose Kafka + RisingWave if:
- Your streaming pipelines can be expressed in SQL (true for roughly 80% of real-world use cases).
- You want to minimize operational overhead and time-to-value.
- Your team has SQL expertise but limited Flink experience.
- You need direct SQL query access to streaming results without an additional OLAP layer.
- You want to run the system on RisingWave Cloud for fully managed operations, or self-host with minimal operational burden via the Kubernetes Helm chart.
Choose Confluent Cloud + ClickHouse if:
- You already use Confluent Cloud for Kafka and want to keep that relationship.
- You value the breadth of Confluent's connector ecosystem and Schema Registry.
- You are comfortable with the Confluent-specific extensions and accept the associated lock-in.
- Your team prefers managed services even at a premium cost.
FAQ
What is the total cost of ownership for a real-time analytics stack?
The total cost of ownership for a real-time analytics stack includes infrastructure (compute, storage, networking), operational labor (monitoring, tuning, incident response, upgrades), and development costs (building and maintaining pipelines). For a 100K events/sec workload, these three categories are roughly equal in size. Infrastructure accounts for about one-third of TCO, with operations and development making up the remaining two-thirds. The exact split depends on stack complexity: simpler architectures with fewer systems require less operational and development effort, which is where the Kafka + RisingWave approach saves the most.
How does Kafka + RisingWave compare to Confluent Cloud for real-time analytics costs?
Kafka + RisingWave costs roughly 45% less than Confluent Cloud plus a separate OLAP layer for equivalent workloads. Confluent charges separately for Kafka (CKU-hours) and Flink (CFU-hours), creating a double billing structure on top of the OLAP serving layer cost. RisingWave consolidates stream processing and query serving into a single system, eliminating the separate serving layer entirely and requiring only one compute bill. For a 100K events/sec workload, this difference is approximately $225,000 per year.
Is self-managed Kafka + Flink + ClickHouse ever cheaper than using RisingWave?
Only if your team already employs dedicated Flink and ClickHouse specialists who would otherwise be idle. In that case, the marginal cost of running the traditional stack is limited to infrastructure. But for a team building a new real-time analytics capability from scratch, the hiring costs, onboarding time, and ongoing operational overhead of three separate systems consistently make the traditional stack the most expensive option. Published benchmark comparisons also suggest that RisingWave's Rust-based engine delivers comparable or better throughput per dollar than Flink for SQL-expressible workloads.
How quickly can a team move from zero to production with each stack?
The Kafka + RisingWave stack reaches production in 3-5 weeks for a typical 100K events/sec deployment: Kafka setup (1 week), RisingWave deployment and source configuration (1 week), pipeline development via SQL (1-2 weeks), and dashboard integration via the PostgreSQL wire protocol (1 week), with some phases running in parallel. Confluent Cloud reduces Kafka setup time but Flink SQL development and ClickHouse integration still require 6-10 weeks total. The traditional self-managed Kafka + Flink + ClickHouse stack requires 12-16 weeks because each system has its own setup, tuning, and integration requirements.
Conclusion
The real-time analytics stack decision is a TCO decision more than a technology decision. The infrastructure bill is visible and easy to budget. The engineering hours are harder to count and far more expensive in aggregate.
For a 100K events/sec workload in 2026:
- Kafka + Flink + ClickHouse costs approximately $735,600/year. Infrastructure is 37% of that. Operations and development are 63%.
- Confluent Cloud + ClickHouse costs approximately $500,000/year. Managed services cut infrastructure and ops costs but development complexity remains high.
- Kafka + RisingWave costs approximately $274,500/year. Consolidating processing and serving into one system creates structural savings across all three cost categories.
The time-to-value advantage compounds the cost advantage. A team that ships real-time analytics in 3-5 weeks instead of 12-16 weeks has 8-11 additional weeks of shipping capacity, which translates to features, customer value, and competitive position.
The cases where Flink remains the right answer are real but narrow: custom operator requirements, advanced CEP patterns, or large existing Flink codebases with established operational teams. For new projects where SQL can express the required logic, the TCO case for a streaming database like RisingWave is substantial.
Ready to calculate TCO for your specific workload? Try RisingWave Cloud free for 7 days with no credit card required. Sign up here.
Join our Slack community to ask questions, share your workload details, and get help sizing a deployment for your use case.

