RisingWave + Redpanda: A High-Performance Kafka-Compatible Setup

TL;DR

Redpanda is a Kafka-compatible streaming broker written in C++ that eliminates the need for ZooKeeper and the JVM. Because RisingWave connects to event brokers using the Kafka protocol, you can drop in Redpanda and keep your RisingWave SQL as it is; only the broker address changes. The result is a leaner, easier-to-operate streaming stack with real-time analytics on top.

What Is Redpanda and Why Consider It?

Apache Kafka is the dominant event streaming platform, but its operational overhead is real. Traditional Kafka deployments require ZooKeeper (or the newer KRaft mode), JVM tuning, and careful management of broker replication state. For many teams, that complexity is the price of admission to reliable event streaming.

Redpanda takes a different approach. Built entirely in C++, Redpanda runs as a single binary per broker with no external dependencies. It implements the Kafka API wire protocol, which means most applications and tools that speak Kafka work with Redpanda unchanged. There is no separate metadata service: Redpanda embeds its own Raft-based consensus internally.

Key differences from Apache Kafka:

No ZooKeeper / no JVM -- Redpanda is a self-contained binary. Kafka 4.0 (released March 2025) removed ZooKeeper entirely in favor of KRaft, but many production deployments still run older versions.
Thread-per-core architecture -- Redpanda pins I/O threads to CPU cores and bypasses the kernel page cache for predictable, low-jitter latency.
Kafka API compatibility -- Producers, consumers, and connectors written for Kafka work with Redpanda without code changes.
Operational simplicity -- Fewer moving parts means smaller teams can manage the cluster.

Independent benchmarks show nuanced results. Redpanda can deliver lower tail latencies under specific workloads, particularly at low-to-medium producer counts. At high producer concurrency or when brokers approach storage limits, Apache Kafka's page-cache design often matches or exceeds Redpanda throughput. The right choice depends on your workload profile. However, for teams that prioritize operational simplicity alongside strong performance, Redpanda is a compelling option.

RisingWave + Redpanda Architecture

RisingWave is a streaming database that ingests data from event brokers and answers queries using continuously updated materialized views. Because RisingWave connects to sources over the Kafka wire protocol, plugging in Redpanda requires no code changes at the RisingWave layer.

┌─────────────────────────────────────────────────────────────┐
│                      Your Applications                       │
│         (producers: microservices, CDC, IoT sensors)        │
└────────────────────────────┬────────────────────────────────┘
                             │  Kafka protocol (port 9092)
                             ▼
┌─────────────────────────────────────────────────────────────┐
│                        Redpanda                              │
│        (Kafka-compatible broker, no ZooKeeper/JVM)          │
│                  Topics: events, orders, logs               │
└────────────────────────────┬────────────────────────────────┘
                             │  Kafka connector (same syntax)
                             ▼
┌─────────────────────────────────────────────────────────────┐
│                       RisingWave                             │
│   Sources → Materialized Views → Serving Layer (SQL)        │
│   • Real-time aggregations                                   │
│   • Joins across streams                                     │
│   • Windowed analytics                                       │
└─────────────────────────────────────────────────────────────┘
                             │
                             ▼
            BI Tools / Dashboards / APIs

Data flows from your producers into Redpanda topics. RisingWave connects to those topics using the same connector = 'kafka' syntax it uses for Apache Kafka. From there, RisingWave maintains materialized views that your dashboards and applications query in real time.

Setting It Up

Step 1: Start Redpanda

The fastest way to run Redpanda locally is with Docker:

docker run -d --name redpanda \
  -p 9092:9092 \
  -p 9644:9644 \
  docker.redpanda.com/redpandadata/redpanda:latest \
  redpanda start \
  --smp 1 \
  --memory 1G \
  --overprovisioned \
  --kafka-addr 0.0.0.0:9092 \
  --advertise-kafka-addr localhost:9092

Port 9092 is the standard Kafka API port. Port 9644 is the Redpanda Admin API for health checks and configuration.
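
Before creating topics or sources, it can help to confirm that both mapped ports accept connections. The sketch below is one hedged way to do that with Python's standard library. The helper names port_open and wait_for_ports are illustrative and not part of Redpanda or RisingWave, and an open TCP port only indicates that the listener is up, not that the broker is fully ready.

```python
import socket
import time


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, and timeouts.
        return False


def wait_for_ports(host: str, ports: list[int], deadline_s: float = 30.0) -> bool:
    """Poll until every port accepts connections or the deadline passes."""
    end = time.monotonic() + deadline_s
    while time.monotonic() < end:
        if all(port_open(host, p) for p in ports):
            return True
        time.sleep(0.5)
    return False
```

Calling wait_for_ports("localhost", [9092, 9644]) after the docker run command would block until both the Kafka API and Admin API ports are reachable, or return False after 30 seconds.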

Create a topic:

docker exec redpanda rpk topic create events --partitions 4 --replicas 1

Step 2: Create a RisingWave Source

Because Redpanda implements the Kafka protocol, the RisingWave CREATE SOURCE statement is identical to what you would use with Apache Kafka. Point properties.bootstrap.server at your Redpanda broker:

CREATE SOURCE redpanda_events (
    id        BIGINT,
    user_id   BIGINT,
    event_type VARCHAR,
    ts        TIMESTAMPTZ
)
WITH (
    connector = 'kafka',
    topic = 'events',
    properties.bootstrap.server = 'localhost:9092',
    scan.startup.mode = 'earliest'
)
FORMAT PLAIN ENCODE JSON;

No Redpanda-specific connector is needed. The connector = 'kafka' value works because Redpanda speaks the same protocol.
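
To have something to query, the events topic needs JSON messages whose keys match the source columns. The following sketch is an illustration rather than part of the original setup: it prints one JSON object per line using only the Python standard library. The make_event helper and the sample values are hypothetical, and the ISO 8601 timestamp format is an assumption about what the TIMESTAMPTZ column will accept.

```python
import json
from datetime import datetime, timezone


def make_event(event_id: int, user_id: int, event_type: str) -> str:
    """Serialize one event as a single-line JSON object.

    Field names mirror the columns of the redpanda_events source:
    id, user_id, event_type, ts.
    """
    return json.dumps({
        "id": event_id,
        "user_id": user_id,
        "event_type": event_type,
        # ISO 8601 with an explicit UTC offset (assumed parseable as TIMESTAMPTZ).
        "ts": datetime.now(timezone.utc).isoformat(),
    })


if __name__ == "__main__":
    # Emit three sample events, one JSON document per line.
    for i, kind in enumerate(["click", "purchase", "click"], start=1):
        print(make_event(i, 100 + i, kind))
```

Saved under a name of your choosing (say make_events.py), the output could be piped into the topic with python make_events.py | docker exec -i redpanda rpk topic produce events, since rpk reads one record per line from standard input by default.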

Step 3: Create Materialized Views

Once the source is defined, you write standard SQL to build continuously maintained views:

-- Count events by type in real time
CREATE MATERIALIZED VIEW event_counts AS
SELECT
    event_type,
    COUNT(*)          AS total,
    COUNT(DISTINCT user_id) AS unique_users
FROM redpanda_events
GROUP BY event_type;

-- Sliding 5-minute window of recent activity
CREATE MATERIALIZED VIEW recent_events AS
SELECT
    event_type,
    COUNT(*) AS count_5min
FROM redpanda_events
WHERE ts > NOW() - INTERVAL '5 minutes'
GROUP BY event_type;

Query these views exactly like regular database tables:

SELECT * FROM event_counts ORDER BY total DESC;
SELECT * FROM recent_events;

RisingWave keeps these results up to date incrementally as new events arrive from Redpanda -- there is no polling or batch refresh cycle.
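
To make "incrementally" concrete, here is a toy Python model of how the event_counts view could be updated one event at a time. It is a conceptual sketch only and says nothing about RisingWave's actual internals; the EventCounts class and the sample events are invented for illustration.

```python
from collections import defaultdict


class EventCounts:
    """Toy incremental version of the event_counts view (not RisingWave internals)."""

    def __init__(self) -> None:
        self.total = defaultdict(int)   # event_type -> COUNT(*)
        self.users = defaultdict(set)   # event_type -> distinct user_id values

    def apply(self, event: dict) -> None:
        """Fold one new event into the running aggregates."""
        kind = event["event_type"]
        self.total[kind] += 1
        self.users[kind].add(event["user_id"])

    def rows(self) -> list[tuple[str, int, int]]:
        """Current view contents, ordered by total descending, then by name."""
        out = [(k, self.total[k], len(self.users[k])) for k in self.total]
        return sorted(out, key=lambda r: (-r[1], r[0]))


view = EventCounts()
for ev in [
    {"user_id": 100, "event_type": "click"},
    {"user_id": 100, "event_type": "click"},
    {"user_id": 101, "event_type": "purchase"},
]:
    view.apply(ev)  # constant work per event; earlier events are never rescanned
print(view.rows())  # [('click', 2, 1), ('purchase', 1, 1)]
```

The point of the model is that each arriving event touches only the aggregate for its own group, so the cost of keeping the view fresh depends on the rate of new events rather than on the total history.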

Verified SQL (RisingWave localhost)

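The following self-contained SQL, which uses a plain table in place of the Kafka source, was verified against RisingWave 2.8.0. Run it in a fresh database, because it reuses the event_counts name from Step 3: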

-- Table simulating Kafka/Redpanda message contents
CREATE TABLE events (
    id         BIGINT,
    user_id    BIGINT,
    event_type VARCHAR,
    ts         TIMESTAMPTZ
);

INSERT INTO events VALUES
    (1, 100, 'click', NOW()),
    (2, 101, 'purchase', NOW());

CREATE MATERIALIZED VIEW event_counts AS
SELECT event_type, COUNT(*) FROM events GROUP BY event_type;

SELECT * FROM event_counts;
-- Returns:
--  event_type | count
-- ------------+-------
--  click      |     1
--  purchase   |     1

Performance Considerations

Where Redpanda Shines

Redpanda's thread-per-core model avoids context-switching overhead by pinning work to dedicated CPU cores, which makes latency more predictable at the cost of reserving those cores for the broker. In benchmark scenarios with moderate producer counts (under ~20 parallel producers), Redpanda has been reported to deliver lower p99 and p99.9 tail latencies than Kafka on equivalent hardware. If your use case involves latency-sensitive pipelines -- think fraud detection, real-time pricing, or live dashboards -- Redpanda's architecture aligns well.

Where Kafka Holds Its Own

For very high producer concurrency (50+ parallel producers) or workloads that sustain maximum throughput for many hours, Kafka's Linux page cache design has a proven track record. Kafka 4.0's KRaft mode also removes the ZooKeeper dependency, narrowing one of Redpanda's operational advantages.

Sizing the RisingWave Layer

Regardless of which broker you choose, the compute bottleneck typically shifts to RisingWave once you build complex materialized views. Some rough starting points, to be validated against your own workload:

Workload                            RisingWave configuration
Simple aggregations (COUNT, SUM)    4 CPU, 8 GB RAM
Multi-stream joins                  8 CPU, 16 GB RAM
High-cardinality windowed views     16 CPU, 32 GB RAM

Start small and use SELECT * FROM rw_catalog.rw_streaming_jobs to inspect running jobs and their parallelism settings.

End-to-End Latency Budget

An illustrative budget for a typical pipeline (rough estimates, not measurements):

Producer → Redpanda:    1-5 ms   (broker write latency)
Redpanda → RisingWave:  5-20 ms  (consumer fetch interval)
RisingWave processing:  1-10 ms  (incremental computation)
Query response:         1-5 ms   (serving layer)
─────────────────────────────────
Total end-to-end:       ~8-40 ms

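For use cases requiring sub-10 ms end-to-end, reduce fetch batching on the consumer side rather than on the broker: Kafka clients reading from Redpanda can set fetch.min.bytes=1 and fetch.max.wait.ms=0, and RisingWave sources expose the corresponding setting as the properties.fetch.wait.max.ms source option.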
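
The arithmetic behind the budget is easy to check. The short Python sketch below sums the per-stage ranges listed above (the dictionary keys are illustrative labels, not configuration names) and reports which stage contributes most in the worst case, which is why the fetch settings are the first thing to tune.

```python
# Per-stage latency ranges in milliseconds, copied from the budget above.
STAGES = {
    "producer_to_redpanda": (1, 5),
    "redpanda_to_risingwave": (5, 20),
    "risingwave_processing": (1, 10),
    "query_response": (1, 5),
}


def total_budget(stages: dict[str, tuple[int, int]]) -> tuple[int, int]:
    """Sum best-case and worst-case latencies across all stages."""
    best = sum(lo for lo, _ in stages.values())
    worst = sum(hi for _, hi in stages.values())
    return best, worst


def dominant_stage(stages: dict[str, tuple[int, int]]) -> str:
    """Name of the stage with the largest worst-case latency."""
    return max(stages, key=lambda name: stages[name][1])


best, worst = total_budget(STAGES)
print(f"end-to-end: {best}-{worst} ms; largest contributor: {dominant_stage(STAGES)}")
# end-to-end: 8-40 ms; largest contributor: redpanda_to_risingwave
```

With these figures the stages add up to 8-40 ms, and the consumer fetch from Redpanda accounts for half of the worst case.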

Migrating from Kafka to Redpanda

If you already run RisingWave with Apache Kafka, migrating the broker layer to Redpanda is straightforward:

1. Deploy Redpanda alongside your existing Kafka cluster.
2. Mirror topics from Kafka into Redpanda using MirrorMaker 2 or a comparable replication tool. (Redpanda's remote read replicas mirror topics between Redpanda clusters, so they do not cover this step.)
3. Update properties.bootstrap.server in your RisingWave CREATE SOURCE statements to point at Redpanda.
4. Validate outputs by comparing materialized view results from both sources during a cutover window.
5. Decommission Kafka once traffic is fully migrated.

Because the Kafka protocol is identical, step 3 is the only change required in your RisingWave layer.
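
The "validate outputs" step above can be as simple as comparing per-key aggregates from the two pipelines. The sketch below assumes you have already fetched the rows of the same view from each pipeline into plain dictionaries; diff_views, the tolerance parameter, and the sample numbers are all hypothetical.

```python
def diff_views(kafka_rows: dict[str, int], redpanda_rows: dict[str, int],
               tolerance: int = 0) -> dict[str, tuple[int, int]]:
    """Return keys whose values differ by more than tolerance.

    Each argument maps a group key (for example event_type) to its aggregate
    value, as read from the view built on each broker. Missing keys count as 0.
    """
    mismatches = {}
    for key in sorted(kafka_rows.keys() | redpanda_rows.keys()):
        a = kafka_rows.get(key, 0)
        b = redpanda_rows.get(key, 0)
        if abs(a - b) > tolerance:
            mismatches[key] = (a, b)
    return mismatches


# Hypothetical snapshots of event_counts taken from each pipeline.
from_kafka = {"click": 1200, "purchase": 85, "signup": 40}
from_redpanda = {"click": 1198, "purchase": 85}

print(diff_views(from_kafka, from_redpanda, tolerance=5))
# {'signup': (40, 0)}
```

A small tolerance allows for events that are still in flight when the two snapshots are taken; a key that is missing entirely on one side, like signup here, usually points to a topic that was not mirrored.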

Key Takeaways

Redpanda is fully compatible with RisingWave via the standard Kafka connector.
You gain operational simplicity (no ZooKeeper, no JVM) with no changes to your streaming SQL.
Performance trade-offs favor Redpanda at low-to-medium concurrency and Kafka at extreme throughput; benchmark your specific workload.
Migrating from Kafka to Redpanda is a broker-layer change -- RisingWave SQL stays untouched.

FAQ

Does RisingWave have a Redpanda-specific connector?

No -- and you do not need one. Redpanda implements the Kafka API, so connector = 'kafka' in RisingWave works with any Redpanda deployment.

Can I use Redpanda Schema Registry with RisingWave?

Yes. Redpanda ships a built-in Schema Registry compatible with the Confluent Schema Registry API. When using Avro or Protobuf encoding, set schema.registry = 'http://redpanda-host:8081' in the FORMAT ... ENCODE options of your RisingWave source definition.

What Redpanda features does RisingWave not use?

RisingWave connects as a standard Kafka consumer. Redpanda-specific features like the Admin API, WASM data transforms, or Tiered Storage are managed at the Redpanda layer and are transparent to RisingWave.

Is Redpanda production-ready?

Yes. Redpanda is used in production at many companies and offers a managed cloud service (Redpanda Cloud) for teams that want a fully hosted broker without operational overhead.

Can I run both Kafka and Redpanda as sources in the same RisingWave cluster?

Yes. RisingWave allows multiple sources defined independently. You can have some sources pointing at Kafka brokers and others pointing at Redpanda brokers simultaneously.

What to Read Next

Ready to try it? Start RisingWave for free or follow the quickstart guide to have your first materialized view running in minutes.