GUIDE

Kafka Stream Processing with SQL

Process Apache Kafka streams using standard SQL instead of Java consumers. RisingWave ingests Kafka topics, transforms data with SQL, creates materialized views, and sinks results — all with exactly-once semantics.

SQL
No Java Required
Write Kafka processing pipelines in SQL instead of hundreds of lines of Java consumer code
Exactly-Once
End-to-End
Automatic exactly-once semantics from Kafka consumption through processing to sink output
Multi-Topic
SQL JOINs
Join multiple Kafka topics with database tables using standard SQL JOIN syntax
1-2 Weeks
Time to Production
Deploy Kafka processing pipelines in one to two weeks, not the 6-12 weeks typical of Java consumer development

Comparison

What Kafka stream processing patterns does RisingWave support?

RisingWave supports all common Kafka stream processing patterns — real-time aggregations, stream-stream joins, stream-table joins, windowed computations, event deduplication, and CDC enrichment — all expressed as SQL queries. Each pattern that would require hundreds of lines of Java becomes a single SQL statement.

Capability         | Kafka Streams    | ksqlDB       | Flink SQL         | RisingWave
Language           | Java API         | KSQL         | SQL               | PostgreSQL SQL
State Management   | RocksDB (manual) | Managed      | RocksDB (tunable) | Automatic
Exactly-Once       | Manual config    | Yes          | Manual config     | Automatic
Query Serving      | No               | Pull queries | No                | Full PostgreSQL
Multi-Source Joins | Kafka only       | Kafka only   | Multi-source      | Multi-source
Operational Cost   | High             | Medium       | High              | Low
Time to Production | 6-12 weeks       | 2-4 weeks    | 4-8 weeks         | 1-2 weeks
  • Real-time aggregations: COUNT, SUM, AVG over tumbling, hopping, and session windows
  • Stream-stream joins: join two Kafka topics on matching keys with time-bounded windows
  • Stream-table enrichment: join Kafka events with dimension tables from PostgreSQL CDC
  • Event deduplication: use materialized views with DISTINCT ON to deduplicate events
  • Pattern detection: identify sequences of events using window functions and self-joins
  • Fan-out processing: one Kafka source feeding multiple materialized views for different consumers
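As a sketch of the stream-stream join pattern above, the following assumes two Kafka topics have already been declared as sources named orders and payments (hypothetical names), each with an event-time column; the join is time-bounded so state can be cleaned up:

```sql
-- Join two Kafka-backed streams on a key, bounded to a one-hour window.
-- Table and column names are illustrative, not from a real deployment.
CREATE MATERIALIZED VIEW paid_orders AS
SELECT
    o.order_id,
    o.amount,
    p.paid_at
FROM orders o
JOIN payments p
    ON o.order_id = p.order_id
   AND p.paid_at BETWEEN o.ordered_at
                     AND o.ordered_at + INTERVAL '1 hour';
```

The BETWEEN bound tells the engine how long to retain join state for unmatched events, which is what a Java implementation would otherwise manage by hand.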

How It Works

How does RisingWave process Kafka streams with SQL?

RisingWave connects to Kafka topics as streaming sources, letting you write standard SQL to transform, aggregate, join, and window your event data. Results are maintained as materialized views that update incrementally with each new Kafka message. No Java code, no consumer management, no state store configuration — just SQL.

Native Kafka Connector

CREATE SOURCE connects to any Kafka cluster. Supports Avro, Protobuf, JSON, and CSV formats with Schema Registry integration.
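A minimal CREATE SOURCE sketch, assuming a JSON-encoded topic named user-events on a broker at kafka:9092 (both hypothetical):

```sql
-- Declare a Kafka topic as a streaming source.
-- Topic, broker address, and schema are illustrative assumptions.
CREATE SOURCE user_events (
    user_id     BIGINT,
    event_type  VARCHAR,
    occurred_at TIMESTAMPTZ
) WITH (
    connector = 'kafka',
    topic = 'user-events',
    properties.bootstrap.server = 'kafka:9092',
    scan.startup.mode = 'earliest'
) FORMAT PLAIN ENCODE JSON;
```

For Avro or Protobuf topics, the inline column list is typically replaced by a Schema Registry reference so the schema is resolved at source creation time.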

SQL Transformations

Filter, map, aggregate, and join Kafka streams using standard SQL. Window functions, CTEs, and subqueries all supported.
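For example, a per-minute tumbling-window aggregation over a Kafka source (here the hypothetical user_events source from above) is one materialized view definition:

```sql
-- Count events per type in one-minute tumbling windows.
-- Results update incrementally as new Kafka messages arrive.
CREATE MATERIALIZED VIEW events_per_minute AS
SELECT
    window_start,
    event_type,
    COUNT(*) AS event_count
FROM TUMBLE(user_events, occurred_at, INTERVAL '1 MINUTE')
GROUP BY window_start, event_type;
```

The view can then be queried with ordinary SELECT statements over the PostgreSQL wire protocol.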

Automatic State Management

No RocksDB tuning or state store configuration. RisingWave manages all intermediate state with automatic checkpointing.

Kafka Sink Output

Write processed results back to Kafka topics, or sink to PostgreSQL, Elasticsearch, Redis, S3, and 20+ destinations.
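A sink sketch, assuming the events_per_minute view above and an output topic events-per-minute (hypothetical names); because an aggregation emits updates, the sink uses upsert encoding keyed on the grouping columns:

```sql
-- Emit materialized view changes back to a Kafka topic as upserts.
-- Topic and broker address are illustrative assumptions.
CREATE SINK events_per_minute_sink FROM events_per_minute
WITH (
    connector = 'kafka',
    topic = 'events-per-minute',
    properties.bootstrap.server = 'kafka:9092',
    primary_key = 'window_start,event_type'
) FORMAT UPSERT ENCODE JSON;
```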

The Problem

Why is processing Kafka streams with Java so complex?

Processing Kafka streams with Java requires writing consumer applications, managing offsets, handling deserialization, implementing windowing logic, managing state stores, and dealing with rebalancing. Even simple aggregations demand hundreds of lines of code, specialized Kafka expertise, and weeks of development before reaching production.

  • Write and maintain Java consumer applications with complex configuration for every processing task
  • Manually manage consumer offsets, commit strategies, and exactly-once transaction boundaries
  • Implement windowing, aggregation, and join logic from scratch using low-level APIs
  • Configure and tune RocksDB state stores for stateful operations in Kafka Streams
  • Handle consumer group rebalancing, partition assignment, and backpressure manually
  • Hire Java engineers with Kafka expertise — a rare and expensive skill combination

Frequently Asked Questions

Can RisingWave consume from multiple Kafka topics simultaneously?
Yes. Each topic is declared as its own source, and sources can be joined or combined in SQL — including with tables from non-Kafka sources such as PostgreSQL CDC.

Does RisingWave support Kafka exactly-once semantics?
Yes. Exactly-once semantics are automatic end to end, from Kafka consumption through stateful processing to sink output, with no manual transaction configuration.

How does RisingWave handle Kafka schema evolution?
RisingWave integrates with Schema Registry for Avro and Protobuf sources, so schemas are resolved from the registry rather than hard-coded into the pipeline.

Can I sink processed results back to Kafka?
Yes. CREATE SINK writes processed results back to Kafka topics, or to PostgreSQL, Elasticsearch, Redis, S3, and 20+ other destinations.

Ready to process Kafka with SQL?

Start building Kafka processing pipelines with SQL in minutes.

Process Kafka with SQL