Kafka Connect Explained: Architecture and Key Concepts
Kafka Connect is a framework for streaming data between Apache Kafka and external systems using pre-built connectors. Source connectors pull data into Kafka; sink connectors push data out of Kafka.
Architecture
External Source → Source Connector → Kafka Topic → Sink Connector → External Destination
Connector Types
| Type | Direction | Examples |
|---|---|---|
| Source | External → Kafka | Debezium (CDC), JDBC, S3, MongoDB |
| Sink | Kafka → External | Elasticsearch, S3, JDBC, Iceberg |
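As a rough sketch of how a source connector is deployed, the snippet below registers a Debezium PostgreSQL connector through the Kafka Connect REST API. The connector name, hostname, port, and credentials are placeholders for illustration; only `connector.class` and the property keys follow Debezium's actual configuration schema.

```shell
# Register a hypothetical Debezium Postgres source connector.
# localhost:8083 is the default Connect REST endpoint; all
# database values below are placeholders.
curl -X POST http://localhost:8083/connectors \
  -H "Content-Type: application/json" \
  -d '{
    "name": "inventory-source",
    "config": {
      "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
      "tasks.max": "1",
      "database.hostname": "postgres",
      "database.port": "5432",
      "database.user": "debezium",
      "database.password": "secret",
      "database.dbname": "inventory",
      "topic.prefix": "inventory"
    }
  }'
```

Once registered, change events from the database appear in Kafka topics prefixed with `inventory.`, ready for a sink connector to consume.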
Key Components
- Worker: JVM process that runs connectors and their tasks
- Task: Unit of parallelism within a connector; tasks are distributed across workers
- Converter: Handles serialization between Kafka and Connect's internal data format (Avro, JSON, Protobuf)
- Transform (SMT): Lightweight, single-message transformations applied in flight
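To illustrate how converters and SMTs show up in practice, the sink configuration below combines a JSON converter with the built-in `InsertField` transform. The connector name, Elasticsearch URL, and field values are placeholders; the converter and transform classes are Kafka's own.

```shell
# Hypothetical Elasticsearch sink showing a converter and an SMT.
# The SMT stamps each record's value with a static "pipeline" field
# before it is written to the external system.
curl -X POST http://localhost:8083/connectors \
  -H "Content-Type: application/json" \
  -d '{
    "name": "orders-es-sink",
    "config": {
      "connector.class": "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector",
      "tasks.max": "2",
      "topics": "orders",
      "connection.url": "http://elasticsearch:9200",
      "value.converter": "org.apache.kafka.connect.json.JsonConverter",
      "value.converter.schemas.enable": "false",
      "transforms": "addPipeline",
      "transforms.addPipeline.type": "org.apache.kafka.connect.transforms.InsertField$Value",
      "transforms.addPipeline.static.field": "pipeline",
      "transforms.addPipeline.static.value": "kafka-connect"
    }
  }'
```

Note that `tasks.max: 2` lets the worker split the topic's partitions across two tasks, which is how Connect scales a single connector.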
When to Use Kafka Connect vs Alternatives
| Approach | When to Use |
|---|---|
| Kafka Connect | Need broad connector ecosystem, Kafka as central bus |
| RisingWave | Need CDC + processing + serving in one system |
| Flink CDC | Need CDC + complex processing |
Frequently Asked Questions
Is Kafka Connect free?
The framework and many connectors are open source. Confluent offers 120+ managed connectors on Confluent Cloud. Some enterprise connectors require a Confluent subscription.
Can RisingWave replace Kafka Connect?
For PostgreSQL and MySQL CDC, yes — RisingWave's native CDC eliminates the need for Kafka Connect + Debezium. For other sources (MongoDB, S3, SaaS apps), Kafka Connect remains necessary.