Running Kafka Without ZooKeeper: A Step-by-Step Guide

Apache Kafka has long relied on ZooKeeper for distributed coordination. The Kafka ecosystem, however, has moved to a ZooKeeper-less architecture known as KRaft (Kafka Raft), a shift that simplifies Kafka management and improves scalability. Running Kafka without ZooKeeper eliminates the need for an external coordination system: KRaft mode leverages the Raft consensus algorithm for leader election and metadata replication. Kafka 3.5 officially deprecated ZooKeeper, making KRaft the new standard for metadata management.

Kafka without ZooKeeper: Background Information

What is Kafka?

Apache Kafka serves as a distributed event streaming platform. Kafka enables the building of real-time data pipelines and streaming applications. Kafka's architecture consists of producers, consumers, topics, partitions, and brokers.

Overview of Kafka's architecture

Kafka's architecture revolves around a distributed system. Producers send records to topics. Consumers read records from topics. Topics are divided into partitions for scalability and fault tolerance. Brokers manage the storage and retrieval of records. Kafka clusters consist of multiple brokers working together.

Traditional role of ZooKeeper in Kafka

ZooKeeper has historically played a crucial role in Kafka's architecture. ZooKeeper managed broker metadata, including configuration updates and cluster membership. ZooKeeper also handled leader election for partitions. ZooKeeper ensured high availability by maintaining the state of the Kafka cluster.

Introduction to KRaft Mode

KRaft mode represents a significant shift in Kafka's architecture. KRaft eliminates the need for ZooKeeper. KRaft uses the Raft consensus algorithm for managing metadata and leader election.

What is KRaft?

KRaft stands for Kafka Raft. KRaft introduces a new way of handling metadata management within Kafka. KRaft integrates the Raft consensus protocol directly into Kafka brokers. KRaft allows Kafka to operate independently of ZooKeeper.

Benefits of KRaft over ZooKeeper

KRaft offers several advantages over ZooKeeper. KRaft simplifies Kafka's deployment and management. KRaft reduces operational complexity by eliminating an external coordination system. KRaft enhances scalability by distributing metadata management across Kafka brokers. KRaft improves fault tolerance with the Raft consensus algorithm.

Current state and future of KRaft

KRaft mode was introduced in Kafka 2.8 and declared production-ready in Kafka 3.3, with ZooKeeper officially deprecated in Kafka 3.5. KRaft is now the standard for new Kafka deployments and continues to evolve with ongoing improvements and optimizations. The Kafka community actively works on enhancing KRaft's capabilities, and the future of Kafka without ZooKeeper looks promising with KRaft leading the way.

Prerequisites for Kafka without ZooKeeper

System Requirements

Hardware requirements

Running Apache Kafka without ZooKeeper requires specific hardware configurations. Ensure that each Kafka broker has sufficient CPU, memory, and disk space. A minimum of 4 CPU cores and 8 GB of RAM per broker is recommended. Disk space should be ample to store logs and data, with SSDs preferred for better performance. Network bandwidth must support high throughput, with at least 1 Gbps network interfaces.

Software requirements

Kafka in KRaft mode needs a compatible software environment. Use a Linux-based operating system for optimal performance. Kafka requires a Java Development Kit (JDK); version 11 or higher is recommended, as Java 8 support is deprecated. Ensure that the system has the latest updates and patches installed. Kafka also requires proper configuration of system limits, such as file descriptors and process limits.

Preparing Your Environment

Installing Java

Java installation is crucial for running Kafka. Follow these steps to install Java:

  1. Download the JDK from the official Oracle website or an open-source alternative like OpenJDK.
  2. Install the JDK by following the instructions for your specific operating system.
  3. Verify the installation by running the command java -version in the terminal. Ensure that the output shows the correct version of Java.
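The version check in step 3 can be scripted. Below is a small sketch that parses the first line of the java -version banner and handles both the legacy 1.x scheme and the modern scheme; the function name and the sample banner strings are illustrative.

```shell
# Extract the Java major version from a `java -version` banner line.
# Handles modern strings ("11.0.19") and legacy ones ("1.8.0_292").
parse_java_major() {
  ver=$(echo "$1" | awk -F'"' '{print $2}')  # text between the quotes
  major=${ver%%.*}                           # part before the first dot
  if [ "$major" = "1" ]; then                # legacy scheme: 1.8 -> 8
    ver=${ver#1.}
    major=${ver%%.*}
  fi
  echo "$major"
}

parse_java_major 'openjdk version "11.0.19" 2023-04-18'  # prints 11
parse_java_major 'java version "1.8.0_292"'              # prints 8
```

On a real host, feed it the live banner, e.g. parse_java_major "$(java -version 2>&1 | head -n 1)", and confirm the result is 11 or higher.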

Setting up necessary directories

Proper directory setup ensures organized Kafka operations. Create directories for Kafka logs and data storage. Follow these steps:

  1. Create a directory for Kafka binaries, for example, /opt/kafka.
  2. Create a directory for Kafka logs, for example, /var/log/kafka.
  3. Create a directory for Kafka data, for example, /var/lib/kafka.

Ensure that these directories have appropriate permissions for the Kafka user. Proper directory setup helps in managing Kafka efficiently and avoiding potential issues.
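The steps above can be captured in a short script. This is a minimal sketch: the PREFIX variable is an illustration added here so the layout can be rehearsed in a scratch directory; for a real install, run it as root with PREFIX empty and additionally chown the directories to the Kafka service user.

```shell
#!/bin/sh
# Create the directory layout used in this guide under $PREFIX.
# An empty PREFIX reproduces the real paths (/opt/kafka, /var/log/kafka,
# /var/lib/kafka); a scratch PREFIX lets you rehearse without root.
PREFIX=${PREFIX:-$HOME/kafka-sandbox}

mkdir -p "$PREFIX/opt/kafka"      # Kafka binaries
mkdir -p "$PREFIX/var/log/kafka"  # application logs
mkdir -p "$PREFIX/var/lib/kafka"  # topic and metadata storage
chmod 750 "$PREFIX/var/lib/kafka" # data dir readable only by owner/group

ls -d "$PREFIX/var/lib/kafka"
```

For a real install, follow up with something like chown -R kafka:kafka on all three directories so the broker process owns them.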

Step-by-Step Setup for Kafka without ZooKeeper

Downloading and Installing Kafka

Downloading Kafka binaries

To begin, download the Kafka binaries from the official Apache Kafka website. Choose a recent release: KRaft mode first shipped in Kafka 2.8.0 and became production-ready in Kafka 3.3. Ensure that the downloaded file matches the system architecture and operating system.

Extracting and setting up Kafka

After downloading, extract the Kafka binaries to the desired directory. Use the following command on a Linux-based system:

tar -xzf kafka_2.13-3.5.0.tgz -C /opt/kafka

This command extracts the Kafka files into the /opt/kafka directory. Verify the extraction by listing the contents of the directory. Ensure that the necessary files and folders are present.

Configuring Kafka for KRaft Mode

Modifying server.properties file

Configuration of Kafka for KRaft mode requires modifications to the server.properties file. Open the file located in the config directory within the Kafka installation path. Add or modify the following properties:

process.roles=broker,controller
node.id=1
controller.quorum.voters=1@localhost:9093
listeners=PLAINTEXT://localhost:9092,CONTROLLER://localhost:9093
controller.listener.names=CONTROLLER
inter.broker.listener.name=PLAINTEXT
log.dirs=/var/lib/kafka

These settings configure the Kafka broker to operate in KRaft mode. The process.roles property specifies the roles of the node (here a combined broker and controller). The controller.quorum.voters property defines the quorum for the controller election, and controller.listener.names tells Kafka which listener the controller uses; this property is required in KRaft mode.
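For production, the controller quorum typically spans three or five nodes so that it can tolerate failures. A hypothetical three-node quorum (the host names are placeholders) would look like:

```properties
# Each entry is <node.id>@<host>:<controller-port>; all three nodes list
# the same voters, and each node's node.id must match its own entry.
controller.quorum.voters=1@kafka1:9093,2@kafka2:9093,3@kafka3:9093
```

With three voters, the quorum stays available if any single controller node fails.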

Setting up KRaft-specific configurations

Additional KRaft-specific configurations ensure proper operation. Add the following properties to the server.properties file:

metadata.log.dir=/var/lib/kafka/metadata
num.replica.fetchers=1
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1

These properties configure the metadata storage and replication settings. The metadata.log.dir property specifies the directory for storing metadata logs. The replication factor properties ensure data redundancy and fault tolerance.
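Putting the two fragments together, a minimal single-node server.properties for KRaft mode looks like this (a sketch for local testing; controller.listener.names is required in KRaft mode, and the replication factors of 1 are only appropriate for a single broker):

```properties
# Minimal single-node KRaft configuration (combined broker + controller).
process.roles=broker,controller
node.id=1
controller.quorum.voters=1@localhost:9093
listeners=PLAINTEXT://localhost:9092,CONTROLLER://localhost:9093
controller.listener.names=CONTROLLER
inter.broker.listener.name=PLAINTEXT
log.dirs=/var/lib/kafka
metadata.log.dir=/var/lib/kafka/metadata
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
```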

Initializing the KRaft Metadata

Running the KRaft initialization command

Initialization of the KRaft metadata is crucial for the setup. Run the following command to initialize the metadata:

bin/kafka-storage.sh format -t <UUID> -c config/server.properties

Replace <UUID> with a unique identifier for the cluster; generate one with bin/kafka-storage.sh random-uuid. This command formats the metadata storage and prepares the Kafka broker for operation.
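A quick way to confirm the format step succeeded: kafka-storage.sh writes a meta.properties file, containing the cluster ID, into each configured log directory. The helper below checks for it; the function name is illustrative.

```shell
# Return success if a log directory has been formatted for KRaft,
# i.e. contains a meta.properties file with a cluster.id entry.
check_formatted() {
  dir=$1
  if [ -f "$dir/meta.properties" ]; then
    grep '^cluster.id=' "$dir/meta.properties"
  else
    echo "not formatted: $dir" >&2
    return 1
  fi
}
```

For example, check_formatted /var/lib/kafka prints the cluster.id line after a successful format and fails otherwise.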

Verifying the metadata setup

Verification of the metadata setup ensures successful initialization. Check each directory listed in log.dirs (and the metadata.log.dir directory) for a meta.properties file recording the cluster ID, and review the output of the format command for errors. Once verified, the broker is ready to start in KRaft mode, managing metadata and leader election internally.

Starting Kafka in KRaft Mode

Starting the Kafka broker

To start the Kafka broker in KRaft mode, execute the following command:

bin/kafka-server-start.sh config/server.properties

This command initiates the Kafka broker using the configurations specified in the server.properties file. Ensure that the terminal displays log entries indicating the broker's startup process. The logs should show messages about loading configurations, initializing components, and starting services.

Checking the broker status

Verifying the broker status is essential to ensure proper operation. Use the following command to check the broker status:

bin/kafka-broker-api-versions.sh --bootstrap-server localhost:9092

This command queries the broker for its API versions, confirming that the broker is running and responsive. The output should list supported API versions and indicate that the broker is operational.

Additionally, monitor the Kafka application logs, which the startup scripts write to the logs/ directory of the installation by default (the log.dirs setting in server.properties holds partition data, not application logs). Look for entries indicating successful startup and readiness to handle requests. The logs should not contain error messages or warnings that could indicate configuration issues.

Running Kafka without ZooKeeper requires careful monitoring of the broker's health and performance. Regularly check the logs and use Kafka's built-in tools to ensure smooth operation. The transition to KRaft mode simplifies Kafka management by eliminating the need for an external coordination system, enhancing scalability and fault tolerance.

Testing and Validation of Kafka without ZooKeeper

Creating and Managing Topics

Creating a new topic

Creating topics in Kafka without ZooKeeper involves using the kafka-topics.sh script. Execute the following command to create a new topic:

bin/kafka-topics.sh --create --topic my-new-topic --bootstrap-server localhost:9092 --partitions 3 --replication-factor 1

This command creates a topic named my-new-topic with three partitions and a replication factor of one. Ensure that the terminal displays a confirmation message indicating successful topic creation.

Listing and describing topics

Listing and describing topics help verify the setup. Use the following command to list all topics:

bin/kafka-topics.sh --list --bootstrap-server localhost:9092

This command outputs a list of all existing topics. To describe a specific topic, use the following command:

bin/kafka-topics.sh --describe --topic my-new-topic --bootstrap-server localhost:9092

This command provides detailed information about the specified topic, including partition details and leader assignments.

Producing and Consuming Messages

Producing messages to a topic

Producing messages to a topic involves using the kafka-console-producer.sh script. Execute the following command to start a producer:

bin/kafka-console-producer.sh --topic my-new-topic --bootstrap-server localhost:9092

After running the command, type messages into the terminal. Each line represents a separate message sent to the topic. Press Enter to send each message.

Consuming messages from a topic

Consuming messages from a topic requires the kafka-console-consumer.sh script. Use the following command to start a consumer:

bin/kafka-console-consumer.sh --topic my-new-topic --from-beginning --bootstrap-server localhost:9092

This command starts a consumer that reads messages from the beginning of the topic. The terminal displays each consumed message, verifying that the producer and consumer operate correctly.

Monitoring and Troubleshooting

Using Kafka logs for troubleshooting

Monitoring Kafka logs is essential for identifying issues. Logs provide insights into broker operations and potential problems. Note that log.dirs holds topic data, not application logs: Kafka's scripts write application logs to the directory named by the LOG_DIR environment variable, which defaults to the logs/ folder inside the installation. With LOG_DIR set to /var/log/kafka, view the main log with:

tail -f /var/log/kafka/server.log

This command continuously displays new log entries, helping identify errors or warnings. Look for messages related to broker startup, configuration issues, or network problems.
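Beyond tailing, it helps to summarize how noisy a log is. The small awk helper below counts ERROR and WARN lines from any log stream; the function name and the sample lines are illustrative.

```shell
# Count ERROR and WARN lines on stdin and print a one-line summary.
summarize_log() {
  awk '/ERROR/ {e++} /WARN/ {w++} END {printf "errors=%d warnings=%d\n", e+0, w+0}'
}

# Sample run against three fabricated log lines:
printf 'INFO started\nWARN disk slow\nERROR connection lost\n' | summarize_log
# prints: errors=1 warnings=1
```

Against a real broker, run summarize_log < /var/log/kafka/server.log to get a quick health snapshot before digging into individual entries.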

Common issues and solutions

Common issues may arise during the setup and operation of Kafka without ZooKeeper. Here are some typical problems and their solutions:

  • Broker not starting: Verify the server.properties file for correct configurations. Check the logs for error messages indicating missing or incorrect properties.
  • Topic creation failure: Ensure that the bootstrap-server address is correct. Verify network connectivity between the client and broker.
  • Message production or consumption issues: Confirm that the topic exists and has the correct configurations. Check the logs for any connectivity or partition assignment errors.

Regular monitoring and timely troubleshooting ensure smooth operation of Kafka without ZooKeeper. Properly configured logs and tools help maintain system health and performance.

The guide detailed the process of running Kafka without ZooKeeper, covering system requirements, environment preparation, Kafka installation, configuration for KRaft mode, metadata initialization, and broker startup. Running Kafka in KRaft mode offers several benefits, including simplified deployment, improved scalability, and enhanced fault tolerance. The elimination of ZooKeeper reduces operational complexity, making Kafka management more straightforward. Readers are encouraged to explore further and provide feedback on their experiences with Kafka in KRaft mode.
