Kafka Partitioning: Keys, Strategies, and Best Practices
Kafka partitioning determines how events are distributed across partitions within a topic. The partition key controls which partition receives each event. Choosing the right key affects ordering, throughput, and even distribution.
How Partitioning Works
Producer → hash(key) % num_partitions → Partition N → Consumer
Events with the same key always go to the same partition → guaranteed ordering for that key.
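The routing step above can be sketched in a few lines. Kafka's default partitioner hashes the key bytes with murmur2; the md5-based stand-in below is not Kafka's actual hash, but it illustrates the same `hash(key) % num_partitions` behavior deterministically:

```python
import hashlib

def pick_partition(key: bytes, num_partitions: int) -> int:
    """Stand-in for Kafka's default partitioner (which uses murmur2):
    a stable hash of the key bytes, modulo the partition count."""
    digest = hashlib.md5(key).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions

# Same key -> same partition on every call, so per-key ordering holds
assert pick_partition(b"order-1001", 12) == pick_partition(b"order-1001", 12)
```

Because the mapping is a pure function of the key and the partition count, ordering per key is free as long as neither changes.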
Partitioning Strategies
| Strategy | Key | Ordering | Distribution | Use Case |
|---|---|---|---|---|
| Entity key | user_id, order_id | Per-entity | Depends on cardinality | Most common |
| Round-robin | null (no key) | None | Even | Logs, metrics |
| Time-based | timestamp bucket | Per-time-window | Even if uniform | Time series |
| Composite | region + user_id | Per-entity-region | More even | Multi-tenant |
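The key choices in the table mostly come down to how the key string is built before it is handed to the producer. A minimal sketch, with hypothetical helper names (not part of any Kafka client API):

```python
def entity_key(user_id: str) -> bytes:
    # Per-entity ordering: all of one user's events share a partition
    return f"user:{user_id}".encode()

def time_bucket_key(epoch_s: int, bucket_s: int = 3600) -> bytes:
    # Per-time-window ordering: events in the same hour share a partition
    return str(epoch_s // bucket_s).encode()

def composite_key(region: str, user_id: str) -> bytes:
    # Multi-tenant: combining fields raises cardinality and evens spread
    return f"{region}:{user_id}".encode()

assert composite_key("eu", "42") == b"eu:42"
```

Round-robin needs no helper: passing a `None` key lets the producer spread events across partitions itself.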
Best Practices
- Choose high-cardinality keys — user_id (millions of values) distributes evenly; country (~200 values) may cause hot partitions
- Never use a low-cardinality key as the sole partition key
- Match partitions to consumers — 12 partitions → max 12 consumers in a group
- Over-partition initially — easier to have too many than too few
Frequently Asked Questions
How many Kafka partitions should I have?
Start with 12-24 partitions for moderate workloads. Scale based on consumer parallelism and throughput requirements. Each partition is consumed by at most one consumer in a consumer group.
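A common rule of thumb for sizing is to take the target throughput and divide by what a single producer and a single consumer can each sustain, then keep the larger result. The numbers below are placeholders, not benchmarks:

```python
import math

def min_partitions(target_mb_s: float,
                   per_producer_mb_s: float,
                   per_consumer_mb_s: float) -> int:
    """Rule-of-thumb lower bound: enough partitions that neither the
    producer side nor the consumer side becomes the bottleneck."""
    return math.ceil(max(target_mb_s / per_producer_mb_s,
                         target_mb_s / per_consumer_mb_s))

# 100 MB/s target, 10 MB/s per producer, 5 MB/s per consumer -> 20
assert min_partitions(100, 10, 5) == 20
```

The consumer side usually dominates, since each partition is consumed by at most one consumer in a group.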
What happens if I change the partition count?
Adding partitions doesn't redistribute existing data. New events may route differently. This can break ordering guarantees for existing keys. Avoid changing partition counts in production when ordering matters.
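Why ordering breaks is visible directly in the routing math: `hash(key) % num_partitions` changes for many keys the moment `num_partitions` changes. A sketch with the same md5 stand-in hash:

```python
import hashlib

def pick_partition(key: str, num_partitions: int) -> int:
    # Deterministic stand-in hash (Kafka itself uses murmur2)
    digest = hashlib.md5(key.encode()).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions

# After growing a topic from 12 to 16 partitions, many keys move:
keys = [f"user-{i}" for i in range(100)]
moved = sum(pick_partition(k, 12) != pick_partition(k, 16) for k in keys)
```

Events for a moved key now land on a new partition while its older events sit on the old one, so consumers can observe them out of order.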

