Vetora logo
šŸ“ŠMessaging & Streaming

Kafka Architecture

Apache Kafka is a distributed, durable, high-throughput event streaming platform built around an append-only commit log. Its architecture of brokers, topics, partitions, and consumer groups enables millions of events per second with strong ordering guarantees per partition, configurable durability, and the ability to replay the full event history.

Overview

Apache Kafka was created at LinkedIn in 2010 to solve the problem of connecting all systems in a large organization to all data streams in real time. It was open-sourced in 2011 and became an Apache top-level project in 2012. Today it is the de facto standard for event streaming at scale.

Kafka's core abstraction is the **commit log**: an append-only, ordered, immutable sequence of records. A **topic** is a named log, and each topic is divided into one or more **partitions**. Each partition is an independent log stored on a single broker, with records assigned sequential offsets starting from 0. Partitions are the unit of parallelism: more partitions means more consumers can process data concurrently.

**Producers** write records to topics. Each record has an optional key; records with the same key are routed to the same partition (via hash(key) % num_partitions), ensuring ordering for related events. Records without a key are distributed round-robin across partitions.

**Consumer groups** coordinate consumption. Each partition in a topic is assigned to exactly one consumer within a group. If you have 12 partitions and 4 consumers in a group, each consumer handles 3 partitions. Adding a 5th consumer triggers rebalancing. If a consumer fails, its partitions are reassigned to surviving members. Multiple consumer groups can read the same topic independently, each maintaining its own offset -- this is how Kafka provides both fan-out and parallel processing.

**Brokers** store partitions on disk. Each partition has a leader broker (handles all reads and writes) and zero or more follower replicas on other brokers. The replication factor (typically 3) determines how many copies exist. Followers pull from the leader; once a follower is caught up, it joins the in-sync replica set (ISR). Producers can choose acknowledgment levels: acks=0 (fire-and-forget), acks=1 (leader only), or acks=all (all ISR replicas). With acks=all and min.insync.replicas=2, no acknowledged message is lost even if one broker dies.

Kafka's storage model is unique: records are retained for a configurable period (hours, days, or forever) regardless of whether they have been consumed. This enables replay: a new consumer group can start from offset 0 and reprocess the entire history, or a failing consumer can reset its offset and reprocess missed records. This replay capability is what distinguishes Kafka from traditional message queues.

Key Points
  • 1Topics are split into partitions -- the unit of parallelism and ordering. Records within a partition are strictly ordered by offset. Across partitions, there is no total ordering guarantee.
  • 2Partition keys route related records to the same partition, ensuring ordering for a given entity (e.g., all events for user_123 go to the same partition). Without a key, records are distributed round-robin.
  • 3Consumer groups assign each partition to exactly one consumer. Adding consumers (up to the partition count) increases parallelism. More consumers than partitions means some sit idle.
  • 4Replication (typically factor 3) with ISR (in-sync replicas) provides fault tolerance. With acks=all and min.insync.replicas=2, Kafka guarantees no data loss on single-broker failure.
  • 5Retention is time-based or size-based, not consumption-based. Records persist even after all consumers read them. This enables replay, reprocessing, and new consumer groups reading historical data.
  • 6KRaft mode (Kafka 3.3+) replaces ZooKeeper with an internal Raft-based metadata quorum, simplifying operations and removing ZooKeeper as a single point of failure.
Simple Example

Tracking Page Views

A website tracks page views by publishing each view to a Kafka topic 'page-views' with the user ID as the partition key. With 12 partitions, views for the same user always land in the same partition, preserving chronological order per user. Three consumer groups subscribe: (1) a real-time dashboard group with 4 consumers aggregating views per second, (2) an analytics group with 12 consumers (one per partition) writing to a data warehouse, and (3) a recommendations group with 6 consumers updating user interest profiles. Each group processes every view independently, at its own pace.

Real-World Examples

LinkedIn

LinkedIn runs over 100 Kafka clusters processing 7+ trillion messages per day across 7+ petabytes of data. Every user action (profile view, connection, post, message) flows through Kafka. The activity data feeds real-time notifications, the news feed, analytics, ad targeting, and anti-abuse systems. Each system runs as an independent consumer group.

Netflix

Netflix uses Kafka as the backbone of its data pipeline, processing over 1 trillion events per day. Every streaming session, UI interaction, and playback quality metric flows through Kafka. The data feeds real-time A/B testing, content recommendations, infrastructure monitoring, and billing. Netflix contributed the open-source Kafka consumer rebalance protocol improvements.

Walmart

Walmart uses Kafka for real-time inventory tracking across 4,700+ stores. When an item is sold at a register, scanned in a warehouse, or restocked on a shelf, the event is published to Kafka. Consumer groups power real-time inventory counts, replenishment triggers, pricing updates, and the online pickup-availability feature that shows customers what is in stock at their local store.

Trade-Offs
AspectDescription
Throughput vs LatencyKafka achieves massive throughput by batching records in the producer (linger.ms, batch.size) and writing sequentially to disk. Batching increases throughput but adds latency. For sub-millisecond latency requirements, a traditional message broker (RabbitMQ, Redis Streams) may be more appropriate. Kafka's sweet spot is high throughput with latency in the 5-50ms range.
Partition Count vs Operational ComplexityMore partitions means more parallelism but also more file handles, longer leader elections, and slower consumer group rebalancing. LinkedIn recommends ≤4,000 partitions per broker and ≤200,000 per cluster. Over-partitioning is a common mistake that increases end-to-end latency during rebalances.
Retention Duration vs Storage CostLong retention (days or weeks) enables replay and reprocessing but requires significant disk space. At 1 million messages/sec at 1KB each, one day of retention is ~86 GB per partition replica. Tiered storage (Kafka 3.6+) offloads old segments to object storage (S3), keeping hot data on broker SSDs and cold data in cheap storage.
Ordering vs ParallelismKafka guarantees ordering only within a partition. If you need total ordering across all events, you need a single partition -- which limits you to one consumer per group. The compromise is to partition by entity key (user_id, order_id), giving per-entity ordering with multi-partition parallelism.
Case Study

The LinkedIn Log: From Database Synchronization to Event Streaming Platform

Scenario

By 2010, LinkedIn had dozens of data systems (search index, social graph, recommendation engine, Hadoop warehouse) that all needed copies of the same data. Point-to-point pipelines created an O(N²) integration problem: every new system needed a pipeline from every data source. Changes to one system's schema would break multiple downstream consumers.

Solution

Jay Kreps, Neha Narkhede, and Jun Rao designed Kafka as a unified commit log that all systems could publish to and consume from. Each data source publishes change events to Kafka topics. Each consuming system runs a consumer group that reads from the relevant topics at its own pace. The log serves as the single source of truth, decoupling producers from consumers. LinkedIn published this concept as 'The Log' in a 2013 blog post that became one of the most influential distributed systems articles.

Outcome

LinkedIn reduced its integration complexity from O(N²) to O(N): each system connects to Kafka once. New systems are onboarded by creating a consumer group -- no changes to producers. Kafka grew from an internal tool to the most widely adopted event streaming platform, powering over 80% of Fortune 100 companies. The project spawned Confluent (valued at over $8 billion) and influenced the design of AWS Kinesis, Azure Event Hubs, and Google Pub/Sub.

Common Mistakes
  • ⚠Using too many partitions 'just in case.' Each partition consumes broker memory, file handles, and rebalancing time. Start with 6-12 partitions per topic and scale up based on measured throughput needs. You can add partitions later (but you cannot reduce them).
  • ⚠Ignoring consumer group rebalancing impact. When a consumer crashes or a new one joins, all consumers in the group pause during rebalancing (in the eager protocol). Use the cooperative sticky assignor (incremental cooperative rebalancing) to minimize disruption.
  • ⚠Setting acks=0 or acks=1 for critical data. With acks=0, the producer does not wait for any acknowledgment -- messages can be silently lost. With acks=1, a leader failure after acknowledging but before replicating loses the message. Use acks=all with min.insync.replicas=2 for durability.
  • ⚠Treating Kafka as a database. Kafka is a log, not a query engine. Compacted topics can store latest-value-per-key state, but range queries, joins, and ad-hoc queries should use a derived data store (Elasticsearch, PostgreSQL, Redis) fed by a Kafka consumer.
Related Concepts

See Kafka Architecture in action

Explore system design templates that use kafka architecture and run traffic simulations to see how these concepts perform under real load.

Browse Templates

Simulate Kafka partitions processing click events at scale

Metrics to watch
consumer_lagthroughput_rpspartition_skew_pctp99_latency_ms
Run Simulation
Test Your Understanding

1What determines which partition a Kafka record is written to?

2What happens if a Kafka consumer group has more consumers than partitions?

Deeper Reading