1When a new node is added to a consistent hashing ring with K total keys and N existing nodes, approximately how many keys need to be redistributed?
Consistent hashing maps both keys and nodes onto a hash ring, assigning each key to the nearest node in the clockwise direction. When a node is added or removed, only K/N keys (where K is total keys and N is total nodes) need to be redistributed -- compared to nearly all keys with traditional mod-N hashing. This makes consistent hashing essential for distributed caches, CDNs, and databases that need elastic scaling.
Consistent hashing was introduced by David Karger et al. in their 1997 paper as a solution for distributing web cache content across a changing set of servers. The core insight is that both keys and nodes are mapped onto the same circular hash space (a ring from 0 to 2^m - 1), and each key is assigned to the first node encountered when walking clockwise from the key's position on the ring. When a node is added, it takes over a portion of the ring from its clockwise neighbor, and only the keys in that portion need to be moved. When a node is removed, its keys fall to the next clockwise node. In either case, only K/N keys are redistributed on average (where K is total keys and N is total nodes), compared to the nearly complete reshuffling that hash(key) mod N requires when N changes.
The naive implementation of consistent hashing has a significant problem: with a small number of physical nodes, the ring segments are likely to be uneven, causing some nodes to own far more keys than others. Virtual nodes (vnodes) solve this by assigning each physical node multiple positions on the ring -- typically 100 to 256 virtual nodes per physical node. Each virtual node is a point on the ring that maps back to its physical host. With enough virtual nodes, the statistical variation in segment sizes shrinks dramatically, producing near-uniform distribution. The Dynamo paper (2007) used 150 virtual nodes per physical node, and Cassandra defaults to 256.
Several important variants and alternatives have emerged since the original paper. Jump consistent hashing (Lamping and Veach, 2014 from Google) uses zero memory -- it computes the assigned bucket using only arithmetic, with perfect distribution across N buckets. However, it only supports appending or removing the last bucket, making it unsuitable for arbitrary node failures. Bounded-load consistent hashing (Google, 2017) addresses load imbalance by capping the maximum load any single node can handle at (1 + epsilon) times the average load, redirecting overflow to the next node on the ring. Rendezvous hashing (also called Highest Random Weight or HRW hashing) takes a different approach entirely: for each key, it computes a score for every node using hash(key, node_id), and assigns the key to the node with the highest score. This achieves similar redistribution properties to consistent hashing without needing a ring abstraction.
Consistent hashing is a foundational building block for modern distributed infrastructure. CDNs like Akamai (the company co-founded by Karger himself) use it to route requests to cache servers, minimizing cache misses when servers are added or removed. Distributed databases like DynamoDB and Cassandra use it for data partitioning. Load balancers use it for sticky session routing. Discord uses it to assign guilds (servers) to backend processes. In every case, the value proposition is the same: the ability to add and remove nodes from a cluster with minimal data movement, enabling elastic scaling without the thundering-herd cache-miss storms that plague mod-N approaches.
The Circular Parking Lot
Imagine a circular parking lot with spots numbered 0-359 (like degrees on a compass). Four attendants are stationed at positions 0, 90, 180, and 270. When a car arrives, the driver is given a ticket with a random number from 0-359. The driver parks at the first attendant's section they reach by driving clockwise from their number. Car #45 drives to the attendant at 90. Car #200 drives to the attendant at 270. Now a fifth attendant is added at position 135. Only cars parked between 90 and 135 need to move to the new attendant -- everyone else stays put. If attendant 90 goes on break, only their cars (parked between 0 and 90) move to attendant 135. The lot keeps working smoothly with minimal car shuffling.
Amazon DynamoDB (Dynamo Paper, 2007)
The original Dynamo system used consistent hashing with virtual nodes to partition data across a cluster of commodity servers. Each physical node was assigned 150 virtual nodes (tokens) on the hash ring. When a node joined the cluster, it received tokens from its ring neighbors, moving only the affected key ranges. Dynamo's 'preference list' extended consistent hashing by replicating each key to the next N distinct physical nodes on the ring (skipping virtual nodes belonging to the same physical host), ensuring durability even when adjacent ring positions mapped to the same machine.
Akamai CDN
Akamai was co-founded by David Karger, one of the authors of the original consistent hashing paper, specifically to apply this technique to content delivery. When a user requests a cached object, Akamai hashes the URL to a position on the ring and routes the request to the nearest cache server. When a server is added or removed (due to scaling or failure), only the objects that would map to the changed region of the ring are invalidated -- all other cached objects remain valid, preventing the cache stampede that would occur if all servers' caches were simultaneously invalidated.
Discord
Discord uses consistent hashing to assign guilds (servers, in Discord terminology) to backend gateway processes. Each guild is hashed to a position on the ring, and the owning gateway process handles all real-time events for that guild. When Discord scales its gateway fleet up or down, consistent hashing ensures that only a fraction of guilds are reassigned to new processes, minimizing the disruption of real-time voice and chat connections. For very large guilds (millions of members), Discord further shards within the guild using member-level consistent hashing.
| Aspect | Description |
|---|---|
| Minimal Redistribution vs Metadata Overhead | Consistent hashing minimizes key movement during topology changes (K/N vs nearly all keys), but the ring metadata (token assignments, virtual node mappings) must be stored and distributed to all clients and nodes. In a 100-node cluster with 256 vnodes each, the ring contains 25,600 entries that must be synchronized via gossip or a coordination service. |
| Virtual Nodes: Uniformity vs Complexity | More virtual nodes produce more uniform distribution and faster rebalancing, but increase memory usage for the token ring, gossip protocol bandwidth, and the complexity of range-ownership lookups. Each virtual node is a separate entry in the ring, and lookups require a binary search or sorted data structure to find the owning node. |
| Ring-Based vs Rendezvous Hashing | Consistent hashing on a ring gives O(log N) lookups with a sorted ring but requires maintaining ring state. Rendezvous hashing (HRW) has O(N) lookup per key (must score all nodes) but requires zero metadata -- just the list of node IDs. For small clusters (under 50 nodes), HRW is simpler; for large clusters, ring-based consistent hashing scales better. |
| Elasticity vs Data Movement Cost | While consistent hashing limits the number of keys moved during scaling events, the actual data transfer can still be substantial. Moving K/N keys in a 10TB dataset with 10 nodes means transferring approximately 1TB of data per scaling event. This transfer consumes network bandwidth, increases latency during migration, and requires careful orchestration to avoid impacting production traffic. |
Akamai: Consistent Hashing at CDN Scale
Scenario
In the late 1990s, web traffic was growing exponentially, and CDN cache servers were frequently added and removed to handle load spikes. With traditional mod-N hashing, adding a single cache server invalidated nearly every cached object across the entire fleet, causing a thundering herd of requests to hit origin servers simultaneously. During major events (elections, World Cup), this cache stampede would overwhelm origin infrastructure and cause outages for the content providers Akamai was trying to protect.
Solution
Akamai implemented consistent hashing (co-invented by its co-founder David Karger) to map cached URLs to server positions on a hash ring. When a server was added, only URLs mapping to the new server's ring segment were invalidated -- all other servers retained their cached content. Virtual nodes ensured that the load was evenly distributed even as the number of servers changed. The system also incorporated server health monitoring: unhealthy servers were removed from the ring, causing their URLs to fail over to the next healthy server clockwise, with minimal disruption to other server caches.
Outcome
Cache hit rates remained above 95% even during scaling events that added or removed 20% of the server fleet. Origin server load during scaling dropped by over 90% compared to the mod-N approach. Consistent hashing became the industry standard for CDN cache routing and was adopted by Cloudflare, Fastly, and every major CDN that followed. The technique directly enabled Akamai to scale to serving 15-30% of all global web traffic.
See Consistent Hashing in action
Explore system design templates that use consistent hashing and run traffic simulations to see how these concepts perform under real load.
Browse Templates1When a new node is added to a consistent hashing ring with K total keys and N existing nodes, approximately how many keys need to be redistributed?
2What problem do virtual nodes solve in consistent hashing?
3Which consistent hashing variant requires zero memory but only supports adding/removing the last node in sequence?