What is important about Consistent Hashing regarding "Adding a node only moves keys from its immediate clockwise n..."?

Adding a node only moves keys from its immediate clockwise neighbor; removing a node only moves its keys to the next clockwise node. On average, K/N keys are redistributed per topology change, compared to K*(N-1)/N keys with mod-N hashing.

What is important about Consistent Hashing regarding "Virtual nodes assign each physical node 100-256 positions on..."?

Virtual nodes assign each physical node 100-256 positions on the ring, dramatically reducing distribution variance. Without vnodes, N nodes produce segment sizes with O(1/N) standard deviation; with V vnodes per node, the standard deviation drops to O(1/(V*N)).

What is important about Consistent Hashing regarding "Jump consistent hashing (Lamping & Veach, 2014) achieves per..."?

Jump consistent hashing (Lamping & Veach, 2014) achieves perfect distribution with zero memory overhead using a simple loop computation, but only supports adding/removing nodes at the end of the sequence -- it cannot handle arbitrary node failures.

What is important about Consistent Hashing regarding "Bounded-load consistent hashing (Google, 2017) caps each nod..."?

Bounded-load consistent hashing (Google, 2017) caps each node's maximum load at (1 + epsilon) times the average, redirecting overflow to subsequent ring positions. This provides a tunable balance between perfect uniformity and minimal key movement.

What is important about Consistent Hashing regarding "Rendezvous hashing (HRW) is an alternative that computes has..."?

Rendezvous hashing (HRW) is an alternative that computes hash(key, node_id) for all nodes and assigns the key to the highest-scoring node. It has O(N) lookup time per key but requires no ring metadata and handles node removal gracefully.

Vetora

🎯Partitioning

Consistent Hashing

Consistent hashing maps both keys and nodes onto a hash ring, assigning each key to the nearest node in the clockwise direction. When a node is added or removed, only K/N keys (where K is total keys and N is total nodes) need to be redistributed -- compared to nearly all keys with traditional mod-N hashing. This makes consistent hashing essential for distributed caches, CDNs, and databases that need elastic scaling.

Overview

Consistent hashing was introduced by David Karger et al. in their 1997 paper as a solution for distributing web cache content across a changing set of servers. The core insight is that both keys and nodes are mapped onto the same circular hash space (a ring from 0 to 2^m - 1), and each key is assigned to the first node encountered when walking clockwise from the key's position on the ring. When a node is added, it takes over a portion of the ring from its clockwise neighbor, and only the keys in that portion need to be moved. When a node is removed, its keys fall to the next clockwise node. In either case, only K/N keys are redistributed on average (where K is total keys and N is total nodes), compared to the nearly complete reshuffling that hash(key) mod N requires when N changes.

The naive implementation of consistent hashing has a significant problem: with a small number of physical nodes, the ring segments are likely to be uneven, causing some nodes to own far more keys than others. Virtual nodes (vnodes) solve this by assigning each physical node multiple positions on the ring -- typically 100 to 256 virtual nodes per physical node. Each virtual node is a point on the ring that maps back to its physical host. With enough virtual nodes, the statistical variation in segment sizes shrinks dramatically, producing near-uniform distribution. The Dynamo paper (2007) used 150 virtual nodes per physical node, and Cassandra defaults to 256.

Several important variants and alternatives have emerged since the original paper. Jump consistent hashing (Lamping and Veach, 2014 from Google) uses zero memory -- it computes the assigned bucket using only arithmetic, with perfect distribution across N buckets. However, it only supports appending or removing the last bucket, making it unsuitable for arbitrary node failures. Bounded-load consistent hashing (Google, 2017) addresses load imbalance by capping the maximum load any single node can handle at (1 + epsilon) times the average load, redirecting overflow to the next node on the ring. Rendezvous hashing (also called Highest Random Weight or HRW hashing) takes a different approach entirely: for each key, it computes a score for every node using hash(key, node_id), and assigns the key to the node with the highest score. This achieves similar redistribution properties to consistent hashing without needing a ring abstraction.

Consistent hashing is a foundational building block for modern distributed infrastructure. CDNs like Akamai (the company co-founded by Karger himself) use it to route requests to cache servers, minimizing cache misses when servers are added or removed. Distributed databases like DynamoDB and Cassandra use it for data partitioning. Load balancers use it for sticky session routing. Discord uses it to assign guilds (servers) to backend processes. In every case, the value proposition is the same: the ability to add and remove nodes from a cluster with minimal data movement, enabling elastic scaling without the thundering-herd cache-miss storms that plague mod-N approaches.

Key Points

1Keys and nodes are both hashed onto a circular ring (0 to 2^m - 1). Each key is assigned to the first node encountered when walking clockwise from the key's position. This is the core invariant that limits key redistribution when topology changes.
2Adding a node only moves keys from its immediate clockwise neighbor; removing a node only moves its keys to the next clockwise node. On average, K/N keys are redistributed per topology change, compared to K*(N-1)/N keys with mod-N hashing.
3Virtual nodes assign each physical node 100-256 positions on the ring, dramatically reducing distribution variance. Without vnodes, N nodes produce segment sizes with O(1/N) standard deviation; with V vnodes per node, the standard deviation drops to O(1/(V*N)).
4Jump consistent hashing (Lamping & Veach, 2014) achieves perfect distribution with zero memory overhead using a simple loop computation, but only supports adding/removing nodes at the end of the sequence -- it cannot handle arbitrary node failures.
5Bounded-load consistent hashing (Google, 2017) caps each node's maximum load at (1 + epsilon) times the average, redirecting overflow to subsequent ring positions. This provides a tunable balance between perfect uniformity and minimal key movement.
6Rendezvous hashing (HRW) is an alternative that computes hash(key, node_id) for all nodes and assigns the key to the highest-scoring node. It has O(N) lookup time per key but requires no ring metadata and handles node removal gracefully.

Simple Example

The Circular Parking Lot

Imagine a circular parking lot with spots numbered 0-359 (like degrees on a compass). Four attendants are stationed at positions 0, 90, 180, and 270. When a car arrives, the driver is given a ticket with a random number from 0-359. The driver parks at the first attendant's section they reach by driving clockwise from their number. Car #45 drives to the attendant at 90. Car #200 drives to the attendant at 270. Now a fifth attendant is added at position 135. Only cars parked between 90 and 135 need to move to the new attendant -- everyone else stays put. If attendant 90 goes on break, only their cars (parked between 0 and 90) move to attendant 135. The lot keeps working smoothly with minimal car shuffling.

Real-World Examples

Amazon DynamoDB (Dynamo Paper, 2007)

The original Dynamo system used consistent hashing with virtual nodes to partition data across a cluster of commodity servers. Each physical node was assigned 150 virtual nodes (tokens) on the hash ring. When a node joined the cluster, it received tokens from its ring neighbors, moving only the affected key ranges. Dynamo's 'preference list' extended consistent hashing by replicating each key to the next N distinct physical nodes on the ring (skipping virtual nodes belonging to the same physical host), ensuring durability even when adjacent ring positions mapped to the same machine.

Akamai CDN

Akamai was co-founded by David Karger, one of the authors of the original consistent hashing paper, specifically to apply this technique to content delivery. When a user requests a cached object, Akamai hashes the URL to a position on the ring and routes the request to the nearest cache server. When a server is added or removed (due to scaling or failure), only the objects that would map to the changed region of the ring are invalidated -- all other cached objects remain valid, preventing the cache stampede that would occur if all servers' caches were simultaneously invalidated.

Discord

Discord uses consistent hashing to assign guilds (servers, in Discord terminology) to backend gateway processes. Each guild is hashed to a position on the ring, and the owning gateway process handles all real-time events for that guild. When Discord scales its gateway fleet up or down, consistent hashing ensures that only a fraction of guilds are reassigned to new processes, minimizing the disruption of real-time voice and chat connections. For very large guilds (millions of members), Discord further shards within the guild using member-level consistent hashing.

Trade-Offs

Aspect	Description
Minimal Redistribution vs Metadata Overhead	Consistent hashing minimizes key movement during topology changes (K/N vs nearly all keys), but the ring metadata (token assignments, virtual node mappings) must be stored and distributed to all clients and nodes. In a 100-node cluster with 256 vnodes each, the ring contains 25,600 entries that must be synchronized via gossip or a coordination service.
Virtual Nodes: Uniformity vs Complexity	More virtual nodes produce more uniform distribution and faster rebalancing, but increase memory usage for the token ring, gossip protocol bandwidth, and the complexity of range-ownership lookups. Each virtual node is a separate entry in the ring, and lookups require a binary search or sorted data structure to find the owning node.
Ring-Based vs Rendezvous Hashing	Consistent hashing on a ring gives O(log N) lookups with a sorted ring but requires maintaining ring state. Rendezvous hashing (HRW) has O(N) lookup per key (must score all nodes) but requires zero metadata -- just the list of node IDs. For small clusters (under 50 nodes), HRW is simpler; for large clusters, ring-based consistent hashing scales better.
Elasticity vs Data Movement Cost	While consistent hashing limits the number of keys moved during scaling events, the actual data transfer can still be substantial. Moving K/N keys in a 10TB dataset with 10 nodes means transferring approximately 1TB of data per scaling event. This transfer consumes network bandwidth, increases latency during migration, and requires careful orchestration to avoid impacting production traffic.

Case Study

Akamai: Consistent Hashing at CDN Scale

Scenario

In the late 1990s, web traffic was growing exponentially, and CDN cache servers were frequently added and removed to handle load spikes. With traditional mod-N hashing, adding a single cache server invalidated nearly every cached object across the entire fleet, causing a thundering herd of requests to hit origin servers simultaneously. During major events (elections, World Cup), this cache stampede would overwhelm origin infrastructure and cause outages for the content providers Akamai was trying to protect.

Solution

Akamai implemented consistent hashing (co-invented by its co-founder David Karger) to map cached URLs to server positions on a hash ring. When a server was added, only URLs mapping to the new server's ring segment were invalidated -- all other servers retained their cached content. Virtual nodes ensured that the load was evenly distributed even as the number of servers changed. The system also incorporated server health monitoring: unhealthy servers were removed from the ring, causing their URLs to fail over to the next healthy server clockwise, with minimal disruption to other server caches.

Outcome

Cache hit rates remained above 95% even during scaling events that added or removed 20% of the server fleet. Origin server load during scaling dropped by over 90% compared to the mod-N approach. Consistent hashing became the industry standard for CDN cache routing and was adopted by Cloudflare, Fastly, and every major CDN that followed. The technique directly enabled Akamai to scale to serving 15-30% of all global web traffic.

Common Mistakes

⚠Using consistent hashing without virtual nodes. With only N physical positions on the ring, the variance in partition sizes is O(1/N), meaning some nodes will own 2-3x more data than others in a small cluster. Always use at least 100 virtual nodes per physical node for reasonable uniformity.
⚠Assuming consistent hashing eliminates all data movement. It minimizes movement to K/N keys per topology change, but K/N can still be a very large amount of data. Plan for migration bandwidth, use throttled background transfers, and monitor rebalancing progress.
⚠Choosing jump consistent hashing for systems with arbitrary node failures. Jump consistent hashing only supports adding or removing the last node in the sequence. If a node in the middle fails, the algorithm cannot handle it gracefully -- use ring-based consistent hashing or rendezvous hashing instead.
⚠Ignoring the hot-partition problem even with consistent hashing. Consistent hashing distributes keys uniformly, but it does not address the case where one key receives disproportionate traffic (the celebrity problem). Consistent hashing must be combined with caching, replication, or application-level fan-out to handle hot keys.

Related Concepts

Hash Partitioning Database Sharding Hotspot Mitigation Horizontal vs Vertical Scaling Cache-Aside

See Consistent Hashing in action

Explore system design templates that use consistent hashing and run traffic simulations to see how these concepts perform under real load.

Browse Templates

Watch key redistribution when nodes join and leave a hash ring

Metrics to watch

keys_redistributed_pctpartition_size_variancecache_hit_rate

Run Simulation

Test Your Understanding

1When a new node is added to a consistent hashing ring with K total keys and N existing nodes, approximately how many keys need to be redistributed?

2What problem do virtual nodes solve in consistent hashing?

3Which consistent hashing variant requires zero memory but only supports adding/removing the last node in sequence?

Deeper Reading