Systems & storage · Technique · Advanced

Consistent Hashing

Map keys onto a ring so cluster membership changes move only nearby keys instead of reshuffling the whole keyspace.

Tags: distributed · hashing · cache · rare

Interactive visualization

[Ring demo: keys user:17, post:88, and feed:9 hashed onto the ring alongside servers A, B, and C. Only keys near the new server's position on the hash ring move; everyone else stays where they were.]

Stable sharding under cluster changes

Consistent hashing avoids full remapping when servers are added or removed. That is why it is so common in distributed caches and partitioned key-value systems.


Choose this over that

Separate distributed placement from local cache policy

Consistent hashing decides which machine owns the key. It is not a replacement for the local data structure or eviction policy.

Lookup & hashing

Hash Map

Choose this when: The question is local key lookup inside one process or one machine.

Choose something else when: The mapping must remain stable while the cluster changes.

Current topic

Consistent Hashing

Choose this when: Adding or removing machines should move only nearby keys instead of reshuffling the whole keyspace.

Choose something else when: The problem is local eviction or local membership, not distributed ownership.

Systems & storage

Cache Eviction Strategies (LRU, LFU, ARC, TinyLFU)

Choose this when: The system has already chosen the owning machine and now needs to decide what survives in memory on that machine.

Choose something else when: The real problem is placement across the cluster.

Problem

If you place keys with hash(key) % n, changing n remaps almost everything.

That is fine for one process, but terrible for distributed caches and sharded storage. Adding one server should not invalidate the entire cluster’s placement map.
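To see the scale of the problem, here is a minimal Python sketch (the hash choice and key names are arbitrary) counting how many keys change buckets when a fifth server joins a four-server modulo scheme:

import hashlib

def h(key):
    # Any stable hash works; md5 is just an illustrative choice.
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

keys = [f"key:{i}" for i in range(10_000)]
moved = sum(1 for k in keys if h(k) % 4 != h(k) % 5)
print(moved / len(keys))   # roughly 0.8: adding one server moves about 4 in 5 keys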

Intuition

Consistent hashing replaces the linear bucket range with a ring.

  • hash every server onto the ring
  • hash every key onto the same ring
  • assign the key to the first server encountered while walking clockwise

Now a server owns only the interval immediately before its ring position. When a new server appears, it steals only that interval instead of forcing a global reshuffle.

Data structure model

A typical implementation stores the server positions in a sorted structure:

  • sorted array + binary search
  • balanced tree / ordered map
  • skip list or B-tree-like ordered index

Real systems also use virtual nodes: each physical server is hashed to many ring positions.

That smooths imbalance because a server no longer owns one giant interval. It owns many smaller intervals scattered around the ring.
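As a toy illustration (hash choice, label format, and ring size are arbitrary), four virtual-node labels for a single server already land at scattered positions:

import hashlib

def ring_pos(label, modulus=360):
    # Hash a virtual-node label to a position on a small 0..359 ring.
    return int(hashlib.md5(label.encode()).hexdigest(), 16) % modulus

print(sorted(ring_pos(f"server-A#{r}") for r in range(1, 5)))
# four scattered ring positions, all owned by the same physical server A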

Core operations

Lookup

import bisect, hashlib

def hash_fn(s):
    # Any stable hash works; md5 is an illustrative choice for a 2**32 ring.
    return int(hashlib.md5(s.encode()).hexdigest(), 16) % 2**32

def owner(ring, key):
    # ring is a sorted list of (position, server) pairs.
    pos = hash_fn(key)
    i = bisect.bisect_left(ring, (pos, ""))   # first position at or after pos
    return ring[i][1] if i < len(ring) else ring[0][1]   # wrap around the ring

The only algorithmic requirement is an ordered search for the first server clockwise from the key.

Add a server

def add_server(ring, server, replicas):
    for r in range(1, replicas + 1):
        pos = hash_fn(f"{server}#{r}")        # one ring position per virtual node
        bisect.insort(ring, (pos, server))

Each inserted virtual node steals only the interval between its predecessor and itself.

Remove a server

def remove_server(ring, server):
    # Drop every virtual-node position belonging to this server.
    ring[:] = [entry for entry in ring if entry[1] != server]

The keys in those intervals fall through to the next clockwise server.
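A quick usage sketch under the definitions above (server names and the replica count are illustrative):

ring = []
for s in ("A", "B", "C"):
    add_server(ring, s, replicas=100)
print(owner(ring, "user:17"))   # stable until the ring membership changes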

Worked example

Assume the ring positions are:

  • A at 20
  • B at 60
  • C at 85

Then the ownership intervals are:

  • (85, 20] -> A
  • (20, 60] -> B
  • (60, 85] -> C

So:

  • key hash 10 -> A
  • key hash 42 -> B
  • key hash 74 -> C
  • key hash 91 -> wrap -> A

Now add D at 70.

Only the interval (60, 70] moves from C to D. Keys elsewhere stay put. That local movement is the entire point of the technique.
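The same walk can be checked in code, reusing owner()'s bisect rule on a hand-built ring (positions plugged in directly instead of hashed):

import bisect

ring = [(20, "A"), (60, "B"), (85, "C")]

def owner_at(ring, pos):                      # owner() minus the hashing step
    i = bisect.bisect_left(ring, (pos, ""))
    return ring[i][1] if i < len(ring) else ring[0][1]

assert [owner_at(ring, p) for p in (10, 42, 74, 91)] == ["A", "B", "C", "A"]
bisect.insort(ring, (70, "D"))
assert owner_at(ring, 65) == "D"              # (60, 70] now belongs to D
assert owner_at(ring, 74) == "C"              # keys past 70 stay with C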

Why virtual nodes matter

Without virtual nodes, one unlucky server hash can own a huge fraction of the ring.

With R virtual nodes per machine:

  • load balances more evenly
  • adding one physical machine introduces many small ownership changes instead of one large cliff
  • replicas and heterogeneous capacity become easier to express

A bigger machine can simply receive more virtual nodes.
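A rough simulation (building on the add_server() and owner() sketches above; exact numbers depend on the hash) makes the smoothing visible:

from collections import Counter

def spread(replicas, n_keys=10_000):
    # Ratio of the busiest server's key count to the idlest server's.
    ring = []
    for s in ("A", "B", "C", "D", "E"):
        add_server(ring, s, replicas)
    counts = Counter(owner(ring, f"key:{i}") for i in range(n_keys))
    return max(counts.values()) / min(counts.values())

print(spread(1))     # often an order-of-magnitude imbalance between servers
print(spread(100))   # typically much closer to even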

Complexity

Assume there are V virtual nodes total.

Operation | Time | Notes
Lookup | O(log V) | Binary search in a sorted ring
Add one physical server with R virtual nodes | O(R log V) | Insert R ordered positions
Remove one physical server with R virtual nodes | O(R log V) | Delete its ordered positions
Key remapping on topology change | Proportional to affected intervals | Only nearby keys move

The asymptotic lookup cost is not the main win. The real win is minimal remapping under membership change.

Consistent hashing vs. modulo hashing

Scheme | Lookup rule | What happens when a server changes?
hash(key) % n | Direct modulo bucket | Nearly every key can move
Consistent hashing | First clockwise server on a ring | Only keys in nearby intervals move

Modulo hashing is simpler. Consistent hashing is the right choice when stability under scaling events matters more than implementation simplicity.
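To quantify the contrast, here is a sketch (again reusing the add_server() and owner() code above; figures will vary with the hash) measuring remapping when a fourth server joins:

ring = []
for s in ("A", "B", "C"):
    add_server(ring, s, replicas=100)
before = {f"key:{i}": owner(ring, f"key:{i}") for i in range(10_000)}
add_server(ring, "D", replicas=100)
moved = sum(1 for key, srv in before.items() if owner(ring, key) != srv)
print(moved / len(before))   # close to 1/4, versus ~4/5 under hash(key) % n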

Key takeaways

  • Consistent hashing is about stable placement, not faster hashing.
  • The algorithm needs an ordered ring of server positions and a clockwise owner rule.
  • Virtual nodes are what make the technique robust in real systems.
  • Only a small region of keys moves when a node joins or leaves, which is why caches and sharded stores rely on it.
  • Cache eviction and consistent hashing solve different problems: one chooses the local victim, the other chooses the owning machine.

Practice problems

Problem | Difficulty | Key idea
Consistent hashing overview | Hard | Ring-based stable remapping
Dynamo paper | Hard | Partitioning and virtual nodes in a production key-value store
Memcached client hashing notes | Hard | Why distributed caches care about remapping behavior

Relation to other topics

  • Hash Map gives the single-machine hashing intuition that consistent hashing generalizes to a cluster.
  • Cache Eviction Strategies decide what stays in one node’s cache once consistent hashing has placed the key there.
  • Bloom Filter & Cuckoo Filter are often paired with distributed placement to avoid useless downstream probes.
