Count-Min Sketch
Estimate item frequencies in a stream with a tiny counter matrix, a few hash functions, and a guaranteed one-sided error.
Probabilistic systems is nested under Systems & storage, so techniques in the broader family usually still apply here.
Interactive visualization
Use the controls to connect the idea to concrete operations before diving into the full write-up.
Example state: hash row 0 → bucket 2 (counter 5), hash row 1 → bucket 5 (counter 3), hash row 2 → bucket 1 (counter 4); the reported estimate is the minimum, 3.
Approximate frequency from multiple hashed counters
Each item increments one counter per row. Collisions can only overcount, so taking the minimum counter gives the tightest estimate available.
Family
Systems & storage → Probabilistic systems
Approximate membership, cardinality, and frequency structures for high-volume systems.
Builds on
1 topic
Read these first if you want the surrounding context.
Unlocks
0 next topics
Use these follow-ups when you want to apply the current idea in a richer setting.
Learning paths
1
This topic appears in curated progressions so you can keep moving without guessing the next step.
Choose this over that
Frequency estimation is a different sketch problem
Count-Min Sketch is the "how often?" sketch, not the "have I seen it?" or "how many unique?" sketch.
Bloom Filter & Cuckoo Filter
Choose this when: The system only needs approximate membership tests in front of a slower lookup.
Choose something else when: Admission, heavy hitters, or frequency ranking matter more than membership.
Count-Min Sketch
Choose this when: You need compact per-key frequency estimates and can tolerate one-sided overestimation.
Choose something else when: Exact counts or distinct counting are the real goal.
HyperLogLog
Choose this when: The only question is how many unique keys were seen, not how many times each key appeared.
Choose something else when: Individual key frequency needs to survive the sketch.
Problem
An exact frequency table stores one counter per distinct key. That is perfect when the key space is modest, but it breaks down for huge streams, telemetry firehoses, or caches trying to remember millions of short-lived candidates.
Sometimes you do not need exact counts. You need a tiny structure that can answer:
is this item probably frequent enough to care about?
Intuition
A Count-Min Sketch is a grid of counters with d rows and w columns.
Each row has its own hash function. When an item arrives, you increment one counter per row. To query a key, you hash it into the same d positions and take the minimum of those counters.
Why the minimum?
Because collisions only add extra mass. They can push a counter upward, but they can never subtract from the true count. The minimum is the least polluted estimate among the rows.
Core operations
Update
    update(x, delta = 1):
        for r in 1 .. d:
            c <- h_r(x) mod w
            table[r][c] <- table[r][c] + delta
Query
    estimate(x):
        ans <- +infinity
        for r in 1 .. d:
            c <- h_r(x) mod w
            ans <- min(ans, table[r][c])
        return ans
Merge
If two sketches use the same dimensions and hash functions, they merge by pointwise addition.
    merge(A, B):
        for r in 1 .. d:
            for c in 1 .. w:
                A[r][c] <- A[r][c] + B[r][c]
That mergeability is one reason sketches are so useful in distributed systems.
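To make the pseudocode concrete, here is a minimal Python sketch of the structure, assuming salted BLAKE2 hashing to derive one hash function per row; the `CountMinSketch` class name and the hashing choice are illustrative, not a standard library API.

```python
import hashlib


class CountMinSketch:
    """Minimal Count-Min Sketch: d rows of w counters, one hash function per row."""

    def __init__(self, d: int, w: int):
        self.d = d
        self.w = w
        self.table = [[0] * w for _ in range(d)]

    def _bucket(self, row: int, item: str) -> int:
        # Salt the hash with the row index so each row behaves like its own hash function.
        digest = hashlib.blake2b(item.encode(), salt=row.to_bytes(8, "little")).digest()
        return int.from_bytes(digest[:8], "little") % self.w

    def update(self, item: str, delta: int = 1) -> None:
        # Increment exactly one counter per row.
        for r in range(self.d):
            self.table[r][self._bucket(r, item)] += delta

    def estimate(self, item: str) -> int:
        # The minimum over the rows is the least collision-polluted counter.
        return min(self.table[r][self._bucket(r, item)] for r in range(self.d))

    def merge(self, other: "CountMinSketch") -> None:
        # Pointwise addition; valid only for identical dimensions (and thus identical row hashes here).
        if (self.d, self.w) != (other.d, other.w):
            raise ValueError("sketches must share dimensions to merge")
        for r in range(self.d):
            for c in range(self.w):
                self.table[r][c] += other.table[r][c]
```

Because each row hash is derived deterministically from the row index, two sketches with the same dimensions automatically share hash functions, which is what makes the pointwise merge legal.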
Worked example
Suppose d = 3, w = 8, and the stream is:
login, search, login, view, login, search
After processing the stream, the three rows may contain different collisions, but every row position touched by login has been incremented three times.
If one of those cells also got bumped by unrelated collisions, its value might be 4 or 5. The query takes the minimum over the three candidate cells, so it still returns something close to the true frequency.
That is the pattern to remember:
- collisions add noise
- noise is one-sided
- minimum counters reduce the damage
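To see that pattern end to end, here is how the example stream could be fed through the illustrative class above, using the same d = 3, w = 8 dimensions; exact outputs depend on where collisions land, but login can never come back below 3.

```python
cms = CountMinSketch(d=3, w=8)
for event in ["login", "search", "login", "view", "login", "search"]:
    cms.update(event)

print(cms.estimate("login"))   # >= 3: never below the true count, possibly above it
print(cms.estimate("search"))  # >= 2
print(cms.estimate("view"))    # >= 1
```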
Error bounds and parameter choice
For total stream weight N, a Count-Min Sketch can be sized so that:
- estimated count is never below the true count
- overestimation is at most epsilon * N, with probability at least 1 - delta

A standard choice is:

    w = ceil(e / epsilon)
    d = ceil(ln(1 / delta))

So:

- bigger w reduces collision error
- bigger d reduces the probability that every row is badly polluted
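Plugging in concrete targets shows how small the table stays. Assuming, purely for illustration, epsilon = 0.001 and delta = 0.01:

```python
import math

epsilon = 0.001  # additive error of at most epsilon * N
delta = 0.01     # probability of exceeding that error

w = math.ceil(math.e / epsilon)     # 2719 columns
d = math.ceil(math.log(1 / delta))  # 5 rows

# With 4-byte counters this is about 2719 * 5 * 4 bytes, roughly 53 KB,
# no matter how many distinct keys the stream contains.
print(w, d)
```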
Complexity
| Operation | Time | Space |
|---|---|---|
| Update | O(d) | O(d * w) |
| Query | O(d) | O(d * w) |
| Merge | O(d * w) | O(d * w) |
The memory depends only on the sketch dimensions, not on the number of distinct keys observed.
When to choose Count-Min Sketch
| Structure | Answers | Error shape | Best for |
|---|---|---|---|
| Hash map | Exact frequency | None | Small or moderate key spaces |
| Count-Min Sketch | Approximate frequency | Overestimates only | Huge streams, hot-key detection, admission heuristics |
| Bloom filter | Approximate membership | False positives only | "Seen or not seen?" checks |
| HyperLogLog | Approximate distinct count | Cardinality error | "How many unique keys?" questions |
Key takeaways
- Count-Min Sketch estimates frequency, not membership and not cardinality.
- The estimate is biased upward because collisions can only add counts, never remove them.
- The minimum across rows is the core trick that makes the sketch useful.
- w controls the size of collision error; d controls the probability of a bad estimate.
- TinyLFU-style cache admission is a perfect example of why an approximate counter can still drive a real algorithm well.
Practice problems
| Problem | Difficulty | Key idea |
|---|---|---|
| Count-Min Sketch overview | Hard | Multiple hashed counter rows with one-sided error |
| Mining of Massive Datasets notes | Hard | Frequency sketches and streaming heavy hitters |
| TinyLFU paper | Hard | Frequency estimation inside cache admission |
Relation to other topics
- Bloom Filter & Cuckoo Filter answer approximate membership instead of approximate frequency.
- HyperLogLog estimates distinct count instead of per-item count.
- Cache Eviction Strategies use sketch-style counts in TinyLFU-like admission logic.
Build on these first
These topics supply the mental model or underlying structure that this page assumes.
Related directions
These topics live nearby conceptually, even if they are not direct prerequisites.
Cache Eviction Strategies (LRU, LFU, ARC, TinyLFU)
Treat cache eviction as an algorithm choice: recency, frequency, adaptation, and admission all optimize different workloads.
Bloom Filter & Cuckoo Filter
Use tiny approximate membership structures to reject absent keys cheaply before touching slower storage or larger indexes.
HyperLogLog
Estimate the number of distinct items in a huge stream with tiny registers and the statistics of rare hash patterns.
Paths that include this topic
Follow one of these sequences if you want a guided next step instead of open-ended browsing.
Probabilistic systems
Learn how approximate membership, frequency, and cardinality structures trade tiny errors for massive scale.