Union-Find (Disjoint Set)
Track connected components efficiently with near-constant time union and find operations. The backbone of Kruskal's MST and dynamic connectivity.
Interactive visualization
Use the controls to connect the idea to concrete operations before diving into the full write-up.
Family
Graphs
Nodes, edges, traversal, connectivity, and shortest paths.
Builds on
1 topic
Read these first if you want the surrounding context.
Unlocks
2 next topics
Use these follow-ups when you want to apply the current idea in a richer setting.
Learning paths
3
This topic appears in curated progressions so you can keep moving without guessing the next step.
Problem
Maintain a collection of disjoint sets that support two operations:
- Find(x): which set does element belong to?
- Union(x, y): merge the sets containing and
Intuition
Think of it as a forest of trees. Each set is a tree, and the root of the tree is the representative of that set. Find(x) walks up from to the root. Union(x, y) finds both roots and makes one point to the other.
Without optimizations, this can degenerate to a linked list ( per find). Two optimizations make it nearly amortized:
- Path compression: during
Find, make every node on the path point directly to the root - Union by rank (or size): attach the shorter tree under the taller tree
With both optimizations, the amortized cost per operation is — the inverse Ackermann function — which is effectively constant for all practical input sizes ( for ).
Algorithm
parent ← [0, 1, 2, ..., n-1] // each node is its own parent
rank ← [0, 0, 0, ..., 0]
Find(x):
if parent[x] ≠ x:
parent[x] ← Find(parent[x]) // path compression
return parent[x]
Union(x, y):
rx ← Find(x)
ry ← Find(y)
if rx == ry: return // already in same set
// union by rank
if rank[rx] < rank[ry]:
parent[rx] ← ry
else if rank[rx] > rank[ry]:
parent[ry] ← rx
else:
parent[ry] ← rx
rank[rx] += 1
Path compression in detail
Without path compression, Find walks the chain: . Path compression flattens this by making every node on the path point directly to the root. The next Find on any of these nodes is .
Find(5): 5 → 3 → 1 → 0 (root)
After: 5 → 0, 3 → 0, 1 → 0
Union by rank in detail
Rank is an upper bound on tree height. When merging two trees, attach the shorter one under the taller one. This keeps the tree shallow. If both trees have the same rank, pick either as root and increment its rank by 1.
An alternative is union by size — attach the smaller set under the larger one. Both give the same asymptotic bound.
Cycle detection in undirected graphs
To check if adding edge creates a cycle: if Find(u) == Find(v), then and are already connected, so adding the edge creates a cycle. Otherwise, Union(u, v).
for each edge (u, v) in graph:
if Find(u) == Find(v):
"cycle detected"
else:
Union(u, v)
This is simpler than DFS-based cycle detection for undirected graphs and runs in nearly time.
Kruskal’s MST
Union-Find is the core data structure in Kruskal’s Minimum Spanning Tree algorithm:
- Sort all edges by weight
- For each edge in sorted order:
- If
Find(u) ≠ Find(v): add edge to MST,Union(u, v) - Otherwise: skip (would create a cycle)
- If
- Stop when MST has edges
The sorting is , and each union/find is , so total is .
Counting connected components
Initialize with components (one per node). Each successful Union decrements the count by 1. At the end, the count tells you how many connected components exist.
components ← n
for each edge (u, v):
if Find(u) ≠ Find(v):
Union(u, v)
components -= 1
When to use Union-Find vs DFS/BFS
| Problem | Union-Find | DFS/BFS |
|---|---|---|
| Static connectivity (all edges known) | Works, but overkill | Simple traversal suffices |
| Dynamic connectivity (edges added over time) | Best choice | Must re-traverse |
| Cycle detection (undirected) | Clean and fast | Also works (track parent) |
| Kruskal’s MST | Required | N/A |
| Shortest path | Not applicable | Use BFS/Dijkstra |
Union-Find shines when edges arrive incrementally and you need to answer “are these connected?” queries along the way. DFS/BFS is better for one-shot traversal problems.
Complexity
| Operation | Time (amortized) | Space |
|---|---|---|
| Find | ≈ | — |
| Union | ≈ | — |
| Total space | — |
Where is the inverse Ackermann function. For all practical purposes, this is constant.
Key takeaways
- Always use both path compression and union by rank — together they give amortized per operation, which is effectively constant.
- Union-Find excels at incremental connectivity — when edges arrive over time and you need on-the-fly “are these connected?” queries, it beats DFS/BFS.
- Cycle detection is trivial — if
Find(u) == Find(v)before adding edge , a cycle exists. - Count components by tracking successful unions — start with components and decrement on each union of distinct sets.
- Think Union-Find whenever you see equivalence classes — grouping accounts, merging intervals, or partitioning elements by some equivalence relation.
Practice problems
| Problem | Difficulty | Key idea |
|---|---|---|
| Number of Provinces | 🟡 Medium | Count connected components by union-ing adjacent nodes |
| Redundant Connection | 🟡 Medium | Find the edge that creates a cycle using union-find |
| Accounts Merge | 🟡 Medium | Union accounts sharing an email, then group by root |
| Most Stones Removed with Same Row or Column | 🟡 Medium | Stones in the same row/column form a connected component |
| Satisfiability of Equality Equations | 🟡 Medium | Union variables in equality constraints, then check inequality constraints |
Relation to other topics
- Kruskal’s MST — Union-Find is the core data structure that makes Kruskal’s algorithm efficient by quickly testing whether an edge connects two different components.
- Graph traversal (DFS/BFS) — for static graphs where all edges are known upfront, DFS/BFS is simpler; Union-Find wins when edges arrive incrementally.
- Connected components — Union-Find and BFS/DFS both solve this, but Union-Find supports dynamic edge insertion without re-traversal.
Build on these first
These topics supply the mental model or underlying structure that this page assumes.
What this enables
Once the current idea feels natural, these are the most useful next jumps.
Minimum Spanning Tree
Find the subset of edges that connects all vertices with minimum total weight. Kruskal's and Prim's are the two classic approaches.
DSU Rollback
Add an undo stack to Union-Find so merges can be reversed during offline divide-and-conquer or time-travel style processing.
Related directions
These topics live nearby conceptually, even if they are not direct prerequisites.
Minimum Spanning Tree
Find the subset of edges that connects all vertices with minimum total weight. Kruskal's and Prim's are the two classic approaches.
Bridges and Articulation Points
Use DFS timestamps and low-link values to find the edges and vertices whose removal disconnects an undirected graph.
DSU Rollback
Add an undo stack to Union-Find so merges can be reversed during offline divide-and-conquer or time-travel style processing.
Tarjan SCC
Find strongly connected components in one DFS by tracking discovery times, low-link values, and the active stack.
More from Graphs
Stay in the same family when you want parallel variations of the same mental model.
BFS — Breadth-First Search
Explore a graph level by level using a queue. BFS finds the shortest path in unweighted graphs and is fundamental to many graph algorithms.
DFS — Depth-First Search
Dive as deep as possible before backtracking. DFS is essential for detecting cycles, topological sorting, and exploring all paths in a graph.
Graph Fundamentals
Learn the core language of nodes, edges, paths, and cycles. Graphs are the broad container that trees, BFS, DFS, Dijkstra, and MST all live inside.
0-1 BFS
Find shortest paths in graphs whose edge weights are only 0 or 1 using a deque instead of a heap.
Paths that include this topic
Follow one of these sequences if you want a guided next step instead of open-ended browsing.
Graph toolkit
Start with graph shape, then layer traversal, connectivity structure, and both single-source and all-pairs path algorithms.
Graph structure
Go beyond traversal into SCCs, critical edges, degree structure, and other non-shortest-path graph properties.
Offline toolkit
See how reordering queries, rolling back state, and preserving versions can replace online complexity.
From the blog
Pair the atlas with recent writing from the rest of the site when you want a broader engineering perspective alongside the topic graph.