B-Tree
Store many keys per node so ordered search stays shallow and the number of expensive page reads stays low.
Interactive visualization
The example tree used throughout this page: a root holding [10 | 20], with leaf children [3 | 7 | 9], [12 | 15 | 18], and [22 | 25 | 30].
Wide nodes beat tall trees on disk
Because each node holds many keys, the tree stays shallow and disk reads stay low.
Family
Systems & storage
Database indexes, cache policies, query operators, sharding, and external-memory algorithms.
Builds on
2 topics
Read these first if you want the surrounding context.
Unlocks
1 next topic
Use these follow-ups when you want to apply the current idea in a richer setting.
Learning paths
2 paths
This topic appears in curated progressions so you can keep moving without guessing the next step.
Choose this over that
Choose the storage index that matches the workload
B-Trees minimize page depth, but practical engines often push farther toward range scans or write optimization.
Binary Search Tree (BST)
Choose this when: The structure is in memory and page-level storage costs are not the bottleneck.
Choose something else when: Disk or page reads dominate the cost model.
B-Tree
Choose this when: You want the core page-aware ordered-search structure and a clean bridge from classic trees to storage indexes.
Choose something else when: Range scans or write-heavy buffering dominate the design.
B+ Tree
Choose this when: Point lookups and especially ordered scans are the main indexed workload.
Choose something else when: You are learning the general page-aware tree idea first.
LSM Tree
Choose this when: Write throughput matters more than maintaining one eagerly updated on-disk tree.
Choose something else when: Predictable indexed reads and scans are the main priority.
Problem
A binary search tree is conceptually clean, but it is a bad storage-engine default when every node visit may cost a page read.
If one lookup walks through many tiny nodes, the tree spends more time waiting for storage than comparing keys.
Intuition
A B-Tree reduces height by storing many keys per node.
Each node corresponds to a page-sized block of sorted separator keys. Instead of branching by 2, the tree branches by a large factor B, so the height becomes roughly:

h ≈ log_B n
That is the whole design goal: trade more work inside one node for fewer page hops between nodes.
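To see how much that base change buys, here is a small sketch that computes the approximate height needed to index a fixed number of keys at different branching factors (the key count and fanouts are illustrative choices):

```python
import math

# Approximate height needed to index n keys at a given branching factor.
# Each level multiplies the reachable range count by the fanout B,
# so the height is about log base B of n.
def approx_height(n: int, fanout: int) -> int:
    return math.ceil(math.log(n, fanout))

n = 100_000_000  # one hundred million keys (illustrative)
for fanout in (2, 64, 256):
    print(fanout, approx_height(n, fanout))
```

With fanout 2 the tree needs 27 levels; with fanout 256 it needs 4. Fewer levels means fewer page hops per lookup.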
Structural rules
Using the standard minimum-degree notation t:
- every node stores between t - 1 and 2t - 1 keys, except the root
- an internal node with k keys has k + 1 children
- every non-root internal node has between t and 2t children
- keys inside a node are sorted
- all leaves appear at the same depth
Those occupancy rules keep the tree balanced while preserving high fanout.
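Those rules can be checked mechanically. The sketch below is illustrative only: it assumes a node is a plain dict with a sorted "keys" list and a "children" list (empty for a leaf), which is not how a real engine lays out pages:

```python
# Validate the occupancy and depth rules for minimum degree t.
# Returns the subtree depth; raises AssertionError if a rule is violated.
def check(node, t, is_root=True):
    lo = 1 if is_root else t - 1            # the root may hold fewer keys
    assert lo <= len(node["keys"]) <= 2 * t - 1
    assert node["keys"] == sorted(node["keys"])
    if node["children"]:                    # internal: k keys, k + 1 children
        assert len(node["children"]) == len(node["keys"]) + 1
        depths = {check(c, t, is_root=False) for c in node["children"]}
        assert len(depths) == 1             # all leaves at the same depth
        return depths.pop() + 1
    return 0                                # a leaf has depth 0

# A tiny valid tree with t = 2: root [10, 20] over three leaves.
tree = {"keys": [10, 20], "children": [
    {"keys": [3, 7], "children": []},
    {"keys": [12, 15], "children": []},
    {"keys": [25, 30], "children": []},
]}
check(tree, t=2)
```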
Search
Search inside the current node, then descend into the child interval that contains the target key.
search(node, key):
find smallest i such that key <= node.keys[i]
(if no such index exists, set i to the number of keys)
if i is within node.keys and node.keys[i] == key:
return FOUND
if node is leaf:
return NOT_FOUND
return search(node.child[i], key)
In practice, the search inside one node is often a binary search over that node’s sorted key array.
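A runnable version of that search, using the same illustrative dict-based node layout as above (sorted "keys", "children" empty for a leaf) and a binary search within each node:

```python
import bisect

# Descend from the root, binary-searching each node's sorted key array.
def btree_search(node, key):
    i = bisect.bisect_left(node["keys"], key)  # smallest i with key <= keys[i]
    if i < len(node["keys"]) and node["keys"][i] == key:
        return True
    if not node["children"]:                   # leaf: nowhere left to descend
        return False
    return btree_search(node["children"][i], key)

# The example tree from the visualization above.
tree = {"keys": [10, 20], "children": [
    {"keys": [3, 7, 9], "children": []},
    {"keys": [12, 15, 18], "children": []},
    {"keys": [22, 25, 30], "children": []},
]}
btree_search(tree, 15)  # routes through the middle child
```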
Insertion
The clean insertion strategy is: never descend into a full child.
Split a full child
split_child(parent, i):
y <- parent.child[i]
z <- new node
move y.keys[t .. 2t-2] into z
median <- y.keys[t-1]
keep y.keys[0 .. t-2]
if y is not leaf:
move y.children[t .. 2t-1] into z
insert z as parent's child after y
insert median into parent.keys at position i
Insert into a non-full node
insert_non_full(node, key):
if node is leaf:
insert key into node.keys in sorted order
return
choose child i that should receive key
if child i is full:
split_child(node, i)
if key > node.keys[i]:
i <- i + 1
insert_non_full(node.child[i], key)
Full insertion
insert(tree, key):
if root is full:
newRoot <- new internal node
newRoot.child[0] <- old root
split_child(newRoot, 0)
root <- newRoot
insert_non_full(root, key)
The split pushes a separator upward and keeps the tree balanced automatically.
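The three routines above can be assembled into a compact runnable sketch. The dict-based node layout, the fixed minimum degree, and the in-order traversal used for verification are all illustrative assumptions, not a production design:

```python
import bisect

t = 2  # minimum degree (illustrative); nodes hold at most 2t - 1 = 3 keys

def new_node():
    return {"keys": [], "children": []}

def split_child(parent, i):
    y = parent["children"][i]
    z = new_node()
    z["keys"] = y["keys"][t:]              # upper t - 1 keys move to z
    median = y["keys"][t - 1]
    y["keys"] = y["keys"][: t - 1]         # lower t - 1 keys stay in y
    if y["children"]:
        z["children"] = y["children"][t:]  # upper t children move to z
        y["children"] = y["children"][:t]
    parent["children"].insert(i + 1, z)
    parent["keys"].insert(i, median)

def insert_non_full(node, key):
    if not node["children"]:               # leaf: place key in sorted order
        bisect.insort(node["keys"], key)
        return
    i = bisect.bisect_left(node["keys"], key)
    if len(node["children"][i]["keys"]) == 2 * t - 1:
        split_child(node, i)               # never descend into a full child
        if key > node["keys"][i]:
            i += 1
    insert_non_full(node["children"][i], key)

def insert(root, key):
    if len(root["keys"]) == 2 * t - 1:     # preemptively split a full root
        new_root = new_node()
        new_root["children"].append(root)
        split_child(new_root, 0)
        root = new_root
    insert_non_full(root, key)
    return root                            # the root may have been replaced

def inorder(node):
    if not node["children"]:
        return list(node["keys"])
    out = []
    for i, k in enumerate(node["keys"]):
        out += inorder(node["children"][i]) + [k]
    return out + inorder(node["children"][-1])

root = new_node()
for k in [10, 20, 5, 6, 12, 30, 7, 17]:
    root = insert(root, k)
inorder(root)  # keys come back in sorted order
```

Note that `insert` returns the (possibly new) root: splitting a full root is the only way this tree grows taller, which is why all leaves stay at the same depth.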
Deletion
Deletion is harder because nodes are not allowed to become too empty.
High-level cases:
- Delete from a leaf if the node still has enough keys afterward.
- Delete from an internal node by replacing the key with its predecessor or successor if a child can spare one.
- Borrow from a sibling if a target child is too small.
- Merge siblings if neither sibling can spare a key, then continue recursively.
The invariant is the mirror image of insertion: when descending, try not to enter a child that is already at the minimum occupancy.
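Those cases can be sketched in the same pseudocode style as insertion. This is a hedged skeleton of one common arrangement (the recursive style that fixes thin children on the way down), not the only correct formulation:

```
delete(node, key):
    if node is a leaf:
        remove key from node.keys if present
    else if key == node.keys[i] for some i:
        if node.child[i] has at least t keys:
            pred <- largest key in node.child[i]'s subtree
            node.keys[i] <- pred
            delete(node.child[i], pred)
        else if node.child[i+1] has at least t keys:
            succ <- smallest key in node.child[i+1]'s subtree
            node.keys[i] <- succ
            delete(node.child[i+1], succ)
        else:
            merge node.keys[i] and node.child[i+1] into node.child[i]
            delete(node.child[i], key)
    else:
        c <- child interval that contains key
        if c has only t - 1 keys:
            borrow a key from a sibling through the parent,
            or merge c with a sibling
        delete(c, key)
```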
Worked example
Suppose one page can hold about 200 keys.
Then one root page can branch to roughly 201 children, and each level below multiplies that reach by about 201 again. Two more levels below the root therefore cover about 201^3 ≈ 8.1 million leaf ranges.
That is why B-Trees are such a natural storage-engine structure: a tiny height can cover a huge ordered dataset.
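The arithmetic behind that claim, for the stated assumption of about 200 keys per page:

```python
# About 200 keys per page means about 201 children per page,
# so three levels of branching reach fanout ** 3 separator ranges.
fanout = 201
reach = fanout ** 3
print(f"{reach:,}")  # 8,120,601
```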
Complexity
| Operation | Time |
|---|---|
| Search | O(log_B n) page levels |
| Insert | O(log_B n) |
| Delete | O(log_B n) |
| Space | O(n) |
Within each page there is also a small search over the node’s key array, but the dominant cost in storage systems is usually page depth.
B-Tree vs. BST vs. B+ Tree
| Structure | Fanout | Best for |
|---|---|---|
| Binary search tree | 2 | In-memory ordered search |
| B-Tree | Large | Page-efficient ordered search |
| B+ Tree | Large, leaf-linked | Database indexes and range scans |
B-Trees are the conceptual bridge from classic search trees to real storage-engine indexes.
Key takeaways
- B-Trees exist to minimize page depth, not just comparison count.
- The defining operations are split on overflow and borrow/merge on underflow.
- High fanout is what makes the structure shallow enough for storage systems.
- Search, insert, and delete all stay logarithmic in the number of page levels.
- B+ Trees take the same page-aware idea and optimize it further for range scans and practical indexing.
Practice problems
| Problem | Difficulty | Key idea |
|---|---|---|
| B-Tree overview | Hard | Page-sized balanced search tree |
| PostgreSQL B-tree docs | Hard | Ordered indexes in a real engine |
| CLRS B-tree chapter notes | Hard | Split, insert, and delete invariants |
Relation to other topics
- Binary Search explains why sorted separators are enough to route a search.
- Binary Search Tree is the narrow-fanout in-memory ancestor of the same ordered-search idea.
- B+ Tree is the database-optimized variant most engines prefer for indexed scans.
- External Merge Sort sits nearby because both topics are shaped by page and disk costs.
Build on these first
These topics supply the mental model or underlying structure that this page assumes.
Tree Fundamentals
Understand roots, parents, children, depth, height, and traversal orders. A tree is a graph with enough structure to make many problems much simpler.
Binary Search
Halve the search space at every step. Binary search is not just for sorted arrays — it's a general technique for any monotonic predicate.
What this enables
Once the current idea feels natural, these are the most useful next jumps.
Related directions
These topics live nearby conceptually, even if they are not direct prerequisites.
Binary Search Tree (BST)
A sorted tree structure enabling O(log n) search, insert, and delete. Foundation for balanced trees, sets, and maps.
B+ Tree
Keep routing keys in internal nodes and all records in linked leaves so point lookups stay shallow and range scans stay fast.
External Merge Sort
Sort data larger than RAM by generating sorted runs and merging them with mostly sequential disk I/O.
LSM Tree
Trade read complexity for write throughput by buffering writes in memory, flushing sorted runs, and compacting them in the background.
More from Systems & storage
Stay in the same family when you want parallel variations of the same mental model.
B+ Tree
Keep routing keys in internal nodes and all records in linked leaves so point lookups stay shallow and range scans stay fast.
Cache Eviction Strategies (LRU, LFU, ARC, TinyLFU)
Treat cache eviction as an algorithm choice: recency, frequency, adaptation, and admission all optimize different workloads.
Consistent Hashing
Map keys onto a ring so cluster membership changes move only nearby keys instead of reshuffling the whole keyspace.
External Merge Sort
Sort data larger than RAM by generating sorted runs and merging them with mostly sequential disk I/O.
Paths that include this topic
Follow one of these sequences if you want a guided next step instead of open-ended browsing.
Ordered structures
See how order can come from sorted arrays, search trees, skip lists, disk-friendly indexes, heaps, and prefix trees.
Storage engines
Compare fanout-heavy indexes, write-optimized trees, and external-memory sorting in one storage-oriented path.