KMP (Knuth-Morris-Pratt)
Search for a pattern in linear time by reusing information about its own prefixes instead of restarting after mismatches.
Interactive visualization
Use the controls to connect the idea to concrete operations before diving into the full write-up.
Step 1 / 6
Pattern and prefix table
a
lps = 0
b
lps = 0
a
lps = 1
b
lps = 2
d
lps = 0
Text scan
Fallback without rescanning
Start matching from the beginning of the pattern.
Family
Strings
Prefix-aware structures and text-centric search strategies.
Builds on
Foundational
You can start here without another page first.
Unlocks
2 next topics
Use these follow-ups when you want to apply the current idea in a richer setting.
Learning paths
1
This topic appears in curated progressions so you can keep moving without guessing the next step.
Problem
You need to find a pattern inside a text.
The naive approach restarts from scratch after every mismatch, which can lead to repeated work. KMP avoids that waste.
Core idea
Precompute the longest proper prefix which is also a suffix for every pattern prefix.
This array is often called:
lps- prefix function
- failure function
When a mismatch happens after matching j characters, KMP does not restart at j = 0 immediately. It jumps to lps[j - 1] and reuses what the pattern already knows about itself.
Why it works
Suppose you matched abab and then fail on the next character.
You already know the suffix ab is also a prefix of abab, so there is no reason to re-check those characters from scratch. KMP folds that fact into the prefix table.
Prefix table build
lps[0] <- 0
j <- 0
for i in 1..m-1:
while j > 0 and pattern[i] != pattern[j]:
j <- lps[j - 1]
if pattern[i] == pattern[j]:
j <- j + 1
lps[i] <- j
Search
j <- 0
for i in 0..n-1:
while j > 0 and text[i] != pattern[j]:
j <- lps[j - 1]
if text[i] == pattern[j]:
j <- j + 1
if j == m:
report match
j <- lps[j - 1]
Complexity
| Phase | Time |
|---|---|
| Build prefix table | |
| Search text |
Overall: .
Why it matters
KMP is a foundational string algorithm because it teaches a major pattern:
preprocess the pattern so mismatches become structured jumps instead of blind restarts
That same idea shows up again in Z-algorithm and Aho-Corasick.
Key takeaways
- KMP turns repeated restarts into controlled prefix-table jumps
- The
lpsarray captures self-similarity inside the pattern - It is a classic linear-time single-pattern matcher
- Even when you do not implement KMP directly, understanding it sharpens how you think about string reuse
Practice problems
| Problem | Difficulty | Key idea |
|---|---|---|
| Find the Index of the First Occurrence in a String | 🟡 Medium | KMP is a linear-time solution |
| Repeated Substring Pattern | 🟡 Medium | Prefix-function reasoning |
| Longest Happy Prefix | 🔴 Hard | Direct use of prefix table |
Relation to other topics
- Z-Algorithm solves a similar prefix-reuse problem with a different table
- Aho-Corasick generalizes failure-style jumps from one pattern to many patterns
What this enables
Once the current idea feels natural, these are the most useful next jumps.
Related directions
These topics live nearby conceptually, even if they are not direct prerequisites.
Rabin-Karp
Search with rolling hashes so most windows can be rejected with arithmetic instead of full character comparisons.
Z-Algorithm
Compute how much of the global prefix matches at every position, then reuse the active Z-box to stay linear.
Aho-Corasick
Match many patterns in one pass by augmenting a trie with failure links. This is multi-pattern search done at scale.
Suffix Array
Sort all suffixes of a string once, then answer many substring questions by binary searching the sorted order.
More from Strings
Stay in the same family when you want parallel variations of the same mental model.
Rabin-Karp
Search with rolling hashes so most windows can be rejected with arithmetic instead of full character comparisons.
Trie (Prefix Tree)
A tree-like data structure for efficient string storage and prefix-based retrieval. Essential for autocomplete, spell checking, and word search problems.
Z-Algorithm
Compute how much of the global prefix matches at every position, then reuse the active Z-box to stay linear.
Aho-Corasick
Match many patterns in one pass by augmenting a trie with failure links. This is multi-pattern search done at scale.
Paths that include this topic
Follow one of these sequences if you want a guided next step instead of open-ended browsing.
String matching
Move from prefix reuse and rolling hashes to suffix-based indexing and automata.
From the blog
Pair the atlas with recent writing from the rest of the site when you want a broader engineering perspective alongside the topic graph.