Skip to main content
Back to the DSA atlas
Strings Algorithm Intermediate

KMP (Knuth-Morris-Pratt)

Search for a pattern in linear time by reusing information about its own prefixes instead of restarting after mismatches.

stringpattern-matchingprefix-functionkmp

Interactive visualization

Use the controls to connect the idea to concrete operations before diving into the full write-up.

Step 1 / 6

Pattern and prefix table

a

lps = 0

b

lps = 0

a

lps = 1

b

lps = 2

d

lps = 0

Text scan

ababcabcabababd

Fallback without rescanning

Start matching from the beginning of the pattern.

Family

Strings

Prefix-aware structures and text-centric search strategies.

Builds on

Foundational

You can start here without another page first.

Unlocks

2 next topics

Use these follow-ups when you want to apply the current idea in a richer setting.

Learning paths

1

This topic appears in curated progressions so you can keep moving without guessing the next step.

Problem

You need to find a pattern inside a text.

The naive approach restarts from scratch after every mismatch, which can lead to repeated work. KMP avoids that waste.

Core idea

Precompute the longest proper prefix which is also a suffix for every pattern prefix.

This array is often called:

  • lps
  • prefix function
  • failure function

When a mismatch happens after matching j characters, KMP does not restart at j = 0 immediately. It jumps to lps[j - 1] and reuses what the pattern already knows about itself.

Why it works

Suppose you matched abab and then fail on the next character.

You already know the suffix ab is also a prefix of abab, so there is no reason to re-check those characters from scratch. KMP folds that fact into the prefix table.

Prefix table build

lps[0] <- 0
j <- 0

for i in 1..m-1:
    while j > 0 and pattern[i] != pattern[j]:
        j <- lps[j - 1]
    if pattern[i] == pattern[j]:
        j <- j + 1
    lps[i] <- j
j <- 0
for i in 0..n-1:
    while j > 0 and text[i] != pattern[j]:
        j <- lps[j - 1]
    if text[i] == pattern[j]:
        j <- j + 1
    if j == m:
        report match
        j <- lps[j - 1]

Complexity

PhaseTime
Build prefix tableO(m)O(m)
Search textO(n)O(n)

Overall: O(n+m)O(n + m).

Why it matters

KMP is a foundational string algorithm because it teaches a major pattern:

preprocess the pattern so mismatches become structured jumps instead of blind restarts

That same idea shows up again in Z-algorithm and Aho-Corasick.

Key takeaways

  • KMP turns repeated restarts into controlled prefix-table jumps
  • The lps array captures self-similarity inside the pattern
  • It is a classic linear-time single-pattern matcher
  • Even when you do not implement KMP directly, understanding it sharpens how you think about string reuse

Practice problems

ProblemDifficultyKey idea
Find the Index of the First Occurrence in a String🟡 MediumKMP is a linear-time solution
Repeated Substring Pattern🟡 MediumPrefix-function reasoning
Longest Happy Prefix🔴 HardDirect use of prefix table

Relation to other topics

  • Z-Algorithm solves a similar prefix-reuse problem with a different table
  • Aho-Corasick generalizes failure-style jumps from one pattern to many patterns

What this enables

Once the current idea feels natural, these are the most useful next jumps.

Related directions

These topics live nearby conceptually, even if they are not direct prerequisites.

More from Strings

Stay in the same family when you want parallel variations of the same mental model.

Paths that include this topic

Follow one of these sequences if you want a guided next step instead of open-ended browsing.

String matching

Move from prefix reuse and rolling hashes to suffix-based indexing and automata.

From the blog

Pair the atlas with recent writing from the rest of the site when you want a broader engineering perspective alongside the topic graph.