Skip to main content
Back to the DSA atlas
Strings Algorithm Advanced

Aho-Corasick

Match many patterns in one pass by augmenting a trie with failure links. This is multi-pattern search done at scale.

stringtriepattern-matchingrare

Interactive visualization

Use the controls to connect the idea to concrete operations before diving into the full write-up.

Reading index 0

Patterns stored in one trie with failure links

he
she
his
hers

Scan text once

ushers

Matches found so far

No completed pattern yet.

Many patterns, one pass

Aho-Corasick starts from a trie, then adds failure links so mismatches can fall back without restarting from scratch. That lets one scan of the text report every pattern occurrence.

Family

Strings

Prefix-aware structures and text-centric search strategies.

Builds on

1 topic

Read these first if you want the surrounding context.

Unlocks

0 next topics

Use these follow-ups when you want to apply the current idea in a richer setting.

Learning paths

1

This topic appears in curated progressions so you can keep moving without guessing the next step.

Problem

You have many patterns and one big text.

Examples:

  • a blocklist of forbidden words
  • many DNA motifs
  • many keywords to detect in a document stream

Running KMP or a naive scan for every pattern separately is too expensive. You want one pass over the text to find all matches.

Core idea

Start with a trie containing every pattern.

Then add failure links:

  • if the current edge does not exist
  • jump to the longest proper suffix that is also a trie prefix
  • continue from there instead of restarting from the root manually

This makes the trie behave like a finite automaton for many strings at once.

Why it is powerful

Each character of the text is processed once through a sequence of trie moves and occasional failure jumps. The fallback work is amortized, so the whole scan stays linear in the text length plus the number of matches.

That is the big win:

many patterns, one automaton, one scan

Construction outline

  1. insert every pattern into a trie
  2. perform a BFS over trie nodes
  3. compute a failure link for each node
  4. propagate output information so suffix matches are also reported

The BFS step is what turns the trie into the final matcher.

Complexity

If the total pattern length is MM and the text length is NN:

PhaseTime
Build trie + failure linksO(M)O(M) or O(Malphabet)O(M \cdot alphabet) depending on implementation
Scan textO(N+matches)O(N + matches)

When to use it

Use Aho-Corasick when:

  • there are many patterns
  • the text is large or streamed
  • you need every occurrence, not just one pattern at a time

It is uncommon in everyday interview prep, but very important in real systems like filters, lexers, intrusion detection, and bioinformatics pipelines.

Key takeaways

  • Aho-Corasick is a trie plus failure links
  • It is the standard answer to multi-pattern matching
  • BFS is part of the construction, which is a nice cross-family connection: string matching built with trie structure and graph-style traversal
  • This is rare for beginner curricula, but it belongs in a serious string toolbox

Practice problems

ProblemDifficultyKey idea
Stream of Characters🔴 HardTrie-style streaming queries, close in spirit to Aho-Corasick
Aho-Corasick overview🔴 HardBuild failure links and output transitions
Keyword Search references🔴 HardMulti-pattern text scanning

Relation to other topics

  • Trie is the base structure that stores all patterns together
  • BFS often appears in the construction phase for failure links
  • This is the natural “what comes next” topic after someone understands tries well

Build on these first

These topics supply the mental model or underlying structure that this page assumes.

Related directions

These topics live nearby conceptually, even if they are not direct prerequisites.

More from Strings

Stay in the same family when you want parallel variations of the same mental model.

Paths that include this topic

Follow one of these sequences if you want a guided next step instead of open-ended browsing.

String matching

Move from prefix reuse and rolling hashes to suffix-based indexing and automata.

From the blog

Pair the atlas with recent writing from the rest of the site when you want a broader engineering perspective alongside the topic graph.