System Design Atlas

Learn production systems by mechanism and trade-off, not by memorizing one canned answer per interview prompt.

Deep dives

Production-focused write-ups that stay concrete about APIs, storage, and failure handling.

Families

Follow topics by system concern instead of treating every interview design as an isolated whiteboard prompt.

End-to-end designs

These pages optimize for production architecture, not just naming the right buzzwords in an interview.

Learning paths

Curated sequences keep the next deep technical topic obvious as the atlas expands.

Navigate the system design atlas

Search by mechanism, focus by family, and keep the current slice in the URL so it is easy to reopen the same learning thread later.

System design topics

10 matching topics.

Traffic management Foundation Advanced

Token Bucket, GCRA, and Virtual Time

Understand token-based rate limiting mathematically: saturated integrators, debt-space duals, and why token bucket and GCRA are the same policy in different coordinates.

Traffic management lives inside Distributed systems, so broader distributed-systems trade-offs still matter here.

Read now

Self-contained enough to open without another page first.

Learning paths

2 curated paths currently include this deep dive.

Unlocks

Designing a Rate Limiter (at Scale, Production-Grade)

token-bucket gcra virtual-time rate-limiting integrator traffic-shaping
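The claim that token bucket and GCRA are the same policy in different coordinates can be checked with a small sketch. It is an illustration only, not the deep dive's own derivation: the class names, the integer tick clock, and the parameters `interval` and `burst` are assumptions chosen so the comparison is exact. The bucket tracks remaining credit, GCRA tracks a theoretical arrival time (TAT), and the two are linked by credit = capacity - max(0, TAT - now).

```python
import random


class TokenBucket:
    """Token bucket with credit measured in ticks (one token == `interval` ticks)."""

    def __init__(self, interval, burst, now=0):
        self.interval = interval
        self.capacity = burst * interval
        self.credit = self.capacity  # start with a full bucket
        self.last = now

    def allow(self, now):
        # Refill for the elapsed time, saturating at capacity.
        self.credit = min(self.capacity, self.credit + (now - self.last))
        self.last = now
        if self.credit >= self.interval:
            self.credit -= self.interval
            return True
        return False


class Gcra:
    """GCRA: the same policy, stored as a theoretical arrival time (debt space)."""

    def __init__(self, interval, burst, now=0):
        self.interval = interval
        self.tolerance = (burst - 1) * interval
        self.tat = now

    def allow(self, now):
        if now < self.tat - self.tolerance:
            return False  # too far ahead of schedule
        self.tat = max(now, self.tat) + self.interval
        return True


def decisions(limiter, arrivals):
    return [limiter.allow(t) for t in arrivals]


rng = random.Random(7)
arrivals = sorted(rng.randrange(0, 2000) for _ in range(300))
same = decisions(TokenBucket(10, 5), arrivals) == decisions(Gcra(10, 5), arrivals)
print("identical decisions:", same)
```

GCRA needs a single stored timestamp per key, while the bucket stores a level plus a last-update time. That is one practical reason to prefer the GCRA coordinates when state lives in a shared store.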

Traffic management End-to-end design Advanced

Designing a Rate Limiter (at Scale, Production-Grade)

Design a limiter that is actually deployable: low-latency enforcement, burst handling, distributed quotas, multi-region coordination, and failure-safe behavior.

Traffic management lives inside Distributed systems, so broader distributed-systems trade-offs still matter here.

Read now

Self-contained enough to open without another page first.

Learning paths

3 curated paths currently include this deep dive.

Unlocks

Global Quotas (Hierarchical Budgets Across Regions and Fleets), Load Shedding (Protecting Latency Under Saturation), Feedback Control for Autoscaling and Load Shedding

rate-limiting token-bucket sliding-window redis multi-region control-plane

Traffic management End-to-end design Advanced

Global Quotas (Hierarchical Budgets Across Regions and Fleets)

Design worldwide quotas without putting a globally serialized dependency in the request path, using hierarchical allocation, leased budgets, and bounded overshoot.

Traffic management lives inside Distributed systems, so broader distributed-systems trade-offs still matter here.

Builds on

Designing a Rate Limiter (at Scale, Production-Grade), Distributed Locking (Leases, Fencing Tokens, and When Not to Use It)

Learning paths

3 curated paths currently include this deep dive.

global-quotas budget-allocation multi-region hierarchical-limits leases fairness

Traffic management End-to-end design Advanced

Load Shedding (Protecting Latency Under Saturation)

Design admission control that drops the right work at the right time, using concurrency, queue depth, cost, and priority instead of letting the service fail slowly.

Traffic management lives inside Distributed systems, so broader distributed-systems trade-offs still matter here.

Builds on

Circuit Breakers (State Machines, Hysteresis, and Fast Failure)

Learning paths

2 curated paths currently include this deep dive.

load-shedding admission-control concurrency brownout overload latency

Control plane End-to-end design Advanced

Feature Flags Control Plane (Versioning, Distribution, and Safe Rollouts)

Design a feature flag platform that supports low-latency local evaluation, strong auditability, deterministic targeting, and safe configuration rollouts across a fleet.

Control plane lives inside Distributed systems, so broader distributed-systems trade-offs still matter here.

Read now

Self-contained enough to open without another page first.

Learning paths

1 curated path currently includes this deep dive.

Unlocks

Global Quotas (Hierarchical Budgets Across Regions and Fleets)

feature-flags control-plane rollouts xds targeting configuration

Control plane Trade-off Advanced

Distributed Locking (Leases, Fencing Tokens, and When Not to Use It)

Design distributed locking with explicit guarantees, stale-owner protection, and realistic failure semantics instead of assuming a lock magically creates correctness.

Control plane lives inside Distributed systems, so broader distributed-systems trade-offs still matter here.

Read now

Self-contained enough to open without another page first.

Learning paths

1 curated path currently includes this deep dive.

Unlocks

Global Quotas (Hierarchical Budgets Across Regions and Fleets)

distributed-locking leases fencing consensus redis zookeeper
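The stale-owner protection named in this card is commonly implemented with fencing tokens. A minimal sketch follows, assuming a hypothetical in-memory `LockService` and `FencedStore`: the lock service hands out a strictly increasing token on every acquire, and the protected resource rejects any write carrying a token older than the newest one it has seen. Lease expiry itself is not modeled here.

```python
class LockService:
    """Hypothetical lock service: every grant carries a strictly increasing token."""

    def __init__(self):
        self._next_token = 0

    def acquire(self):
        self._next_token += 1
        return self._next_token


class FencedStore:
    """Hypothetical resource that enforces fencing on the write path."""

    def __init__(self):
        self.highest_token = 0
        self.value = None

    def write(self, token, value):
        # Reject writers whose grant is older than one already observed.
        if token < self.highest_token:
            return False
        self.highest_token = token
        self.value = value
        return True


locks = LockService()
store = FencedStore()

token_a = locks.acquire()  # client A gets the lock, then stalls (e.g. a long GC pause)
token_b = locks.acquire()  # A's lease is assumed expired; client B is granted the lock

b_ok = store.write(token_b, "written by B")
a_ok = store.write(token_a, "stale write from A")  # A wakes up, still believing it owns the lock
print(b_ok, a_ok, store.value)
```

The check lives in the resource, not in the client, so correctness does not depend on a paused client noticing that its lease has expired.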

Reliability Building block Advanced

Circuit Breakers (State Machines, Hysteresis, and Fast Failure)

Design circuit breakers that actually stabilize a fleet: rolling windows, half-open probes, dependency-scoped state, and clean interaction with retries and load shedding.

Reliability lives inside Distributed systems, so broader distributed-systems trade-offs still matter here.

Builds on

Idempotency and Retries (Without Multiplying Load)

Learning paths

1 curated path currently includes this deep dive.

Unlocks

Load Shedding (Protecting Latency Under Saturation), Feedback Control for Autoscaling and Load Shedding

circuit-breaker timeouts half-open hysteresis resilience dependency-isolation
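The closed, open, and half-open states mentioned in this card can be sketched as a small state machine. This is a simplified illustration under stated assumptions: it counts consecutive failures instead of using a rolling window, takes the clock as an explicit `now` argument, and the names `failure_threshold` and `cooldown` are hypothetical.

```python
class CircuitBreaker:
    """Minimal breaker: closed -> open after N consecutive failures,
    open -> half_open after a cooldown, half_open admits a single probe."""

    def __init__(self, failure_threshold=3, cooldown=30.0):
        self.failure_threshold = failure_threshold
        self.cooldown = cooldown
        self.state = "closed"
        self.failures = 0
        self.opened_at = 0.0

    def allow(self, now):
        if self.state == "open" and now - self.opened_at >= self.cooldown:
            self.state = "half_open"  # admit exactly one probe request
            return True
        # While half_open, further calls are refused until the probe reports back.
        return self.state == "closed"

    def record(self, success, now):
        if success:
            self.state = "closed"
            self.failures = 0
            return
        self.failures += 1
        if self.state == "half_open" or self.failures >= self.failure_threshold:
            self.state = "open"
            self.opened_at = now


breaker = CircuitBreaker(failure_threshold=3, cooldown=30.0)
for t in (0.0, 1.0, 2.0):
    breaker.allow(t)
    breaker.record(False, t)

print(breaker.state)        # state after three consecutive failures
print(breaker.allow(10.0))  # still inside the cooldown
print(breaker.allow(32.0))  # cooldown elapsed: one probe is admitted
```

A failed probe sends the breaker straight back to open with a fresh cooldown, which keeps a recovering dependency from being hit by the full request stream at once.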

Reliability Building block Advanced

Feedback Control for Autoscaling and Load Shedding

Use PI/PID ideas the way production systems actually do: filtered signals, clamped actions, weak predictive bias, and layered controllers instead of textbook loops.

Reliability lives inside Distributed systems, so broader distributed-systems trade-offs still matter here.

Builds on

Designing a Rate Limiter (at Scale, Production-Grade)

Learning paths

2 curated paths currently include this deep dive.

Unlocks

Load Shedding (Protecting Latency Under Saturation), Anti-Windup, Hysteresis, and Oscillation in Distributed Control Loops

feedback-control pid pi autoscaling load-shedding ewma

Reliability Building block Advanced

Idempotency and Retries (Without Multiplying Load)

Build a retry stack that survives crashes, duplicate delivery, and partial completion without turning transient failure into write amplification and data corruption.

Reliability lives inside Distributed systems, so broader distributed-systems trade-offs still matter here.

Read now

Self-contained enough to open without another page first.

Learning paths

1 curated path currently includes this deep dive.

Unlocks

Circuit Breakers (State Machines, Hysteresis, and Fast Failure)

idempotency retries exactly-once backoff outbox deduplication

Reliability Trade-off Advanced

Anti-Windup, Hysteresis, and Oscillation in Distributed Control Loops

Stabilize real control loops under delay and saturation: clamp integrators, separate thresholds, detect oscillation cheaply, and adapt gains before the system starts flapping.

Reliability lives inside Distributed systems, so broader distributed-systems trade-offs still matter here.

Builds on

Feedback Control for Autoscaling and Load Shedding

Learning paths

2 curated paths currently include this deep dive.

anti-windup hysteresis oscillation control-loops autoscaling stability
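"Clamp integrators" can be illustrated with a PI controller that uses conditional integration, one common anti-windup scheme. The sketch below is an assumption-laden example and not the deep dive's own controller: the class name `ClampedPI`, the gains, and the output range are hypothetical. The integrator only accepts a new value when the output is unsaturated or the error would pull the output back toward the allowed range.

```python
class ClampedPI:
    """PI controller with a saturated output and conditional integration."""

    def __init__(self, kp, ki, out_min, out_max):
        self.kp = kp
        self.ki = ki
        self.out_min = out_min
        self.out_max = out_max
        self.integral = 0.0

    def step(self, error, dt):
        candidate = self.integral + error * dt
        raw = self.kp * error + self.ki * candidate
        output = min(self.out_max, max(self.out_min, raw))
        saturated_high = raw > self.out_max
        saturated_low = raw < self.out_min
        # Accept the new integral only if it does not push deeper into saturation.
        if (not saturated_high and not saturated_low) \
                or (saturated_high and error < 0) \
                or (saturated_low and error > 0):
            self.integral = candidate
        return output


pi = ClampedPI(kp=1.0, ki=1.0, out_min=0.0, out_max=10.0)

# A long stretch of positive error pins the output at its upper limit.
for _ in range(100):
    saturated_output = pi.step(error=5.0, dt=1.0)
integral_after_saturation = pi.integral

# When the error flips sign, the output leaves saturation on the very next step.
recovery_output = pi.step(error=-1.0, dt=1.0)
print(saturated_output, integral_after_saturation, recovery_output)
```

Without the conditional update, the integral would have grown to 500 during the saturated stretch, and the output would have stayed pinned at the limit long after the error reversed.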


Browse by family: All families, Traffic management, Control plane, Reliability

Curated paths: Traffic control core, Global policy enforcement, Control loops and stability
