The 30 Must-Know Concepts

The Meeting Where Everyone Spoke a Secret Language

Your first architecture review at the new job. You have written solid backend code for six years, so you walk in confident.

Eleven minutes in, you have heard: "the consumer group rebalances and we lose ordering", "that read is eventually consistent so the replica might lie to you", "just put a Bloom filter in front of it", "no, Redlock won't survive a GC pause".

You understood every individual English word. You understood none of the sentences. And the worst part: when the lead asked for your thoughts, you said the code looked clean. The code. In an architecture review.

Here is the secret nobody tells you. The people in that room are not smarter than you. They just have a map. Every scary term slots into a small number of territories, and once you know the territories, a new term stops being terrifying. It is just a new village in a region you already know.

The biggest mistake you can make right now is trying to learn these concepts one by one, in order, like reading a dictionary from A to Z. If you try to deeply understand Kafka before you know what a basic message queue is, you will drown. This lesson hands you the map first. Depth comes later. That is literally what the next 19 weeks are for.

Why Should You Care?

  1. Interviews test connections, not definitions. "Why does your caching choice affect your consistency story?" is a map question. Candidates with 30 isolated flashcards fail it. Candidates with a connected map answer it casually.
  2. Every later lesson lands softer. When you reach Week 6 and meet cache stampedes, your brain will already have a shelf labeled "caching problems" to put it on. Pre-built shelves are how fast learners learn fast.
  3. You stop being bluffable. Vendors, blog posts, and overconfident teammates use jargon as a stand-in for authority. With the map, you can ask the one question that matters: which problem, in which territory, does this actually solve?

🟢 The Simple Version: The Map Has Four Territories

[Figure: The 30 must-know concepts arranged as a mental map around system design. Nodes: Networking (DNS, LB, proxies); APIs (REST, gRPC, auth); Caching (CDN, Redis, TTL); Databases (SQL, NoSQL, sharding); Distributed Systems (consensus, clocks); Async Messaging (queues, pub/sub); Scale Patterns (replication, partitioning); Operations (deploys, observability). Don't deep dive yet. Build the map so nothing feels alien later.]

Think of the entire field as the Indian Railways network. One of the largest, most chaotic, yet remarkably functional systems in the world. Every system design concept you will ever meet does one of four jobs in that network.

Territory 1: Networking (the tracks and signals)

How data physically travels from A to B. If the tracks are broken, nothing else matters. In the railway, this is the steel tracks, the signals, and the routing that stops two trains from colliding.

Lives here: DNS (the phonebook turning swiggy.com into an IP), TCP vs UDP (guaranteed delivery vs throw-and-hope), load balancers (the traffic cops), CDNs (local warehouses so Mumbai users don't fetch files from Virginia), and proxies (who is being hidden, the client or the server?).
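To make the "traffic cop" idea concrete, here is a minimal sketch of round-robin load balancing that skips servers marked unhealthy. The class name, the server names, and the manual `mark_down` call are all assumptions for illustration; a real load balancer discovers dead servers through health checks rather than being told.

```python
class RoundRobinBalancer:
    """Toy load balancer: rotate through servers, skipping unhealthy ones."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._index = 0

    def mark_down(self, server):
        # In a real system a failed health check would trigger this.
        self.healthy.discard(server)

    def mark_up(self, server):
        if server in self.servers:
            self.healthy.add(server)

    def pick(self):
        # Try each server at most once, starting after the previous pick.
        for _ in range(len(self.servers)):
            server = self.servers[self._index]
            self._index = (self._index + 1) % len(self.servers)
            if server in self.healthy:
                return server
        raise RuntimeError("no healthy servers")


lb = RoundRobinBalancer(["app-1", "app-2", "app-3"])  # hypothetical server names
first_round = [lb.pick() for _ in range(3)]   # ['app-1', 'app-2', 'app-3']
lb.mark_down("app-2")
after_failure = [lb.pick() for _ in range(4)]  # ['app-1', 'app-3', 'app-1', 'app-3']
```

Traffic keeps flowing after `app-2` goes down; the clients never learn that anything happened. That is the whole job.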

Territory 2: APIs & Communication (the languages and timetables)

If networking is the tracks, this territory is the station masters and the ticketing rules. Who boards, when, speaking which language.

Lives here: REST, GraphQL, gRPC (three dialects for asking services for things), API gateways and rate limiting (the bouncers at the door), webhooks, WebSockets and SSE (ways to push instead of poll), message queues (a waiting room where tasks sit until a worker is free), pub/sub (a broadcasting station, everyone interested listens), and idempotency (tapping "Pay" twice must not charge twice).
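The difference between a message queue and pub/sub is easier to see than to describe. The sketch below is an in-memory toy, with class names and event names invented for illustration; real brokers add persistence, acknowledgements, and redelivery on top of this basic shape.

```python
from collections import deque


class TaskQueue:
    """Message queue: each task is handed to exactly one worker."""

    def __init__(self):
        self._tasks = deque()

    def publish(self, task):
        self._tasks.append(task)

    def consume(self):
        # Whoever asks first wins the task; later callers get nothing.
        return self._tasks.popleft() if self._tasks else None


class Topic:
    """Pub/sub: every subscriber receives its own copy of each event."""

    def __init__(self):
        self._inboxes = {}

    def subscribe(self, name):
        self._inboxes[name] = []
        return self._inboxes[name]

    def publish(self, event):
        for inbox in self._inboxes.values():
            inbox.append(event)


queue = TaskQueue()
queue.publish("resize-image-42")
worker_a = queue.consume()   # 'resize-image-42'
worker_b = queue.consume()   # None: the task was already taken

topic = Topic()
restaurant = topic.subscribe("restaurant")
analytics = topic.subscribe("analytics")
topic.publish("order-placed")
# restaurant == ['order-placed'] and analytics == ['order-placed']
```

Same "publish" verb, opposite delivery promise. When someone says "put it on a queue", this is the first thing to clarify.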

Territory 3: Databases & Storage (the vaults and freight yards)

The heaviest territory. Where data lives and how it survives growth. In the railway, these are the yards where trains are parked, maintained, and logged.

Lives here: the eternal SQL vs NoSQL debate (rigid reliable ledgers vs flexible scale), ACID (the promises a transaction makes), indexing with B-trees and LSM trees (how databases find things fast), replication (copies for safety and read scale), sharding (chopping data when one machine is not enough), consistent hashing (the elegant math of deciding which shard), and caching (keeping hot answers close because computing them again is expensive).
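Consistent hashing sounds abstract until you watch keys move. The following is a minimal sketch of a hash ring with virtual nodes; the shard names, the choice of SHA-256, and the 100 virtual nodes per shard are assumptions for illustration, not a recommendation.

```python
import bisect
import hashlib


def stable_hash(value: str) -> int:
    # hashlib gives the same answer on every run, unlike Python's built-in hash().
    return int(hashlib.sha256(value.encode()).hexdigest(), 16)


class HashRing:
    def __init__(self, nodes, vnodes=100):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (position, node)
        for node in nodes:
            self.add(node)

    def add(self, node):
        # Each node owns many points on the ring so load spreads evenly.
        for i in range(self.vnodes):
            bisect.insort(self._ring, (stable_hash(f"{node}#{i}"), node))

    def lookup(self, key):
        # Walk clockwise to the first point at or after the key's position.
        idx = bisect.bisect(self._ring, (stable_hash(key), ""))
        return self._ring[idx % len(self._ring)][1]


keys = [f"user-{i}" for i in range(10_000)]
ring = HashRing(["shard-a", "shard-b", "shard-c", "shard-d"])
before = {k: ring.lookup(k) for k in keys}

ring.add("shard-e")
after = {k: ring.lookup(k) for k in keys}

moved = [k for k in keys if before[k] != after[k]]
moved_fraction = len(moved) / len(keys)  # roughly 1/5 with five shards
```

Every key that moves lands on the new shard, and the rest stay put. Compare that with the naive `hash(key) % N`: going from 4 to 5 shards reassigns about 80% of uniformly distributed keys, which is the problem the ring exists to avoid.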

Territory 4: Distributed Systems (the coordination layer)

The art of making many machines act as one, and surviving the chaos when they disagree. In the railway, this is keeping the network running when the monsoon floods a track in Mumbai while a train derails outside Delhi.

Lives here: the CAP theorem (the forced choice when the network breaks), consistency models (how stale is acceptable?), consensus and Raft (electing a leader and agreeing on one truth as long as a majority of the cluster is still alive), distributed locks (scarier than they look), sagas and distributed transactions (multi-step operations across services), and Bloom filters with their probabilistic cousins (fast approximate answers at planetary scale).
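A Bloom filter is the friendliest concept in this territory to try by hand. This is a minimal sketch, assuming SHA-256 with a seed prefix as the hash family and one byte per bit for readability; the sizes and the usernames are made up for illustration.

```python
import hashlib


class BloomFilter:
    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits)  # one byte per bit, for clarity

    def _positions(self, item):
        # Derive several bit positions per item by seeding the hash.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # False means "definitely not seen". True means "probably seen".
        return all(self.bits[pos] for pos in self._positions(item))


taken_usernames = BloomFilter()
members = [f"user-{i}" for i in range(50)]
for name in members:
    taken_usernames.add(name)

all_members_found = all(taken_usernames.might_contain(n) for n in members)
false_positives = sum(
    taken_usernames.might_contain(f"stranger-{i}") for i in range(1000)
)
```

The filter never forgets a member, so a "no" is always trustworthy. A "yes" can occasionally be wrong, which is why it sits in front of the real lookup instead of replacing it.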

And two strips on the map's edge

  • Scale patterns: replication, partitioning, fan-out. Recurring moves that show up in every territory.
  • Operations: deployments, observability, circuit breakers. How you ship and run all of the above without losing your weekends.

🟡 Going Deeper: One Request Touches Half the Map

The territories are not isolated islands. Every real request stitches them together. Trace one Swiggy order and watch the map light up:

[Figure: One Swiggy order, nine concepts. One food-delivery request travels through DNS, load balancer, gateway, service, cache, database, and queue, touching every territory of the map.]

  1. Your phone asks DNS where api.swiggy.com lives (Networking)
  2. The request hits a load balancer, which picks a healthy server (Networking)
  3. The API gateway checks your auth token and your rate limit (APIs)
  4. The order service handles the REST call (APIs)
  5. The menu price comes from Redis cache, one database read avoided (Databases)
  6. The order row is written to a sharded Postgres inside a transaction. ACID earning its keep (Databases)
  7. An "order placed" event lands on a queue for the restaurant app, notifications, and analytics. Three consumers, so this is pub/sub (APIs/Distributed)
  8. Payment runs with an idempotency key, because Jio networks make people double-tap (APIs/Distributed)
  9. If the payment gateway hangs, a circuit breaker fails fast instead of dragging the whole site down (Operations)

Nine concepts, one biryani. This is why the map matters. In an interview, "design Swiggy" really means: walk this path and defend each stop.
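Stop 8 is the one interviewers probe hardest, so here is a minimal sketch of what an idempotency key buys you. Everything in it is hypothetical: the class, the in-memory dictionary, and the key format. A production version would need a durable store and an atomic check-and-insert so that two simultaneous retries cannot both slip through.

```python
class PaymentService:
    """Toy payment handler that remembers results by idempotency key."""

    def __init__(self):
        self._results = {}  # idempotency key -> stored response
        self.charges = []   # every real charge actually made

    def pay(self, idempotency_key, user, amount):
        # A retry with a known key returns the original result, no new charge.
        if idempotency_key in self._results:
            return self._results[idempotency_key]

        self.charges.append((user, amount))
        result = {"status": "charged", "charge_id": len(self.charges)}
        self._results[idempotency_key] = result
        return result


service = PaymentService()
first_tap = service.pay("order-981-attempt", "asha", 250)
second_tap = service.pay("order-981-attempt", "asha", 250)  # the double-tap

charged_once = len(service.charges) == 1   # True
same_response = first_tap == second_tap    # True
```

The client generates the key once per intended payment and reuses it on every retry. The server's only job is to recognize it.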

The 30, As One Table

Your skim-first checklist. One line each. Definitions now, depth in the coming weeks.

# | Concept | One-liner | Territory
1 | DNS | Name to IP phonebook, cached everywhere with TTLs | Networking
2 | TCP vs UDP | Guaranteed-ordered vs fast-and-lossy delivery | Networking
3 | Load balancing | Spread traffic; detect and route around dead servers | Networking
4 | CDN | Cache content near users; physics sets the latency floor | Networking
5 | Proxy / reverse proxy | Hide the client / hide the servers | Networking
6 | REST | Resources + verbs + status codes over HTTP | APIs
7 | GraphQL | Client picks the response shape; server pays for it | APIs
8 | gRPC | Binary, contract-first, internal service-to-service speed | APIs
9 | API gateway | One front door: auth, routing, rate limits | APIs
10 | Rate limiting | Token bucket / sliding window; protect the backend | APIs
11 | AuthN vs AuthZ | Who are you vs what may you do; sessions vs JWT | APIs
12 | Message queue | Task waiting room; one consumer wins each task | APIs
13 | Pub/sub | Broadcast; every subscriber gets the event | APIs
14 | WebSockets / SSE / polling | Three ways to get live updates, three cost profiles | APIs
15 | Idempotency | Same request twice = same effect once | APIs
16 | SQL vs NoSQL | Joins and transactions vs flexible horizontal scale | Databases
17 | ACID | Atomic, consistent, isolated, durable. The transaction promise | Databases
18 | Indexing | Pre-sorted lookup structures; every index taxes writes | Databases
19 | B-tree vs LSM | Read-optimized vs write-optimized storage engines | Databases
20 | Replication | Copies of data; lag is the price | Databases
21 | Sharding | Split data across machines; shard key choice is destiny | Databases
22 | Consistent hashing | Add or remove a node, move only 1/N of the keys | Databases
23 | Caching | Keep hot answers close; invalidation is the hard part | Databases
24 | CAP theorem | During a partition: consistency or availability, pick one | Distributed
25 | Consistency models | Strong, causal, read-your-writes, eventual | Distributed
26 | Consensus (Raft) | Many machines agreeing on one truth, despite failures | Distributed
27 | Distributed locks & transactions | Coordination across machines; sagas over 2PC | Distributed
28 | Bloom filters & friends | Tiny memory, approximate answers, false positives allowed | Distributed
29 | Observability | Logs, metrics, traces. Seeing inside the running system | Operations
30 | Resilience patterns | Circuit breakers, retries, bulkheads. Failing gracefully | Operations

Do not memorize this table. Bookmark it. Each row is a future lesson. Today's job is only this: none of these names should feel alien anymore.

🔴 Architect's Corner: How to Actually Use the Map

Concepts Cluster Into Recurring Arguments

Senior engineers do not think in 30 separate concepts. They think in a handful of arguments that repeat across territories:

  • The freshness argument. Caching, replication lag, consistency models, CDN TTLs. All of them are the same fight: how stale is acceptable, in exchange for what gain? Learn it once and you will recognize it everywhere.
  • The coordination argument. Distributed locks, consensus, transactions, idempotency. All of them ask: how do machines agree? And the recurring senior answer is to avoid needing agreement at all. Idempotency over locks. Sagas over 2PC.
  • The capacity argument. Load balancing, sharding, consistent hashing, queues. The work does not fit one machine, so split it, then manage what the split breaks.
  • The blast-radius argument. Circuit breakers, bulkheads, rate limits, dead letter queues. When something fails (not if, when), how far does the damage spread?

When an interviewer pushes you with "okay, now what if the cache dies?", they are testing whether you see the freshness and blast-radius arguments. Not whether you memorized Redis commands.
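The blast-radius argument is easiest to defend when you can sketch a circuit breaker from memory. The version below is a deliberately small one, with the thresholds, the injected clock, and the `flaky_gateway` function all invented for illustration. Real libraries add more nuance around the half-open state and concurrency.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker refuses to call the dependency."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout:
                raise CircuitOpenError("failing fast; dependency marked unhealthy")
            # Timeout elapsed: half-open, let one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result


now = [0.0]  # fake clock so the example does not need to sleep
breaker = CircuitBreaker(failure_threshold=2, reset_timeout=10.0, clock=lambda: now[0])
gateway_calls = []


def flaky_gateway():
    gateway_calls.append(1)
    raise TimeoutError("payment gateway hung")


outcomes = []
for _ in range(4):
    try:
        breaker.call(flaky_gateway)
    except CircuitOpenError:
        outcomes.append("fast-fail")
    except TimeoutError:
        outcomes.append("timeout")
# outcomes == ['timeout', 'timeout', 'fast-fail', 'fast-fail']
```

After two timeouts the breaker stops calling the gateway at all, so the third and fourth requests fail in microseconds instead of tying up a thread. Once the reset timeout passes, one trial call is allowed through to check whether the dependency has recovered.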

Learn in Priority Tiers, Not Alphabetical Order

[Figure: The 30 concepts arranged in High / Medium / Low priority learning tiers]

  • High (learn cold, weeks 1 to 8): load balancing, caching, SQL vs NoSQL, sharding, replication, CAP, consistency models, consistent hashing, queues vs pub/sub, idempotency, REST. These appear in every interview and every real design.
  • Medium (working knowledge, weeks 9 to 14): gRPC and GraphQL, rate limiting, auth flows, B-tree vs LSM, consensus, sagas, Bloom filters, observability, deployment patterns. You should explain these confidently. You do not need to implement them at 2 AM.
  • Low (one-paragraph knowledge, on demand): vector clock internals, Paxos proofs, erasure coding, CRDTs, 3PC. Know the one-line use case and when to dig deeper.

Going deep on everything is how people quit in week 3. The tier list is a pacing strategy, not a ranking of importance. Finishing the path beats perfecting chapter 3.

The Map Is Also a Diagnosis Tool

Production incident triage is mostly territory identification:

  • "Site slow for everyone" → start in Networking (LB health, DNS, CDN) before blaming code.
  • "Slow for exactly one big customer" → Databases (hot shard? an index that does not fit their data shape?).
  • "Payments occasionally duplicated" → Distributed (idempotency, retries, queue redelivery).
  • "Fast in staging, dies at 6 PM" → the capacity argument (pool exhaustion, cache hit rate collapsing under real traffic).

Engineers without the map grep logs at random. Engineers with the map binary-search the territories.

Common Mistakes

1. "I'll learn each concept fully before moving to the next." Dictionary-style learning drowns you, because the concepts reference each other in a circle. Consistency needs replication, replication needs CAP, CAP needs consistency. Skim the full map first. Depth on the second pass. That is this entire path's design.

2. "More concepts = more senior." Seniority is connecting them. A mid-level engineer name-drops Kafka. A senior explains why a Postgres jobs table beats Kafka for this team at this scale. Interviewers downgrade jargon that does not connect to constraints.

3. "I know the definition, so I know the concept." The 4-column test from this path's tracker: can you define it, explain how it works, say when it breaks, and argue the tradeoff? Definitions are column one of four. Most people stop there, and that is why most people freeze in round two.

4. "New tech means new territories." The map is stable. Products churn. Kafka, RabbitMQ, SQS and NATS are all queue/pub-sub villages in the same region. When the next hot tool launches, your first question is "which village does this replace?", and suddenly the hype is readable.

🧠 Key Takeaways

  • Four territories hold every concept: Networking (data travels), APIs (services talk), Databases (data lives), Distributed Systems (machines agree). Plus scale patterns and operations at the edges.
  • One real request touches half the map. Interviews exploit this. Trace a Swiggy order and defend each stop.
  • Concepts cluster into recurring arguments: freshness, coordination, capacity, blast radius. Learn the argument once and recognize it in every costume.
  • Learn in priority tiers, not alphabetically. High tier cold, medium tier conversational, low tier on demand.
  • The map doubles as incident triage. Identify the territory before grepping logs.
  • Do not memorize the 30 today. Make them un-alien today. The next 19 weeks add the depth.

Think About It

  1. Trace a UPI payment (you scan a QR, money moves, both phones buzz) through the map the way this lesson traced the Swiggy order. Which territories does it touch that the food order did not? Which concept does the "both phones buzz" part depend on?

  2. A teammate proposes adding GraphQL because "REST is old". Using the territories and the recurring arguments, what are the first three questions you would ask before agreeing? Which argument (freshness, coordination, capacity, blast radius) does GraphQL actually affect?

  3. You are told a system is "eventually consistent". List the other concepts on the map that this single phrase quietly commits the system to: replication choices, cache behavior, what the client must tolerate. Then explain the user-visible consequence to a product manager in one sentence.
