CDN, Load Balancer & Reverse Proxy
LinkedIn Hook
Your app works great with 10 users. Then it gets featured on Product Hunt and suddenly 10,000 people hit it at once.
Three technologies decide whether your app survives or crashes:
- CDN — serves your static files from a server 5ms away from the user, not 200ms
- Load Balancer — spreads 10,000 requests across 10 servers instead of crushing 1
- Reverse Proxy — sits in front of everything, adds security, caching, and routing
Most developers use all three without knowing how they work. Interviewers always ask.
Lesson 8.5 — the final lesson in my Networking Interview Prep series — covers all three: what they do, how they differ, and how they work together in a production architecture.
Read the full lesson → [link]
#CDN #LoadBalancer #ReverseProxy #SystemDesign #NetworkingFundamentals #InterviewPrep #WebDevelopment
What You'll Learn
- What a CDN does and how edge servers reduce latency globally
- How a load balancer distributes traffic across multiple servers
- The most common load balancing algorithms and when to use each
- What a reverse proxy is and how it differs from a forward proxy
- How Nginx can act as both a reverse proxy and a load balancer
- How CDN, load balancer, and reverse proxy fit together in a real architecture
- What sticky sessions and session affinity are and when they matter
The Library System Analogy
Imagine you want a book. The main library has every book, but it is an hour away.
- CDN: The city opens small branch libraries in every neighborhood. Common books are copied and stocked there. You get the book in 5 minutes, not 60. The central library is only consulted for rare books or new editions.
- Load balancer: The central library has 10 librarians at the desk. Instead of everyone rushing to one librarian and waiting, a manager (the load balancer) directs each visitor to whichever librarian is currently least busy.
- Reverse proxy: The front desk receptionist. Visitors talk to the receptionist. The receptionist routes them to the right floor, checks their library cards, logs their visits, and handles common requests without bothering the librarians at all.
None of these are the librarians themselves — they are infrastructure around the librarians that makes the system work at scale.
CDN — Content Delivery Network
A CDN is a geographically distributed network of servers called edge nodes or PoPs (Points of Presence). CDNs store copies of your static content close to users worldwide.
How CDN Works
```
Without CDN:
User in Tokyo → request → Origin Server in New York (150ms RTT)

With CDN:
User in Tokyo → request → CDN Edge Node in Tokyo (2ms RTT)
    [cache hit : serve from local cache]
    [cache miss: fetch from New York, cache, serve]
```
Cache hit: The edge node has the file → serves instantly with no round-trip to origin. Cache miss: The edge node fetches from origin, caches it, then serves it. First user pays the origin latency; all subsequent users get it from the edge.
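The hit/miss flow above can be modeled with a toy edge cache. This is a sketch only: `EdgeCache`, its TTL handling, and the origin callback are illustrative, not a real CDN API.

```python
import time

class EdgeCache:
    """Toy model of a CDN edge node: cache responses by URL with a TTL."""

    def __init__(self, fetch_from_origin, ttl_seconds=300):
        self.fetch_from_origin = fetch_from_origin  # callable simulating the origin
        self.ttl = ttl_seconds
        self.store = {}            # url -> (body, cached_at)
        self.origin_fetches = 0

    def get(self, url):
        entry = self.store.get(url)
        if entry and time.time() - entry[1] < self.ttl:
            return entry[0], "HIT"           # cache hit: no round-trip to origin
        body = self.fetch_from_origin(url)   # cache miss: fetch from origin
        self.origin_fetches += 1
        self.store[url] = (body, time.time())
        return body, "MISS"

edge = EdgeCache(lambda url: f"contents of {url}")
print(edge.get("/app.js")[1])  # first user pays the origin latency -> MISS
print(edge.get("/app.js")[1])  # every subsequent user is served at the edge -> HIT
```

The first request populates the edge; all later requests for the same URL never leave it, which is exactly why the first user "pays" the origin latency.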
What CDNs Cache
Good candidates for CDN caching:
✅ Static assets: HTML, CSS, JavaScript bundles
✅ Images, fonts, videos
✅ Public API responses with long cache lifetimes
✅ Software downloads
NOT good for CDN:
❌ Authenticated user data (user-specific dashboard, private profiles)
❌ Frequently changing dynamic content (live stock prices)
❌ Responses that depend on request body (POST requests)
CDNs cache based on the URL and the Cache-Control header you set on the response.
```
# Long-lived: JS bundle with content hash in filename
Cache-Control: public, max-age=31536000, immutable
# → CDN caches for 1 year (safe because the hash changes when content changes)

# Short-lived: homepage HTML (changes with deployments)
Cache-Control: public, max-age=300
# → CDN caches for 5 minutes

# Not cached: user-specific API data
Cache-Control: private, no-store
# → CDN does not cache; every request goes to origin
```
CDN Cache Invalidation
After a deployment, you might need to clear CDN caches immediately rather than waiting for TTL expiry. This is "cache invalidation" — major CDN providers (Cloudflare, AWS CloudFront, Fastly) provide APIs to purge specific paths or all caches.
```shell
# Cloudflare API: purge a specific file (the purge_cache endpoint takes POST)
curl -X POST "https://api.cloudflare.com/client/v4/zones/{zone_id}/purge_cache" \
  -H "Authorization: Bearer {api_token}" \
  -H "Content-Type: application/json" \
  --data '{"files":["https://example.com/app.bundle.js"]}'
```
The professional pattern: name static assets with a content hash in the filename (app.a3b2c1.js). The hash changes whenever content changes, making the old URL unreachable and the new URL uncached — zero need for cache invalidation.
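A minimal sketch of that naming pattern, assuming a SHA-256 content digest. The `hashed_asset_name` helper is hypothetical, standing in for what bundlers such as webpack or Vite do automatically:

```python
import hashlib

def hashed_asset_name(filename: str, content: bytes, length: int = 8) -> str:
    """Build a cache-busting filename like app.a3b2c1d4.js from file content."""
    digest = hashlib.sha256(content).hexdigest()[:length]
    stem, _dot, ext = filename.rpartition(".")
    return f"{stem}.{digest}.{ext}"

v1 = hashed_asset_name("app.js", b"console.log('v1')")
v2 = hashed_asset_name("app.js", b"console.log('v2')")
# Different content -> different filename, so the old cached URL is
# simply never requested again; no purge call needed.
print(v1 != v2)  # True
```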
Load Balancer — Distributing Traffic
A load balancer sits in front of multiple server instances and distributes incoming requests across them. It solves two problems:
- Scalability: handle more traffic by adding more servers
- Availability: if one server fails, others keep serving
```
Without load balancer:
10,000 req/sec → Server A → crashes

With load balancer:
10,000 req/sec → Load Balancer → Server A (3,333 req/sec)
                               → Server B (3,333 req/sec)
                               → Server C (3,334 req/sec)
```
Load Balancing Algorithms
Round Robin — requests go to servers in rotation: A, B, C, A, B, C...
- Simple, works well when all servers have equal capacity
- Problem: a long-running request on server A does not reduce its future load
Weighted Round Robin — servers get traffic proportional to their weight
- Server A (weight 3), Server B (weight 1) → A gets 75%, B gets 25%
- Use when servers have different hardware capacity
Least Connections — next request goes to the server with the fewest active connections
- Better than round robin when requests vary wildly in processing time
- A server holding 3 long connections gets no new requests until load drops
IP Hash — hash the client's IP to always route to the same server
- Same client always hits same server (predictable routing)
- Used when server-side state (session data stored in memory) is needed
- Problem: uneven distribution if many clients share an IP (corporate NAT)
Random — randomly select a server
- Simple, surprisingly effective with enough servers (law of large numbers)
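The selection logic behind these algorithms is small enough to sketch in Python. These are illustrative helpers, not a real load balancer:

```python
import itertools
import zlib

servers = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]

# Round robin: hand requests to servers in strict rotation.
rr = itertools.cycle(servers)
print([next(rr) for _ in range(4)])
# ['10.0.0.1', '10.0.0.2', '10.0.0.3', '10.0.0.1']

# Least connections: pick the server with the fewest active connections.
def pick_least_connections(active):
    return min(active, key=active.get)

print(pick_least_connections({"10.0.0.1": 3, "10.0.0.2": 0, "10.0.0.3": 1}))
# 10.0.0.2

# IP hash: a stable hash of the client IP always maps to the same server
# (CRC32 here so the mapping survives process restarts).
def pick_by_ip_hash(client_ip, pool):
    return pool[zlib.crc32(client_ip.encode()) % len(pool)]

print(pick_by_ip_hash("203.0.113.7", servers) == pick_by_ip_hash("203.0.113.7", servers))
# True
```

Note the IP-hash weakness from the list above: every client behind one corporate NAT shares an IP, so they all land on the same server.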
```nginx
# Nginx load balancer configuration
upstream api_servers {
    # Round robin (default)
    server 10.0.0.1:3000;
    server 10.0.0.2:3000;
    server 10.0.0.3:3000;
}

upstream api_servers_weighted {
    # Weighted round robin
    server 10.0.0.1:3000 weight=3;  # gets 3x the traffic
    server 10.0.0.2:3000 weight=1;
}

upstream api_servers_least_conn {
    least_conn;  # least connections algorithm
    server 10.0.0.1:3000;
    server 10.0.0.2:3000;
}

upstream api_servers_ip_hash {
    ip_hash;  # sticky sessions by IP
    server 10.0.0.1:3000;
    server 10.0.0.2:3000;
}

server {
    listen 80;

    location /api/ {
        proxy_pass http://api_servers;
    }
}
```
Health Checks
Load balancers continuously probe backend servers to detect failures:
```nginx
upstream api_servers {
    # Open-source Nginx uses passive health checks: a server is marked
    # down after max_fails failed requests within fail_timeout
    server 10.0.0.1:3000 max_fails=3 fail_timeout=10s;
    server 10.0.0.2:3000 max_fails=3 fail_timeout=10s;

    # Nginx Plus / equivalent adds active probes:
    # health_check interval=5s fails=3 passes=2;
}
```
When a server fails health checks, the load balancer stops routing to it. When it recovers, it is added back. Users never see errors from a failed server (if other instances are healthy).
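The fails/passes bookkeeping behind active health checks can be sketched like this. The `HealthChecker` class is illustrative, not any load balancer's API; it mirrors the `fails=3 passes=2` semantics mentioned above:

```python
class HealthChecker:
    """Mark a server down after `fails` consecutive failed probes and
    back up after `passes` consecutive successful ones."""

    def __init__(self, fails=3, passes=2):
        self.fails, self.passes = fails, passes
        self.fail_streak = 0
        self.pass_streak = 0
        self.healthy = True

    def record(self, probe_ok: bool) -> bool:
        if probe_ok:
            self.pass_streak += 1
            self.fail_streak = 0
            if not self.healthy and self.pass_streak >= self.passes:
                self.healthy = True   # recovered: add back to rotation
        else:
            self.fail_streak += 1
            self.pass_streak = 0
            if self.healthy and self.fail_streak >= self.fails:
                self.healthy = False  # stop routing traffic here
        return self.healthy

hc = HealthChecker()
for _ in range(3):
    hc.record(False)       # three consecutive failed probes
print(hc.healthy)          # False -> removed from rotation
hc.record(True)
hc.record(True)
print(hc.healthy)          # True -> back in rotation
```

Requiring a streak in both directions prevents one flaky probe from flapping a server in and out of rotation.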
Sticky Sessions
Some applications store session state in memory (not in a shared database). If user A's session is on server #1, their next request must also go to server #1 — otherwise server #2 has no record of the session.
Sticky sessions (also called session affinity) bind a user's requests to the same server:
- Cookie-based: load balancer sets a cookie indicating which server to use
- IP hash: route based on client IP (less precise due to NAT)
User A → request 1 → Load Balancer → Server 1 (session created)
User A → request 2 → Load Balancer → Server 1 (same server — session found)
User A → request 3 → Load Balancer → Server 1 (always server 1)
Sticky sessions are a workaround, not a best practice. The proper solution is to store session data in a shared store (Redis, database) that all server instances can access, enabling any server to handle any request. Sticky sessions break when the pinned server goes down.
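A sketch of that shared-store alternative, with a plain dict standing in for Redis. The `handle_login`/`handle_request` handlers are hypothetical; in production the dict would be a store every instance can reach (e.g. Redis with a TTL per session):

```python
import uuid

# Stand-in for Redis: a store shared by all server instances.
shared_sessions = {}

def handle_login(user_id):
    """Any instance can create the session in the shared store."""
    session_id = str(uuid.uuid4())
    shared_sessions[session_id] = {"user_id": user_id}
    return session_id

def handle_request(session_id):
    """Any other instance can read it back, so no sticky routing is needed."""
    session = shared_sessions.get(session_id)
    return session["user_id"] if session else None

sid = handle_login("user-42")   # request 1 lands on server 1
print(handle_request(sid))      # request 2 lands on server 2: user-42
```

Because the session lives outside any one process, a server going down loses no sessions, which is exactly the failure mode sticky sessions cannot avoid.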
Reverse Proxy
A reverse proxy is a server that sits in front of one or more backend servers. Clients talk to the reverse proxy; they never communicate directly with backend servers.
```
          ┌─────────────────────────────────┐
Client ──→│          Reverse Proxy          │──→ Backend Server A
          │  (Nginx, Caddy, HAProxy, etc.)  │──→ Backend Server B
          └─────────────────────────────────┘
                  ↑ all clients see this
```
What a Reverse Proxy Does
SSL termination: The reverse proxy handles TLS. Backend servers receive plain HTTP. This centralizes certificate management — backend servers do not need certificates.
Caching: The proxy caches common responses. Repeated requests for the same resource are served from cache without hitting backend servers.
Compression: The proxy gzip/brotli compresses responses before sending to clients. Backends can send uncompressed data; compression happens once at the proxy.
Request routing: Route /api/ to Node.js servers, /static/ to Nginx's own static file server, /admin/ to a different backend entirely — all on the same domain.
Security: Backend IPs are never exposed to the internet. The proxy filters malicious requests before they reach application code. Rate limiting, IP allowlisting, and WAF rules live here.
Logging and observability: All incoming requests are logged in one place, regardless of which backend handled them.
```nginx
# Nginx as reverse proxy — production configuration
server {
    listen 443 ssl http2;
    server_name example.com;

    # SSL termination (backend servers get plain HTTP)
    ssl_certificate     /etc/letsencrypt/live/example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/example.com/privkey.pem;

    # Gzip compression (backend doesn't need to compress)
    gzip on;
    gzip_types text/plain application/json text/css application/javascript;

    # Static files served directly (no backend hit)
    location /static/ {
        root /var/www;
        expires 1y;
        add_header Cache-Control "public, immutable";
    }

    # API requests forwarded to Node.js
    location /api/ {
        proxy_pass http://api_servers;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }

    # WebSocket support
    location /ws/ {
        proxy_pass http://api_servers;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "Upgrade";
    }
}
```
Forward Proxy vs Reverse Proxy
The naming confuses almost everyone. Here is the clearest distinction:
| | Forward Proxy | Reverse Proxy |
|---|---|---|
| Who configures it | The client | The server owner |
| Who it serves | Clients (hides the client from the server) | Servers (hides the server from the client) |
| Client awareness | Client knows about it (configures it) | Client does not know it exists |
| Examples | VPNs, Squid proxy, corporate firewalls | Nginx, Caddy, HAProxy, Cloudflare |
| Use case | Privacy, content filtering, bypassing geo-blocks | Load balancing, SSL termination, caching |
How They Work Together — A Real Architecture
In production, CDN, load balancer, and reverse proxy stack in layers:
```
User's Browser
      ↓
CDN Edge Node (Cloudflare / CloudFront)
      [Cache hit  → serve static content instantly]
      [Cache miss → forward to origin]
      ↓
Load Balancer (AWS ALB / Nginx / HAProxy)
      [Distributes across server instances]
      [Health checks, SSL termination]
      ↓
Reverse Proxy on each instance (Nginx)
      [Routes /api → Node.js, /static → filesystem]
      [Compression, request logging]
      ↓
Application Server (Node.js / Django / Rails)
      ↓
Database (PostgreSQL / Redis)
```
Request flow for a dynamic API call:
1. User requests `https://example.com/api/users`
2. The CDN sees `/api/` — not cacheable, so it passes the request through to origin
3. The load balancer receives the request and picks the least-loaded application server
4. Nginx on that server receives plain HTTP (TLS was terminated at the load balancer) and routes it to the Node.js process
5. Node.js queries the database and assembles the response
6. The response travels back through Nginx → load balancer → CDN → user
Request flow for a static asset:
1. User requests `https://example.com/static/app.a3b2c1.js`
2. The CDN has this file in its edge cache → returns it instantly with an `Age: 3421` header
3. The origin is never involved
Common Mistakes
- Putting dynamic, user-specific content behind a CDN without `Cache-Control: private`. If a CDN caches a response containing user A's private data and user B requests the same URL, user B gets user A's data. Always set `Cache-Control: private` or `no-store` for authenticated or personalized content. Never let a CDN cache responses with session cookies or user-specific data.
- Using sticky sessions as a long-term session strategy. Sticky sessions mean one server going down causes session loss for all of its users. The correct fix is external session storage (Redis) so any server instance can reconstruct any session. Sticky sessions are acceptable for short-lived state during a migration, not as a permanent architecture choice.
- Forgetting `X-Forwarded-For` behind a reverse proxy. When Nginx sits in front of Node.js, the application sees the proxy's IP, not the user's. To preserve the real client IP, Nginx must set `X-Forwarded-For` and the application must read it. Without this, IP-based rate limiting, geo-filtering, and security logging all use the wrong IP.
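Reading `X-Forwarded-For` safely deserves a sketch. The header is a comma-separated list that each hop appends to, and a client can spoof the leftmost entries, so a careful reader only trusts it when the direct peer is a known proxy and walks from the right past its own proxies. The helper below is illustrative, not a library API:

```python
def client_ip_from_headers(headers, remote_addr, trusted_proxies):
    """Recover the real client IP behind a reverse proxy.

    X-Forwarded-For looks like 'client, proxy1, proxy2': each hop appends
    the address it saw. Entries the client sent itself are spoofable, so
    start from the right and skip addresses belonging to our own proxies.
    """
    if remote_addr not in trusted_proxies:
        return remote_addr  # direct connection: the header cannot be trusted
    hops = [h.strip() for h in headers.get("X-Forwarded-For", "").split(",") if h.strip()]
    for ip in reversed(hops):
        if ip not in trusted_proxies:
            return ip       # first non-proxy address from the right
    return remote_addr

headers = {"X-Forwarded-For": "203.0.113.50, 10.0.0.5"}
print(client_ip_from_headers(headers, "10.0.0.5", {"10.0.0.5"}))  # 203.0.113.50
```

Frameworks expose the same idea as a "trusted proxy" setting (for example Express's `trust proxy`); the point is that the raw header must never be taken at face value.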
Interview Questions
Q: What is the difference between a CDN and a reverse proxy?
A CDN is a geographically distributed cache network — its primary value is geographic proximity. By storing copies of content in edge nodes around the world, CDNs reduce the physical distance (and thus latency) between users and content. A reverse proxy is a local intermediary that sits in front of one or more backend servers. It handles SSL termination, request routing, caching, and compression — but its value is about abstracting and protecting the backend, not geographic distribution. A CDN operates globally at the network edge; a reverse proxy operates locally within your infrastructure. In practice, a CDN provider like Cloudflare also acts as a reverse proxy for your entire origin.
Q: What are the main load balancing algorithms and when do you choose each?
Round robin: requests rotate across servers in order — good when all servers are equivalent. Weighted round robin: servers receive traffic proportional to their weight — good when servers have different capacities. Least connections: new requests go to the server with fewest active connections — best when request duration varies widely (some requests take seconds, others milliseconds). IP hash: the client's IP maps to a fixed server — gives sticky sessions by IP but distributes unevenly if many users share an IP (corporate NAT). The most commonly recommended general-purpose algorithm is least connections, as it adapts to actual server load rather than assuming all requests are equal.
Q: What is SSL termination and why do reverse proxies do it?
SSL termination means the reverse proxy decrypts the incoming HTTPS connection and forwards plain HTTP to backend servers. This centralizes TLS certificate management — only the reverse proxy needs a certificate; all backend servers communicate unencrypted within the private network. It also offloads the computational overhead of TLS encryption/decryption from application servers. The risk: if the internal network between the proxy and backends is untrusted, the unencrypted traffic could be intercepted. In high-security environments, "SSL passthrough" or end-to-end TLS is used instead.
Q: How does a CDN decide whether to serve from cache or go to origin?
CDNs cache based on the URL and HTTP response headers. On each request, the edge node checks its cache for a stored response matching the URL. A cache hit is served immediately. A cache miss fetches from origin, stores the response (if `Cache-Control` permits caching), and serves it. Cache freshness is controlled by `Cache-Control: max-age` or `Expires` headers from the origin. `Cache-Control: private` or `no-store` instructs the CDN not to cache. Custom cache rules at the CDN level can override origin headers — for example, forcing a long TTL on all `.js` files regardless of what the origin says.
Quick Reference — Cheat Sheet
The Three Infrastructure Pieces
| | CDN | Load Balancer | Reverse Proxy |
|---|---|---|---|
| Primary purpose | Geographic distribution | Traffic distribution | Backend abstraction |
| Lives | At the network edge, globally | In front of the server fleet | In front of each server / fleet |
| Caches content | Yes (main feature) | No | Optional (Nginx can) |
| SSL termination | Yes | Yes | Yes |
| Health checks | Can probe the origin | Yes | Not typically |
| Examples | Cloudflare, CloudFront, Fastly | AWS ALB/ELB, HAProxy, Nginx | Nginx, Caddy, Traefik |
| Best for | Static assets, global reach | Horizontal scaling | Routing, security, TLS |
Load Balancing Algorithm Decision Guide
```
All servers equal capacity?
  YES → Round Robin
  NO  → Weighted Round Robin

Requests vary in duration?
  YES → Least Connections
  NO  → Round Robin

Need same user → same server?
  YES → IP Hash (or cookie-based sticky)
  NO  → Any other algorithm (preferred for reliability)
```
Cache-Control for CDN
| Directive | CDN behavior |
|---|---|
| `public, max-age=N` | CDN caches for N seconds |
| `private` | CDN must not cache |
| `no-store` | CDN must not cache |
| `no-cache` | CDN can cache but must revalidate |
| `s-maxage=N` | Shared-cache (CDN) TTL; overrides max-age for CDNs |
| `immutable` | CDN can serve without revalidation for the max-age duration |
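These directives can be turned into a small decision function. A simplified sketch: real CDNs also honor `Expires`, `Vary`, and their own override rules, which this ignores:

```python
def cdn_cache_ttl(cache_control: str):
    """Return the CDN (shared cache) TTL in seconds, or None if not cacheable.

    private/no-store forbid shared caching, and s-maxage takes
    precedence over max-age for shared caches like CDNs.
    """
    directives = {}
    for part in cache_control.split(","):
        key, _, value = part.strip().partition("=")
        directives[key.lower()] = value
    if "private" in directives or "no-store" in directives:
        return None
    if "s-maxage" in directives:
        return int(directives["s-maxage"])
    if "max-age" in directives:
        return int(directives["max-age"])
    return None  # no explicit freshness: treat as not cacheable in this sketch

print(cdn_cache_ttl("public, max-age=31536000, immutable"))  # 31536000
print(cdn_cache_ttl("public, max-age=300, s-maxage=60"))     # 60
print(cdn_cache_ttl("private, no-store"))                    # None
```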
Production Architecture Pattern
```
[User] → [CDN Edge] → [Load Balancer] → [Nginx Reverse Proxy] → [App Server] → [DB]
            cache        distribute       route/SSL/compress     business logic

Static asset path:  [User] → [CDN Edge] ✓ cache hit → done
Dynamic data path:  [User] → [CDN Edge] → miss → [LB] → [Nginx] → [App] → [DB]
```
Previous: Lesson 8.4 → Next: Course Complete — Congratulations! 🎉
This is Lesson 8.5 of the Networking Interview Prep Course — 8 chapters, 32 lessons.