Choosing the right caching layer -- browser, CDN, application server, or database -- determines caching effectiveness. Each layer offers different latency, capacity, invalidation characteristics, and scope. The best architectures cache data close to where it is consumed with an appropriate invalidation strategy.
The question of where to cache is as important as whether to cache. Data can be cached at multiple points along the request path, from the user's browser to the database's internal buffer pool. Each caching layer offers different characteristics: latency (how fast a cache hit is), capacity (how much data can be stored), scope (who shares the cache), invalidation (how stale data is removed), and complexity (how hard it is to implement and operate). Placing the cache at the wrong layer can be ineffective or counterproductive -- for example, adding an application-server cache when the real bottleneck is inside the database, or caching at the CDN when the data is user-specific.
Browser caching is the closest cache to the user, with zero network latency on a hit. HTTP cache-control headers (max-age, no-cache, must-revalidate, stale-while-revalidate) give the server fine-grained control over what browsers cache and for how long. Service workers extend browser caching with programmable logic: caching API responses, implementing offline-first strategies, and pre-caching resources during idle time. Conditional requests using ETags and If-None-Match allow the browser to validate cached content without re-downloading it -- the server returns 304 Not Modified if the content has not changed, saving bandwidth. Browser caching is extremely effective for static assets (CSS, JS, images) and semi-static content (user preferences, configuration), but it is per-user and cannot be centrally invalidated without deploying new URLs (cache-busting via content hashes in filenames).
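The conditional-request flow described above can be sketched from the server's side. This is a minimal illustration using only the standard library, not any particular framework's API: the function names `make_etag` and `respond`, the truncated SHA-256 validator, and the 60-second `max-age` policy are all assumptions made for the example.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive a validator from the response body (ETags are quoted strings)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, request_headers: dict[str, str]) -> tuple[int, dict[str, str], bytes]:
    """Return (status, headers, body) for a GET, honouring If-None-Match."""
    etag = make_etag(body)
    headers = {
        "ETag": etag,
        # Assumed policy: reuse for 60 s without asking, then revalidate.
        "Cache-Control": "max-age=60, must-revalidate",
    }
    # If-None-Match may carry several comma-separated validators; a weak
    # prefix (W/) is ignored for this comparison.
    candidates = [
        token.strip().removeprefix("W/")
        for token in request_headers.get("If-None-Match", "").split(",")
    ]
    if etag in candidates or "*" in candidates:
        # The client's copy is still current: send headers only, no body.
        return 304, headers, b""
    return 200, headers, body
```

On the first request the client receives a 200 with an `ETag`; when it later sends that value back in `If-None-Match` and the body is unchanged, the function returns 304 with an empty body, which is where the bandwidth saving comes from.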
CDN caching sits between users and the origin server, serving cached content from geographically distributed edge nodes. CDNs excel at caching static assets and public API responses that are identical for all users. The Vary header allows CDNs to cache different versions based on request headers (Accept-Encoding, Accept-Language), enabling efficient caching of content that varies by client capability. CDN caching is powerful but invalidation is slow -- cache purge requests take seconds to minutes to propagate across all edge nodes.
Application-level caching using Redis or Memcached stores computed results, aggregated data, and expensive query results close to the application logic. This is the most flexible caching layer because the application controls what is cached, how it is keyed, and when it is invalidated. Redis caching is appropriate for session state, rate limiting counters, feature flags, precomputed feeds, and any data that is expensive to compute from the database.
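To make "the application controls what is cached, how it is keyed, and when it is invalidated" concrete, here is a small cache-aside sketch. It assumes an in-memory `TTLCache` class standing in for Redis or Memcached so that it runs without a server; `get_product`, the `product:<id>` key format, and the 30-second TTL are hypothetical choices for the example, not part of any real client library.

```python
import time
from typing import Callable, Optional


class TTLCache:
    """In-memory stand-in for Redis/Memcached: get, set with TTL, delete."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._items: dict[str, tuple[float, object]] = {}

    def get(self, key: str) -> Optional[object]:
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Entry outlived its TTL: drop it and report a miss.
            del self._items[key]
            return None
        return value

    def set(self, key: str, value: object, ttl_seconds: float) -> None:
        self._items[key] = (self._clock() + ttl_seconds, value)

    def delete(self, key: str) -> None:
        self._items.pop(key, None)


def get_product(cache: TTLCache, product_id: int,
                load_from_db: Callable[[int], dict],
                ttl_seconds: float = 30.0) -> dict:
    """Cache-aside read: try the cache, fall back to the database, then fill."""
    key = f"product:{product_id}"  # the application decides the key format
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = load_from_db(product_id)
    cache.set(key, value, ttl_seconds)
    return value
```

A write path would typically call `cache.delete(f"product:{product_id}")` after updating the database, which is the explicit invalidation control that browser and CDN layers lack.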
Database-level caching includes the buffer pool (InnoDB, PostgreSQL shared_buffers), which automatically caches frequently accessed data pages in memory, and the query cache (MySQL, now removed). The buffer pool is transparent to the application -- it automatically caches hot data pages and evicts cold ones using an LRU-style policy. The MySQL query cache, which cached exact SQL query results, was deprecated in MySQL 5.7.20 and removed in MySQL 8.0 because its invalidation overhead (invalidating all cached results for a table on any write to that table) made it counterproductive for write-heavy workloads. PostgreSQL's shared_buffers and the OS page cache together form an effective transparent caching layer that requires no application changes.
The decision framework for where to cache follows a principle: cache data as close to the consumer as possible (browser > CDN > application > database), using the appropriate invalidation strategy for each layer.
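The buffer pool's "keep hot pages, evict cold ones" behaviour can be illustrated with a plain LRU cache. This is a deliberate simplification: real engines use refinements of LRU rather than the textbook version, and the `LRUPageCache` class and its `read_page` callback are hypothetical names introduced for this sketch.

```python
from collections import OrderedDict
from typing import Callable


class LRUPageCache:
    """Toy buffer pool: holds at most `capacity` pages, evicts the least recently used."""

    def __init__(self, capacity: int, read_page: Callable[[int], bytes]) -> None:
        self.capacity = capacity
        self.read_page = read_page  # stands in for a disk read
        self.pages: OrderedDict[int, bytes] = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, page_id: int) -> bytes:
        if page_id in self.pages:
            self.hits += 1
            # Mark the page as most recently used.
            self.pages.move_to_end(page_id)
            return self.pages[page_id]
        self.misses += 1
        data = self.read_page(page_id)
        self.pages[page_id] = data
        if len(self.pages) > self.capacity:
            # Evict the page at the cold end of the ordering.
            self.pages.popitem(last=False)
        return data
```

The caller only ever asks for a page by id; whether the answer came from memory or from `read_page` is invisible to it, which is what "transparent to the application" means here.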
The Grocery Shopping Analogy
Think of the request path like grocery shopping. Your pantry at home (browser cache) has ingredients you use every day -- instant access, no trip needed. The local convenience store (CDN) has common items within walking distance -- quick access for popular products. The full grocery store across town (application cache/Redis) has everything in stock -- wider selection but requires a drive. The wholesale warehouse (database) has bulk supplies -- largest selection but farthest away. You stock your pantry with daily essentials, the convenience store carries neighborhood favorites, and you only drive to the grocery store or warehouse for items not available closer to home.
Shopify
Shopify caches at every layer of the stack. Nginx microcaching (1-5 seconds) handles burst traffic for storefront pages. Redis caches computed product data, inventory counts, and cart state. Memcached stores session data and rate limiting state. A global CDN (Cloudflare + Fastly) caches static assets and full storefront pages with merchant-controlled cache-control headers. This multi-layer approach allows Shopify to handle flash sales (100x traffic spikes) without merchant infrastructure changes.
Twitter
Twitter caches at every tier. Mobile clients cache timelines locally for offline access and instant rendering. A CDN caches media (images, videos) and public profile data. Application servers use Redis clusters to cache precomputed timelines (fanout-on-write), user graph data, and tweet metadata. The database layer uses InnoDB buffer pools and read replicas. This comprehensive caching strategy enables Twitter to serve 500+ million tweets per day with sub-200ms timeline load times.
GitHub
GitHub uses conditional requests with ETags extensively. When a client requests a repository page, GitHub includes an ETag header. On subsequent requests, the client sends If-None-Match with the ETag. If the content has not changed, GitHub returns 304 Not Modified (no body, minimal bandwidth). For dynamic data like repository metadata and commit history, GitHub uses Redis for computed caches with cache-aside pattern. Static assets use CDN caching with content-hashed URLs for cache busting.
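Content-hashed URLs, mentioned above for static assets, can be sketched as a build-step helper. The function name `hashed_asset_name`, the 8-character digest length, and the `name.<hash>.ext` layout are assumptions for illustration; real bundlers have their own naming schemes.

```python
import hashlib
from pathlib import PurePosixPath


def hashed_asset_name(path: str, content: bytes, digest_len: int = 8) -> str:
    """Insert a short content digest before the file extension.

    A changed file gets a new name, so it can be served with a very long
    max-age: stale copies are never requested again under the new URL.
    """
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    p = PurePosixPath(path)
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))
```

For example, `hashed_asset_name("static/app.js", source_bytes)` yields a path of the form `static/app.<digest>.js`, and any change to `source_bytes` produces a different digest and therefore a different URL.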
| Aspect | Description |
|---|---|
| Proximity to User vs Invalidation Control | Browser and CDN caches are closest to the user (lowest latency) but hardest to invalidate. Once content is in a browser cache, the server cannot force a refresh until the TTL expires or the URL changes. Application and database caches are farther from the user but offer immediate invalidation control. The trade-off is latency vs freshness. |
| Transparency vs Flexibility | Database buffer pools and ORM caches are transparent -- they require no application code changes and automatically cache hot data. Application caches (Redis) require explicit cache management code but offer complete control over what is cached, how it is keyed, and when it is invalidated. Transparent caching is simpler; explicit caching is more powerful. |
| Per-User vs Shared Caching | Browser caches are per-user and cannot serve data to other users. CDN and application caches are shared, serving cached data to all users who request the same resource. Per-user caching is effective for personalized content but duplicates storage. Shared caching is memory-efficient but cannot cache user-specific data without careful keying (user_id prefix). |
| Static vs Dynamic Content Cacheability | Static content (CSS, JS, images) is trivially cacheable at every layer with long TTLs and content-hash cache busting. Dynamic content (API responses, personalized pages) requires careful cache key design, shorter TTLs, and often cache-aside or write-through patterns. Attempting to cache dynamic content like static content leads to stale data bugs; under-caching dynamic content wastes database resources. |
Shopify's Multi-Layer Caching for Flash Sale Resilience
Scenario
Shopify powers over 2 million online stores, and flash sales (limited-time promotional events) can create 100x traffic spikes within seconds. A merchant announcing a sale on social media can drive millions of visitors to their store simultaneously. Without comprehensive caching, these spikes would overwhelm the shared infrastructure that serves all merchants, causing collateral damage to uninvolved stores.
Solution
Shopify implemented caching at every layer. At the CDN layer, Cloudflare and Fastly cache entire storefront pages for anonymous users with 5-second TTLs and stale-while-revalidate for 60 seconds. At the reverse proxy layer, Nginx microcaching caches responses for 1-5 seconds, absorbing request bursts that exceed CDN capacity. At the application layer, Redis caches product data, inventory counts, and pricing with per-merchant TTLs. At the database layer, read replicas and InnoDB buffer pools handle the remaining query load. Each layer independently handles its portion of the traffic, with cache misses at one layer caught by the next.
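The CDN policy described above (a 5-second TTL with a 60-second stale-while-revalidate window) can be sketched as a header builder plus the decision a cache makes for a stored response of a given age. This is an illustrative model rather than Shopify's or any CDN's actual code; `cache_control`, `classify`, and the `Freshness` enum are hypothetical names, and the `public` directive is an assumption.

```python
from enum import Enum


class Freshness(Enum):
    FRESH = "serve from cache"
    STALE_REVALIDATE = "serve stale copy, refresh in background"
    EXPIRED = "fetch from origin before responding"


def cache_control(max_age: int, swr: int) -> str:
    """Build the response header value the origin would send."""
    return f"public, max-age={max_age}, stale-while-revalidate={swr}"


def classify(age_seconds: float, max_age: int = 5, swr: int = 60) -> Freshness:
    """Decide how a cache may use a stored response of the given age."""
    if age_seconds < max_age:
        return Freshness.FRESH
    if age_seconds < max_age + swr:
        # Still inside the stale window: answer immediately from cache and
        # refresh the entry asynchronously, so no user waits on the origin.
        return Freshness.STALE_REVALIDATE
    return Freshness.EXPIRED
```

Under these defaults, a page cached 30 seconds ago is still served instantly while one background request refreshes it, which is why a short TTL does not translate into a stampede on the origin during a spike.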
Outcome
Shopify handles flash sales serving 80,000+ requests per second per merchant without advance provisioning. The CDN layer absorbs 85-95% of traffic (static assets and full-page cache for anonymous users). Nginx microcaching catches 60-80% of the remaining dynamic requests. Redis handles most application-level data lookups. The actual database query load during a flash sale is only 2-5x normal (instead of 100x) because each caching layer peels off its share. Other merchants on the platform experience zero impact during another merchant's flash sale.
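The way each layer "peels off its share" is simple multiplication of miss rates. The helper below is a back-of-the-envelope sketch: the function name is hypothetical, and it assumes the quoted hit rates apply uniformly to all requests, which real traffic will not do exactly.

```python
def requests_reaching_app(edge_rps: float, cdn_hit: float, proxy_hit: float) -> float:
    """Requests per second that miss both the CDN and the reverse-proxy cache."""
    after_cdn = edge_rps * (1.0 - cdn_hit)   # CDN misses fall through to Nginx
    return after_cdn * (1.0 - proxy_hit)     # Nginx misses reach the application


# Worst case from the quoted ranges: 85% CDN hits, 60% microcache hits.
worst = requests_reaching_app(80_000, cdn_hit=0.85, proxy_hit=0.60)
# Best case from the quoted ranges: 95% CDN hits, 80% microcache hits.
best = requests_reaching_app(80_000, cdn_hit=0.95, proxy_hit=0.80)
```

With the figures above, roughly 800 to 4,800 of the 80,000 requests per second reach the application layer, about 1% to 6% of edge traffic, before Redis absorbs most of the remaining lookups.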