Where to Cache: Client, CDN, Server, Database

Choosing the right caching layer -- browser, CDN, application server, or database -- determines caching effectiveness. Each layer offers different latency, capacity, invalidation characteristics, and scope. The best architectures cache data close to where it is consumed with an appropriate invalidation strategy.

Overview

The question of where to cache is as important as whether to cache. Data can be cached at multiple points along the request path, from the user's browser to the database's internal buffer pool. Each caching layer offers different characteristics: latency (how fast is a cache hit), capacity (how much data can be stored), scope (who shares the cache), invalidation (how stale data is removed), and complexity (how hard is it to implement and operate). Placing the cache at the wrong layer can be ineffective or counterproductive -- caching at the application server when the bottleneck is the database query cache, or caching at the CDN when the data is user-specific.

Browser caching is the closest cache to the user, with zero network latency on a hit. HTTP cache-control headers (max-age, no-cache, must-revalidate, stale-while-revalidate) give the server fine-grained control over what browsers cache and for how long. Service workers extend browser caching with programmable logic: caching API responses, implementing offline-first strategies, and pre-caching resources during idle time. Conditional requests using ETags and If-None-Match allow the browser to validate cached content without re-downloading it -- the server returns 304 Not Modified if the content has not changed, saving bandwidth. Browser caching is extremely effective for static assets (CSS, JS, images) and semi-static content (user preferences, configuration), but it is per-user and cannot be centrally invalidated without deploying new URLs (cache-busting via content hashes in filenames).
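
The conditional-request flow described above can be sketched in a few lines. This is a minimal illustration, not a production handler: the function names `make_etag` and `respond` are hypothetical, the ETag is assumed to be a truncated SHA-256 of the body, and weak validators (`W/"..."`) and the `*` wildcard are deliberately not handled.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    """Derive a quoted ETag from the response body (assumed scheme: SHA-256 prefix)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, if_none_match: Optional[str] = None) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload) for a GET on a cacheable resource."""
    etag = make_etag(body)
    # "no-cache" lets the browser store the response but forces revalidation before reuse.
    headers = {"ETag": etag, "Cache-Control": "no-cache"}
    if if_none_match is not None:
        # If-None-Match may carry several comma-separated ETags.
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        if etag in candidates:
            # Content unchanged: send headers only, no body.
            return 304, headers, b""
    return 200, headers, body
```

A first call such as `respond(b"hello")` returns 200 with an `ETag` header; repeating the call with that ETag as `if_none_match` returns 304 and an empty payload, which is where the bandwidth saving comes from.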

CDN caching sits between users and the origin server, serving cached content from geographically distributed edge nodes. CDNs excel at caching static assets and public API responses that are identical for all users. The Vary header allows CDNs to cache different versions based on request headers (Accept-Encoding, Accept-Language), enabling efficient caching of content that varies by client capability. CDN caching is powerful but invalidation is slow -- cache purge requests take seconds to minutes to propagate across all edge nodes.

Application-level caching using Redis or Memcached stores computed results, aggregated data, and expensive query results close to the application logic. This is the most flexible caching layer because the application controls what is cached, how it is keyed, and when it is invalidated. Redis caching is appropriate for session state, rate limiting counters, feature flags, precomputed feeds, and any data that is expensive to compute from the database.
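
The application-level control over keying, TTL, and invalidation can be illustrated with a cache-aside sketch. To keep the example self-contained, a small in-memory `TTLCache` class stands in for Redis or Memcached; the class, the `product:<id>` key format, and the `load_from_db` / `write_to_db` callables are all assumptions made for illustration.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """In-memory stand-in for Redis/Memcached: get, set with TTL, delete."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._data: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Any:
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._data[key]  # lazily drop expired entries
            return None
        return value

    def set(self, key: str, value: Any, ttl_seconds: float) -> None:
        self._data[key] = (self._clock() + ttl_seconds, value)

    def delete(self, key: str) -> None:
        self._data.pop(key, None)


def get_product(cache: TTLCache, product_id: int,
                load_from_db: Callable[[int], Any], ttl_seconds: float = 60.0) -> Any:
    """Cache-aside read: try the cache, fall back to the database, then populate."""
    key = f"product:{product_id}"
    cached = cache.get(key)
    if cached is not None:  # note: a stored None is indistinguishable from a miss here
        return cached
    value = load_from_db(product_id)
    cache.set(key, value, ttl_seconds)
    return value


def update_product(cache: TTLCache, product_id: int, new_value: Any,
                   write_to_db: Callable[[int, Any], None]) -> None:
    """Write to the database first, then invalidate so the next read repopulates."""
    write_to_db(product_id, new_value)
    cache.delete(f"product:{product_id}")
```

Deleting the key on write, rather than overwriting it, is one common choice: it keeps the write path simple and lets the TTL bound staleness if an invalidation is ever missed.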

Database-level caching includes the buffer pool (InnoDB's buffer pool, PostgreSQL's shared_buffers), which automatically caches frequently accessed data pages in memory, and the query cache (MySQL, now removed). The buffer pool is transparent to the application -- it keeps hot data pages in memory and evicts cold ones using an LRU-style policy. The MySQL query cache, which cached exact SQL query results, was deprecated in MySQL 5.7.20 and removed in MySQL 8.0 because its invalidation overhead (invalidating all cached results for a table on any write to that table) made it counterproductive for write-heavy workloads. PostgreSQL's shared_buffers and the OS page cache together form an effective transparent caching layer that requires no application changes. The decision framework for where to cache follows a principle: cache data as close to the consumer as possible (browser > CDN > application > database), using the appropriate invalidation strategy for each layer.
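
The page-level behavior of a buffer pool can be approximated with a plain LRU cache. The sketch below is a simplification for intuition only: real engines use refined variants (InnoDB uses a midpoint-insertion LRU, PostgreSQL a clock-sweep algorithm), and the class name `LRUPageCache` and the `read_page_from_disk` callable are hypothetical.

```python
from collections import OrderedDict
from typing import Any, Callable, List


class LRUPageCache:
    """Toy buffer pool: keeps at most `capacity` pages, evicting the least recently used."""

    def __init__(self, capacity: int, read_page_from_disk: Callable[[int], Any]) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._read = read_page_from_disk
        self._pages: "OrderedDict[int, Any]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get_page(self, page_id: int) -> Any:
        if page_id in self._pages:
            self._pages.move_to_end(page_id)  # mark as most recently used
            self.hits += 1
            return self._pages[page_id]
        self.misses += 1
        page = self._read(page_id)  # the expensive path: disk I/O
        self._pages[page_id] = page
        if len(self._pages) > self.capacity:
            self._pages.popitem(last=False)  # evict the coldest page
        return page

    def cached_page_ids(self) -> List[int]:
        """Page ids from least to most recently used."""
        return list(self._pages)
```

With a capacity of two pages, the access sequence 1, 2, 1, 3 ends with pages 1 and 3 cached: page 2 is evicted because page 1 was touched more recently. This is why sizing the pool to hold the working set matters so much in practice.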

Key Points
  • 1. Browser cache: zero-latency hits via HTTP cache-control headers. Best for static assets (JS, CSS, images) and semi-static content. Invalidation via cache-busting (content-hashed filenames). Service workers enable programmable caching strategies and offline support.
  • 2. CDN cache: geographically distributed edge caching for static assets and public API responses. Vary header enables per-client-variant caching. Invalidation is slow (seconds to minutes for global purge). Best for content served to many users with infrequent changes.
  • 3. Application cache (Redis/Memcached): the most flexible layer. Caches computed results, aggregated data, session state, and expensive query outputs. Application controls keying, TTL, and invalidation. Network latency (0.5-2ms) is the trade-off for flexibility.
  • 4. ORM/query cache (Hibernate L2, prepared statements): caches entity objects or query results within the application framework, reducing database round trips for repeated reads. Prepared-statement caches are related but store parsed statements and plans rather than results. Hibernate L2 cache supports write-through and read-through patterns transparently.
  • 5. Database buffer pool (InnoDB buffer pool, PostgreSQL shared_buffers): automatically caches frequently accessed data pages in memory. Transparent to the application, requires no code changes. Tuning buffer pool size is one of the most impactful database performance optimizations.
  • 6. Decision framework: cache what is expensive to compute or fetch, place the cache close to where data is consumed, and match the invalidation strategy to the data's change frequency and consistency requirements.
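
One way to make the decision framework concrete is to map each kind of response to a Cache-Control policy. The sketch below is illustrative only: the category names are hypothetical, and the TTL values are assumptions that should be tuned to how often the data changes. The directives themselves (`public`, `private`, `max-age`, `s-maxage`, `immutable`, `stale-while-revalidate`, `no-cache`, `no-store`) are standard HTTP.

```python
# Hypothetical response categories mapped to Cache-Control values.
CACHE_POLICIES = {
    # Content-hashed URL: safe to cache for a year in browsers and CDNs.
    "hashed_asset": "public, max-age=31536000, immutable",
    # Same for every anonymous visitor: short shared-cache TTL, serve stale while refreshing.
    "public_page": "public, s-maxage=5, stale-while-revalidate=60",
    # Personalized: browser may store it, shared caches must not; revalidate before reuse.
    "user_specific": "private, no-cache",
    # Sensitive data: never written to any cache.
    "sensitive": "no-store",
}


def cache_control_for(kind: str) -> str:
    """Return the Cache-Control header value for a response category."""
    try:
        return CACHE_POLICIES[kind]
    except KeyError:
        raise ValueError(f"unknown response category: {kind!r}") from None
```

Keeping the policy in one table makes the trade-off explicit: the closer to the user a response may be cached, the more its invalidation has to rely on TTLs or URL changes.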
Simple Example

The Grocery Shopping Analogy

Think of the request path like grocery shopping. Your pantry at home (browser cache) has ingredients you use every day -- instant access, no trip needed. The local convenience store (CDN) has common items within walking distance -- quick access for popular products. The full grocery store across town (application cache/Redis) has everything in stock -- wider selection but requires a drive. The wholesale warehouse (database) has bulk supplies -- largest selection but farthest away. You stock your pantry with daily essentials, the convenience store carries neighborhood favorites, and you only drive to the grocery store or warehouse for items not available closer to home.

Real-World Examples

Shopify

Shopify caches at every layer of the stack. Nginx microcaching (1-5 seconds) handles burst traffic for storefront pages. Redis caches computed product data, inventory counts, and cart state. Memcached stores session data and rate limiting state. A global CDN (Cloudflare + Fastly) caches static assets and full storefront pages with merchant-controlled cache-control headers. This multi-layer approach allows Shopify to handle flash sales (100x traffic spikes) without merchant infrastructure changes.

Twitter

Twitter caches at every tier: mobile clients cache timelines locally for offline access and instant rendering. A CDN caches media (images, videos) and public profile data. Application servers use Redis clusters to cache precomputed timelines (built via fanout-on-write), user graph data, and tweet metadata. The database layer uses InnoDB buffer pools and read replicas. This comprehensive caching strategy enables Twitter to handle 500+ million tweets per day with sub-200ms timeline load times.

GitHub

GitHub uses conditional requests with ETags extensively. When a client requests a repository page, GitHub includes an ETag header. On subsequent requests, the client sends If-None-Match with the ETag. If the content has not changed, GitHub returns 304 Not Modified (no body, minimal bandwidth). For dynamic data like repository metadata and commit history, GitHub uses Redis for computed caches with cache-aside pattern. Static assets use CDN caching with content-hashed URLs for cache busting.

Trade-Offs
Aspect | Description
Proximity to User vs Invalidation Control | Browser and CDN caches are closest to the user (lowest latency) but hardest to invalidate. Once content is in a browser cache, the server cannot force a refresh until the TTL expires or the URL changes. Application and database caches are farther from the user but offer immediate invalidation control. The trade-off is latency vs freshness.
Transparency vs Flexibility | Database buffer pools and ORM caches are transparent -- they require no application code changes and automatically cache hot data. Application caches (Redis) require explicit cache management code but offer complete control over what is cached, how it is keyed, and when it is invalidated. Transparent caching is simpler; explicit caching is more powerful.
Per-User vs Shared Caching | Browser caches are per-user and cannot serve data to other users. CDN and application caches are shared, serving cached data to all users who request the same resource. Per-user caching is effective for personalized content but duplicates storage. Shared caching is memory-efficient but cannot cache user-specific data without careful keying (user_id prefix).
Static vs Dynamic Content Cacheability | Static content (CSS, JS, images) is trivially cacheable at every layer with long TTLs and content-hash cache busting. Dynamic content (API responses, personalized pages) requires careful cache key design, shorter TTLs, and often cache-aside or write-through patterns. Attempting to cache dynamic content like static content leads to stale data bugs; under-caching dynamic content wastes database resources.
Case Study

Shopify's Multi-Layer Caching for Flash Sale Resilience

Scenario

Shopify powers over 2 million online stores, and flash sales (limited-time promotional events) can create 100x traffic spikes within seconds. A merchant announcing a sale on social media can drive millions of visitors to their store simultaneously. Without comprehensive caching, these spikes would overwhelm the shared infrastructure that serves all merchants, causing collateral damage to uninvolved stores.

Solution

Shopify implemented caching at every layer. At the CDN layer, Cloudflare and Fastly cache entire storefront pages for anonymous users with 5-second TTLs and stale-while-revalidate for 60 seconds. At the reverse proxy layer, Nginx microcaching caches responses for 1-5 seconds, absorbing request bursts that exceed CDN capacity. At the application layer, Redis caches product data, inventory counts, and pricing with per-merchant TTLs. At the database layer, read replicas and InnoDB buffer pools handle the remaining query load. Each layer independently handles its portion of the traffic, with cache misses at one layer caught by the next.

Outcome

Shopify handles flash sales serving 80,000+ requests per second per merchant without advance provisioning. The CDN layer absorbs 85-95% of traffic (static assets and full-page cache for anonymous users). Nginx microcaching catches 60-80% of the remaining dynamic requests. Redis handles most application-level data lookups. The actual database query load during a flash sale is only 2-5x normal (instead of 100x) because each caching layer peels off its share. Other merchants on the platform experience zero impact during another merchant's flash sale.
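
The way each layer "peels off its share" can be checked with simple arithmetic on the figures quoted above. This is a rough model, not Shopify's own calculation: it assumes each layer's hit ratio applies uniformly to whatever traffic reaches it, and the function name is hypothetical.

```python
from typing import Sequence


def load_after_layers(spike_multiplier: float, hit_ratios: Sequence[float]) -> float:
    """Traffic (as a multiple of normal) that survives a stack of cache layers.

    Each layer passes on only its misses: remaining *= (1 - hit_ratio).
    """
    remaining = spike_multiplier
    for ratio in hit_ratios:
        if not 0.0 <= ratio <= 1.0:
            raise ValueError("hit ratios must be between 0 and 1")
        remaining *= 1.0 - ratio
    return remaining


# 100x spike; CDN absorbs 85-95%, Nginx microcaching 60-80% of what remains.
best_case = load_after_layers(100, [0.95, 0.80])   # about 1x normal
midpoint = load_after_layers(100, [0.90, 0.70])    # about 3x normal
worst_case = load_after_layers(100, [0.85, 0.60])  # about 6x normal
```

Under these assumptions a 100x spike arrives at the application tier at roughly 1x to 6x normal volume, before Redis hits are subtracted. That is the same order of magnitude as the 2-5x database load quoted above.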

Common Mistakes
  • ⚠ Caching user-specific data at the CDN layer without proper keying. If a CDN caches a page containing user-specific data (shopping cart, account info), other users may see someone else's data. Mark such responses Cache-Control: private (or no-store) so shared caches do not store them, and use separate CDN caching rules for authenticated vs anonymous users. The Vary header helps only when the response differs by a request header the CDN actually includes in its cache key.
  • ⚠ Relying on the MySQL query cache for performance. MySQL deprecated the query cache in version 5.7.20 and removed it in 8.0 because it invalidated all cached queries for a table on any write, creating a global lock bottleneck. Use application-level caching (Redis) instead, which provides fine-grained per-key invalidation.
  • ⚠ Over-caching at the browser without a cache-busting strategy. Long cache-control max-age values (e.g., 1 year) for CSS and JS files mean users may see stale assets after a deployment. Always use content-hashed filenames (app.a1b2c3.js) so new deployments use new URLs that bypass the browser cache.
  • ⚠ Ignoring the database buffer pool as a caching layer. Tuning InnoDB's innodb_buffer_pool_size (or PostgreSQL's shared_buffers) to hold the working set in memory can provide 10-100x query speedups with zero application code changes. This is often the highest-impact caching optimization available.
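
The content-hashed filename scheme mentioned above (app.a1b2c3.js) can be sketched as follows. Build tools normally do this automatically; the helper name `hashed_filename`, the choice of SHA-256, and the 8-character digest length are assumptions for illustration.

```python
import hashlib
from pathlib import PurePosixPath


def hashed_filename(filename: str, content: bytes, digest_chars: int = 8) -> str:
    """Insert a content digest before the file extension, e.g. app.js -> app.<digest>.js."""
    digest = hashlib.sha256(content).hexdigest()[:digest_chars]
    path = PurePosixPath(filename)
    # stem is the name without its final suffix; suffix keeps the leading dot.
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))
```

Because the name depends only on the bytes of the file, an unchanged asset keeps its URL (and stays cached) across deployments, while any change yields a new URL that the browser has never seen.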
Test Your Understanding

1. Which caching layer provides zero network latency on a cache hit?

2. Why was the MySQL query cache removed in MySQL 8.0?

3. What HTTP mechanism allows a client to validate cached content without re-downloading it?
