In the cache-aside pattern, what happens when a cache miss occurs?
Cache-aside is the most widely used caching pattern. The application checks the cache first; on a miss, it queries the database, writes the result to the cache, and returns the data. Only requested data is cached, and cache failure degrades performance but does not cause errors.
Cache-aside, also known as lazy loading, is the most prevalent caching pattern in production systems. The application sits between the cache and the database, managing both explicitly. When a read request arrives, the application first checks the cache. On a cache hit, the data is returned immediately. On a cache miss, the application queries the database, writes the result to the cache for future requests, and then returns the data to the caller. The cache is populated lazily -- only data that is actually requested ends up in the cache, which avoids wasting memory on data nobody reads.
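The read path described above can be sketched in a few lines. This is a minimal illustration, not a production client: `DictCache` is an in-memory stand-in for a real cache server, and `fake_db_lookup` and the key format are hypothetical names introduced only for the example.

```python
from typing import Callable, Dict, List, Optional


class DictCache:
    """In-memory stand-in for a cache server such as Redis or Memcached."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def set(self, key: str, value: str) -> None:
        self._data[key] = value


def cache_aside_get(cache: DictCache, key: str,
                    load_from_db: Callable[[str], str]) -> str:
    value = cache.get(key)       # 1. check the cache first
    if value is not None:
        return value             # cache hit: return immediately
    value = load_from_db(key)    # 2. cache miss: query the database
    cache.set(key, value)        # 3. populate the cache for future requests
    return value                 # 4. return the data to the caller


# Hypothetical database lookup that records how often it is called.
db_calls: List[str] = []


def fake_db_lookup(key: str) -> str:
    db_calls.append(key)
    return f"row-for-{key}"


cache = DictCache()
first = cache_aside_get(cache, "user:1", fake_db_lookup)   # miss, loads from DB
second = cache_aside_get(cache, "user:1", fake_db_lookup)  # hit, DB not touched
```

Only the first call reaches the database; the second is served from the cache, which is the lazy population behavior the pattern is named for.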
The primary advantage of cache-aside is its resilience. Because the application treats the cache as a helper rather than a requirement, a complete cache failure simply sends every request to the database, provided the application handles cache errors as misses. The system becomes slower but does not return errors or incorrect data, as long as the database can absorb the extra load. This graceful degradation makes cache-aside suitable for systems where availability matters more than absolute performance. The pattern also avoids a drawback of write-through caching, where every newly written value populates the cache even if nobody ever reads it. With cache-aside, the cache naturally fills with the hottest data -- the entries that are actually requested.
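Graceful degradation only holds if the application actually treats a cache failure as a miss. The sketch below shows one way to do that, under stated assumptions: `CacheUnavailable` and `BrokenCache` are hypothetical names standing in for whatever connection or timeout error a real cache client raises.

```python
from typing import Callable, Optional


class CacheUnavailable(Exception):
    """Hypothetical error raised when the cache server cannot be reached."""


class BrokenCache:
    """Stand-in for a cache whose server is down: every call fails."""

    def get(self, key: str) -> Optional[str]:
        raise CacheUnavailable("cache is down")

    def set(self, key: str, value: str) -> None:
        raise CacheUnavailable("cache is down")


def resilient_get(cache, key: str, load_from_db: Callable[[str], str]) -> str:
    try:
        value = cache.get(key)
        if value is not None:
            return value
    except CacheUnavailable:
        pass  # treat a cache failure exactly like a cache miss

    value = load_from_db(key)  # the database remains the source of truth

    try:
        cache.set(key, value)
    except CacheUnavailable:
        pass  # populating the cache is best-effort; never fail the read

    return value


# Even with the cache completely down, the read still succeeds.
result = resilient_get(BrokenCache(), "product:42",
                       lambda k: f"db-value-for-{k}")
```

A real deployment would typically also add short cache timeouts so that a slow or unreachable cache does not add much latency to every request.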
The main disadvantage is the cache miss penalty. A cache miss involves three network round trips: one to the cache (miss), one to the database (fetch), and one back to the cache (write). This triple-hop latency is noticeable for latency-sensitive applications, especially when cache miss rates are high during cold starts or after cache flushes. Additionally, cache-aside does not automatically update the cache when the database changes. If another process writes to the database directly, the cached entry stays stale until its TTL expires or the entry is explicitly invalidated; with no TTL at all, it can stay stale indefinitely. This staleness window is the fundamental trade-off of cache-aside: you get simplicity and resilience at the cost of potential consistency gaps.
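The effect of the miss penalty on average latency can be estimated with simple arithmetic. The sketch below assumes the three round trips happen sequentially and uses illustrative latencies (1 ms per cache round trip, 20 ms per database query) that are assumptions, not measurements.

```python
def expected_read_latency_ms(hit_rate: float, cache_ms: float,
                             db_ms: float) -> float:
    """Average read latency when the cache write happens before returning."""
    miss_rate = 1.0 - hit_rate
    hit_cost = cache_ms                      # one round trip to the cache
    miss_cost = cache_ms + db_ms + cache_ms  # cache miss + DB fetch + cache write
    return hit_rate * hit_cost + miss_rate * miss_cost


# Illustrative numbers only: 1 ms cache round trip, 20 ms database query.
warm = expected_read_latency_ms(hit_rate=0.99, cache_ms=1.0, db_ms=20.0)
cold = expected_read_latency_ms(hit_rate=0.50, cache_ms=1.0, db_ms=20.0)
```

Under these assumptions a warm cache averages roughly 1.21 ms per read, while a cache at a 50% hit rate averages 11.5 ms, which is why cold starts and flushes are so visible.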
Several strategies mitigate cache-aside's weaknesses. Cache warming pre-populates the cache with frequently accessed data at startup, reducing the cold-start miss rate. TTL-based expiration ensures stale entries eventually expire, and shorter TTLs reduce the staleness window at the cost of lower hit rates. The N+1 cache problem -- where loading a list of items results in N individual cache lookups -- can be addressed by caching entire result sets or using multiget operations. Despite its trade-offs, cache-aside remains the default choice for most caching use cases because its simplicity, resilience, and natural hot-data selection are hard to beat.
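The multiget mitigation for the N+1 problem can be sketched as follows. `BatchCache`, `get_many`, `set_many`, and `fake_db_batch_load` are hypothetical names for this example; real cache clients expose their own batch operations, and the in-memory class here only counts simulated round trips.

```python
from typing import Callable, Dict, List


class BatchCache:
    """In-memory stand-in for a cache that supports batch reads and writes."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}
        self.round_trips = 0  # counts simulated network round trips

    def get_many(self, keys: List[str]) -> Dict[str, str]:
        self.round_trips += 1
        return {k: self._data[k] for k in keys if k in self._data}

    def set_many(self, items: Dict[str, str]) -> None:
        self.round_trips += 1
        self._data.update(items)


def get_many_cache_aside(
    cache: BatchCache,
    keys: List[str],
    load_many_from_db: Callable[[List[str]], Dict[str, str]],
) -> Dict[str, str]:
    found = cache.get_many(keys)                 # one lookup for all keys
    missing = [k for k in keys if k not in found]
    if missing:
        loaded = load_many_from_db(missing)      # one query for all misses
        cache.set_many(loaded)                   # one write to fill the cache
        found.update(loaded)
    return found


# Hypothetical batch loader that records each batch it is asked for.
db_batches: List[List[str]] = []


def fake_db_batch_load(keys: List[str]) -> Dict[str, str]:
    db_batches.append(list(keys))
    return {k: f"row-for-{k}" for k in keys}


cache = BatchCache()
cache.set_many({"item:1": "row-for-item:1"})  # pretend item:1 is already cached
cache.round_trips = 0                         # start counting from here

items = get_many_cache_aside(cache, ["item:1", "item:2", "item:3"],
                             fake_db_batch_load)
```

Here three keys cost two cache round trips and one database query, instead of up to three lookups, two queries, and two writes when each key is handled individually.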
The Library Desk Analogy
Imagine you have a small desk (cache) next to a large library (database). When someone asks for a book, you first check your desk. If the book is there, you hand it over immediately. If not, you walk to the library, find the book, bring it back to your desk for quick access later, and then hand it to the person. Your desk only holds books that people actually ask for, so popular books are always nearby. If your desk collapses, you can still get any book -- it just takes longer because you have to walk to the library every time.
Amazon
Amazon uses cache-aside extensively for its product catalog. Product details, pricing, and availability are cached in ElastiCache (Redis) with TTLs ranging from 30 seconds to 5 minutes depending on how frequently the data changes. On a cache miss, the application fetches from DynamoDB and populates the cache. During peak events like Prime Day, cache warming jobs pre-populate the top 100,000 products to prevent a cold-start stampede hitting DynamoDB.
Stack Overflow
Stack Overflow runs one of the most aggressive cache-aside implementations in production, caching nearly everything in Redis -- questions, answers, user profiles, vote counts, and computed rankings. Their cache hit rate exceeds 99% during normal operation. On a cache miss, the application queries SQL Server and writes the result to Redis. The entire site can survive a Redis failure because every cache read has a fallback SQL query, though response times increase from sub-millisecond to 10-50ms.
GitHub
GitHub caches repository metadata (star counts, fork counts, README content, contributor lists) using a cache-aside pattern with Memcached and Redis. Repository data is fetched from MySQL on cache miss and written to cache with a 60-second TTL for frequently changing data (like star counts) and 5-minute TTLs for slowly changing data (like README content). This tiered TTL approach balances freshness against database load.
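A tiered TTL scheme like the one described can be expressed as a small lookup from key type to TTL. The sketch below is illustrative only and does not reflect any company's actual code: the key prefixes, the default TTL, and the `TTLCache` class with its injectable clock are all assumptions made for the example.

```python
from typing import Callable, Dict, Optional, Tuple

# Tiers mirror the text: 60 s for fast-changing data, 5 min for slow-changing.
# The key prefixes are hypothetical.
TTL_SECONDS_BY_PREFIX: Dict[str, int] = {"stars:": 60, "readme:": 300}
DEFAULT_TTL_SECONDS = 120  # assumption: fallback for unclassified keys


def ttl_for(key: str) -> int:
    for prefix, ttl in TTL_SECONDS_BY_PREFIX.items():
        if key.startswith(prefix):
            return ttl
    return DEFAULT_TTL_SECONDS


class TTLCache:
    """In-memory cache with per-entry expiry and an injectable clock."""

    def __init__(self, clock: Callable[[], float]) -> None:
        self._clock = clock
        self._data: Dict[str, Tuple[str, float]] = {}  # key -> (value, expires_at)

    def get(self, key: str) -> Optional[str]:
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]  # expired entries behave like misses
            return None
        return value

    def set(self, key: str, value: str, ttl_seconds: int) -> None:
        self._data[key] = (value, self._clock() + ttl_seconds)


now = [0.0]  # fake clock so the example does not need to sleep
cache = TTLCache(clock=lambda: now[0])
cache.set("stars:repo", "100", ttl_for("stars:repo"))
cache.set("readme:repo", "# Hello", ttl_for("readme:repo"))

now[0] = 61.0  # advance the fake clock past the 60-second tier
stars_after_61s = cache.get("stars:repo")    # None: the short TTL has expired
readme_after_61s = cache.get("readme:repo")  # still cached under the 5-minute TTL
```

After 61 simulated seconds the star count must be reloaded from the database while the README is still served from the cache, which is the freshness-versus-load balance the tiers are meant to strike.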
| Aspect | Description |
|---|---|
| Simplicity vs Consistency | Cache-aside is the simplest caching pattern to implement and reason about, but it provides no guarantee that cached data matches the database. Any write that bypasses the cache (direct DB update, batch job, another service) creates a staleness window. Applications that require strong read-after-write consistency need additional mechanisms like cache invalidation on writes. |
| Cache Miss Latency vs Memory Efficiency | The lazy-loading approach means only requested data enters the cache, maximizing memory efficiency. However, every first request for a given key pays the three-round-trip cache miss penalty. Eager loading (cache warming) reduces miss latency but wastes memory on data that may never be requested. |
| TTL Length vs Hit Rate | Short TTLs keep data fresh but reduce cache hit rates, pushing more traffic to the database. Long TTLs maximize hit rates but increase the window of potential staleness. The optimal TTL depends on the data's change frequency and the application's tolerance for stale reads. |
| Resilience vs Performance | Cache-aside's greatest strength is graceful degradation -- the system works without the cache, just slower. This resilience comes at the cost of the application managing two data sources (cache and DB) and handling misses, TTLs, and evictions explicitly, whereas patterns like read-through abstract this complexity. |
Stack Overflow's Multi-Layer Cache-Aside with Redis
Scenario
Stack Overflow serves over 1.7 billion page views per month with a remarkably small infrastructure footprint -- just a handful of web servers and two SQL Server instances. The challenge was reducing SQL Server load to a level where this minimal infrastructure could handle peak traffic without horizontal scaling.
Solution
Stack Overflow implemented an aggressive cache-aside pattern using Redis as the cache layer. Every database query has a corresponding cache key, and the application checks Redis before SQL Server on every read. Cache entries use carefully tuned TTLs: 60 seconds for vote counts, 5 minutes for question content, and 24 hours for user profile data. A custom cache invalidation framework invalidates specific keys when the corresponding data is written, reducing staleness. Multiget operations batch cache lookups for list pages, solving the N+1 cache problem.
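Invalidating a key when its data is written can be sketched generically as below. This is not Stack Overflow's framework, whose internals are not described here; `SimpleCache`, `read_question`, `update_question`, the key format, and the dict used as a database are all hypothetical stand-ins.

```python
from typing import Dict, Optional


class SimpleCache:
    """In-memory stand-in for a cache server with get, set, and delete."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def set(self, key: str, value: str) -> None:
        self._data[key] = value

    def delete(self, key: str) -> None:
        self._data.pop(key, None)


def read_question(cache: SimpleCache, db: Dict[int, str], qid: int) -> str:
    key = f"question:{qid}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    body = db[qid]        # cache miss: read from the (stand-in) database
    cache.set(key, body)
    return body


def update_question(cache: SimpleCache, db: Dict[int, str],
                    qid: int, new_body: str) -> None:
    db[qid] = new_body                # 1. write to the database first
    cache.delete(f"question:{qid}")   # 2. then invalidate the cached copy


db = {7: "old body"}  # a dict stands in for the database
cache = SimpleCache()

before = read_question(cache, db, 7)   # loads and caches "old body"
update_question(cache, db, 7, "new body")
after = read_question(cache, db, 7)    # miss after invalidation, reloads
```

Deleting the key rather than overwriting it keeps the write path simple: the next read repopulates the cache through the normal miss path. A TTL is still useful as a safety net for writes that bypass this code.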
Outcome
The cache hit rate exceeds 99%, reducing SQL Server query volume by roughly 100x. Stack Overflow handles its entire global traffic with two SQL Server instances, demonstrating that aggressive cache-aside can eliminate the need for complex distributed database architectures. The total infrastructure cost is a fraction of what comparable-traffic sites spend, and the system has maintained sub-100ms page load times for years.