
Cache-Aside (Lazy Loading)

Cache-aside is the most widely used caching pattern. The application checks the cache first; on a miss, it queries the database, writes the result to the cache, and returns the data. Only requested data is cached, and cache failure degrades performance but does not cause errors.

Overview

Cache-aside, also known as lazy loading, is the most prevalent caching pattern in production systems. The application sits between the cache and the database, managing both explicitly. When a read request arrives, the application first checks the cache. On a cache hit, the data is returned immediately. On a cache miss, the application queries the database, writes the result to the cache for future requests, and then returns the data to the caller. The cache is populated lazily -- only data that is actually requested ends up in the cache, which avoids wasting memory on data nobody reads.
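The read path described above can be sketched in a few lines. This is a minimal illustration, not a production client: `TTLCache` is a hypothetical in-process stand-in for a cache such as Redis or Memcached, and `get_with_cache_aside` and `load_from_db` are names invented for this example.

```python
import time


class TTLCache:
    """Hypothetical in-process stand-in for a cache such as Redis or Memcached."""

    def __init__(self):
        self._store = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            # The entry outlived its TTL: drop it and report a miss.
            del self._store[key]
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._store[key] = (time.monotonic() + ttl_seconds, value)


def get_with_cache_aside(cache, key, load_from_db, ttl_seconds=60.0):
    """Check the cache first; on a miss, load from the database and populate."""
    value = cache.get(key)
    if value is not None:
        return value                        # cache hit: no database query
    value = load_from_db(key)               # cache miss: go to the source of truth
    if value is not None:
        cache.set(key, value, ttl_seconds)  # populate lazily for later readers
    return value


db_calls = []

def fake_db(key):
    """Assumed database lookup; records each call so hits and misses are visible."""
    db_calls.append(key)
    return {"id": key}

cache = TTLCache()
get_with_cache_aside(cache, "product:1", fake_db)  # miss: queries the database
get_with_cache_aside(cache, "product:1", fake_db)  # hit: served from the cache
print(len(db_calls))  # 1
```

Only the key that was actually requested ends up in the cache, which is the lazy-loading behavior the pattern is named for.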

The primary advantage of cache-aside is its resilience. Because the application treats the cache as a helper rather than a requirement, a complete cache failure simply causes all requests to hit the database directly. The system becomes slower but, as long as the database can absorb the extra load, does not return errors or incorrect data. This graceful degradation makes cache-aside suitable for systems where availability matters more than absolute performance. The pattern also avoids the cache pollution of write-through caching, where newly written data populates the cache even if nobody ever reads it. With cache-aside, the cache naturally fills with the hottest data -- the entries that are actually requested.
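Graceful degradation only happens if the application treats cache errors as misses instead of letting them propagate. The sketch below shows one way to do that; `CacheUnavailable`, `BrokenCache`, and `resilient_get` are hypothetical names for illustration, and a real client library would raise its own exception types.

```python
import logging

logger = logging.getLogger("cache_aside_sketch")


class CacheUnavailable(Exception):
    """Hypothetical error a cache client might raise when the cache is down."""


class BrokenCache:
    """Stand-in for a cache whose server is unreachable."""

    def get(self, key):
        raise CacheUnavailable("cache is down")

    def set(self, key, value, ttl_seconds):
        raise CacheUnavailable("cache is down")


def resilient_get(cache, key, load_from_db, ttl_seconds=60.0):
    """Cache-aside read that treats any cache failure as a miss."""
    cache_ok = True
    try:
        cached = cache.get(key)
    except CacheUnavailable:
        logger.warning("cache read failed for %r; falling back to database", key)
        cached, cache_ok = None, False

    if cached is not None:
        return cached

    value = load_from_db(key)
    if cache_ok and value is not None:
        try:
            cache.set(key, value, ttl_seconds)
        except CacheUnavailable:
            logger.warning("cache write failed for %r; serving uncached", key)
    return value


# The database answer still comes back even though every cache call fails.
print(resilient_get(BrokenCache(), "user:42", lambda key: {"id": key}))
```

Skipping the write-back after a failed read is a design choice in this sketch: it avoids paying for a second failing call on every request while the cache is down.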

The main disadvantage is the cache miss penalty. A cache miss involves three network round trips: one to the cache (miss), one to the database (fetch), and one back to the cache (write). This triple-hop latency is noticeable for latency-sensitive applications, especially when cache miss rates are high during cold starts or after cache flushes. Additionally, cache-aside does not automatically update the cache when the database changes. If another process writes to the database directly, the cached entry becomes stale until its TTL expires. This staleness window is the fundamental trade-off of cache-aside: you get simplicity and resilience at the cost of potential consistency gaps.
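To see how the miss penalty affects average latency, the three round trips can be put into a small model. The function name and the latency figures (1 ms per cache call, 20 ms per database query) are assumptions chosen for illustration, not measurements.

```python
def expected_read_latency_ms(hit_rate, cache_ms, db_ms):
    """Average read latency under cache-aside for a given hit rate."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    hit_cost = cache_ms                       # one round trip: cache lookup
    miss_cost = cache_ms + db_ms + cache_ms   # lookup + database fetch + write-back
    return hit_rate * hit_cost + (1.0 - hit_rate) * miss_cost


# Assumed figures: 1 ms per cache call, 20 ms per database query.
for hit_rate in (0.99, 0.90, 0.50):
    latency = expected_read_latency_ms(hit_rate, cache_ms=1.0, db_ms=20.0)
    print(f"hit rate {hit_rate:.0%}: {latency:.2f} ms")
```

Under these assumed numbers the average is about 1.21 ms at a 99% hit rate and 11.5 ms at 50%, which is why cold starts and cache flushes are felt so sharply.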

Several strategies mitigate cache-aside's weaknesses. Cache warming pre-populates the cache with frequently accessed data at startup, reducing the cold-start miss rate. TTL-based expiration ensures stale entries are eventually evicted, and shorter TTLs reduce the staleness window at the cost of lower hit rates. The N+1 cache problem -- where loading a list of items results in N individual cache lookups -- can be addressed by caching entire result sets or using multiget operations. Despite its trade-offs, cache-aside remains the default choice for most caching use cases because its simplicity, resilience, and natural hot-data selection are hard to beat.
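The multiget mitigation mentioned above can be sketched as a batched variant of the read path. `BatchCache` is a hypothetical in-memory stand-in whose `get_many` and `set_many` methods play the role of a real batch command such as Redis `MGET`; `load_many_from_db` is an assumed function that fetches several rows in one query.

```python
class BatchCache:
    """Hypothetical cache stand-in that counts network round trips."""

    def __init__(self):
        self._store = {}
        self.round_trips = 0

    def get_many(self, keys):
        self.round_trips += 1  # one round trip regardless of how many keys
        return {k: self._store[k] for k in keys if k in self._store}

    def set_many(self, items, ttl_seconds):
        self.round_trips += 1
        self._store.update(items)  # TTL is ignored in this stand-in


def get_many_with_cache_aside(cache, keys, load_many_from_db, ttl_seconds=60.0):
    """Batched cache-aside: one cache lookup, one database query for the misses."""
    found = cache.get_many(keys)
    missing = [k for k in keys if k not in found]
    if missing:
        loaded = load_many_from_db(missing)  # a single query for all misses
        cache.set_many(loaded, ttl_seconds)
        found.update(loaded)
    return found


cache = BatchCache()
ids = [f"product:{n}" for n in range(50)]
load = lambda missing: {k: {"id": k} for k in missing}

get_many_with_cache_aside(cache, ids, load)  # cold: one get_many + one set_many
get_many_with_cache_aside(cache, ids, load)  # warm: one get_many
print(cache.round_trips)  # 3
```

Two page loads cost three cache round trips in total here, compared with 50 individual lookups per load in the N+1 version.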

Key Points
  • On a cache hit, the application returns data directly from the cache with no database query. On a miss, the app queries the database, writes the result to cache, and returns. The cache is populated only by actual read requests.
  • Cache-aside is fault-tolerant: if the cache goes down, the application falls back to querying the database directly. Performance degrades but correctness is maintained. This makes it safer than patterns where the cache is in the write path.
  • The cache miss penalty involves three network round trips (cache check, DB query, cache write), which can be significant for latency-sensitive applications. Batch operations and cache warming reduce this overhead.
  • Stale data is the primary consistency risk. If a different process updates the database, the cache retains the old value until TTL expires. Applications must choose a TTL that balances freshness against hit rate.
  • The N+1 cache problem occurs when loading a collection: fetching a list of 50 product IDs and then checking the cache for each one individually creates 50 cache lookups. Multiget operations and result-set caching mitigate this.
  • Cache warming at application startup pre-populates the cache with predictably hot data (e.g., top 1000 products), reducing the cold-start cache miss storm that would otherwise hit the database.
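The cache-warming point can be illustrated with a short startup routine. This is a sketch under assumptions: `warm_cache`, `DictCache`, and `load_many_from_db` are hypothetical names, and how the list of hot keys is chosen (for example, from yesterday's access logs) is left outside the example.

```python
class DictCache:
    """Hypothetical in-memory stand-in for the real cache."""

    def __init__(self):
        self.store = {}

    def set(self, key, value, ttl_seconds):
        self.store[key] = value  # TTL is ignored in this stand-in


def warm_cache(cache, hot_keys, load_many_from_db, ttl_seconds=300.0, batch_size=100):
    """Pre-populate the cache with known-hot keys, loading them in batches."""
    warmed = 0
    for start in range(0, len(hot_keys), batch_size):
        batch = hot_keys[start:start + batch_size]
        rows = load_many_from_db(batch)  # one query per batch, not per key
        for key, value in rows.items():
            cache.set(key, value, ttl_seconds)
            warmed += 1
    return warmed


hot_keys = [f"product:{n}" for n in range(1000)]  # e.g. the top 1000 products
cache = DictCache()
count = warm_cache(cache, hot_keys, lambda batch: {k: {"id": k} for k in batch})
print(count)  # 1000
```

Loading in batches keeps the warming job itself from becoming the miss storm it is meant to prevent.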
Simple Example

The Library Desk Analogy

Imagine you have a small desk (cache) next to a large library (database). When someone asks for a book, you first check your desk. If the book is there, you hand it over immediately. If not, you walk to the library, find the book, bring it back to your desk for quick access later, and then hand it to the person. Your desk only holds books that people actually ask for, so popular books are always nearby. If your desk collapses, you can still get any book -- it just takes longer because you have to walk to the library every time.

Real-World Examples

Amazon

Amazon uses cache-aside extensively for its product catalog. Product details, pricing, and availability are cached in ElastiCache (Redis) with TTLs ranging from 30 seconds to 5 minutes depending on how frequently the data changes. On a cache miss, the application fetches from DynamoDB and populates the cache. During peak events like Prime Day, cache warming jobs pre-populate the top 100,000 products to prevent a cold-start stampede hitting DynamoDB.

Stack Overflow

Stack Overflow runs one of the most aggressive cache-aside implementations in production, caching nearly everything in Redis -- questions, answers, user profiles, vote counts, and computed rankings. Their cache hit rate exceeds 99% during normal operation. On a cache miss, the application queries SQL Server and writes the result to Redis. The entire site can survive a Redis failure because every cache read has a fallback SQL query, though response times increase from sub-millisecond to 10-50ms.

GitHub

GitHub caches repository metadata (star counts, fork counts, README content, contributor lists) using a cache-aside pattern with Memcached and Redis. On a cache miss, repository data is fetched from MySQL and written to the cache with a 60-second TTL for frequently changing data (like star counts) or a 5-minute TTL for slowly changing data (like README content). This tiered TTL approach balances freshness against database load.

Trade-Offs
Aspect | Description
Simplicity vs Consistency | Cache-aside is the simplest caching pattern to implement and reason about, but it provides no guarantee that cached data matches the database. Any write that bypasses the cache (direct DB update, batch job, another service) creates a staleness window. Applications that require strong read-after-write consistency need additional mechanisms like cache invalidation on writes.
Cache Miss Latency vs Memory Efficiency | The lazy-loading approach means only requested data enters the cache, maximizing memory efficiency. However, every first request for a given key pays the three-round-trip cache miss penalty. Eager loading (cache warming) reduces miss latency but wastes memory on data that may never be requested.
TTL Length vs Hit Rate | Short TTLs keep data fresh but reduce cache hit rates, pushing more traffic to the database. Long TTLs maximize hit rates but increase the window of potential staleness. The optimal TTL depends on the data's change frequency and the application's tolerance for stale reads.
Resilience vs Performance | Cache-aside's greatest strength is graceful degradation -- the system works without the cache, just slower. This resilience comes at the cost of the application managing two data sources (cache and DB) and handling misses, TTLs, and evictions explicitly, whereas patterns like read-through abstract this complexity.
Case Study

Stack Overflow's Multi-Layer Cache-Aside with Redis

Scenario

Stack Overflow serves over 1.7 billion page views per month with a remarkably small infrastructure footprint -- just a handful of web servers and two SQL Server instances. The challenge was reducing SQL Server load to a level where this minimal infrastructure could handle peak traffic without horizontal scaling.

Solution

Stack Overflow implemented an aggressive cache-aside pattern using Redis as the cache layer. Every database query has a corresponding cache key, and the application checks Redis before SQL Server on every read. Cache entries use carefully tuned TTLs: 60 seconds for vote counts, 5 minutes for question content, and 24 hours for user profile data. A custom cache invalidation framework invalidates specific keys when the corresponding data is written, reducing staleness. Multiget operations batch cache lookups for list pages, solving the N+1 cache problem.
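Invalidating a key when its data is written can be sketched as follows. This is not Stack Overflow's framework, whose details the text does not give; it is a generic illustration in which plain dictionaries stand in for the database and the cache, and `read_user` and `update_user` are hypothetical names.

```python
database = {"user:1": {"name": "Ada"}}  # stand-in for the source of truth
cache = {}                              # stand-in for Redis or Memcached


def read_user(key):
    """Ordinary cache-aside read."""
    if key in cache:
        return cache[key]
    value = database.get(key)
    if value is not None:
        cache[key] = value
    return value


def update_user(key, new_value):
    """Write to the database first, then delete the cached copy."""
    database[key] = new_value
    cache.pop(key, None)  # the next read misses and reloads the fresh row


read_user("user:1")                          # populates the cache
update_user("user:1", {"name": "Grace"})     # invalidates the cached entry
print(read_user("user:1"))                   # {'name': 'Grace'}
```

Deleting the entry rather than overwriting it keeps the write path simple and lets the next read repopulate the cache. A concurrent reader can still re-cache an old value in a narrow window, so a TTL remains useful as a backstop.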

Outcome

The cache hit rate exceeds 99%, reducing SQL Server query volume by roughly 100x. Stack Overflow handles its entire global traffic with two SQL Server instances, demonstrating that aggressive cache-aside can eliminate the need for complex distributed database architectures. The total infrastructure cost is a fraction of what comparable-traffic sites spend, and the system has maintained sub-100ms page load times for years.
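The "roughly 100x" figure follows directly from the hit rate: if only misses reach the database, query volume falls by a factor of 1 / (1 - hit rate). A minimal check of that arithmetic, with a hypothetical function name:

```python
def db_load_reduction(hit_rate):
    """Factor by which database reads drop when only cache misses reach it."""
    if not 0.0 <= hit_rate < 1.0:
        raise ValueError("hit_rate must be in [0, 1)")
    return 1.0 / (1.0 - hit_rate)


print(round(db_load_reduction(0.99)))  # 100
print(round(db_load_reduction(0.90)))  # 10
```

The relationship is steep near the top: dropping from a 99% to a 90% hit rate multiplies database read traffic by about ten.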

Common Mistakes
  • Not setting a TTL on cached entries. Without a TTL, stale data lives in the cache indefinitely. Every cache-aside entry should have a TTL, even if it is long (24 hours), to ensure eventual refresh and to help bound memory growth.
  • Caching null results without protection. When a cache miss also misses the database (the key does not exist), caching the null result prevents repeated DB lookups. But without a short TTL on null entries, the cache will permanently mask data that is later inserted into the database.
  • Ignoring the thundering herd on cache expiry. When a popular cache entry expires, hundreds of concurrent requests all miss the cache and hit the database simultaneously. A mutex or single-flight pattern ensures only one request refreshes the cache while others wait.
  • Using cache-aside for data that requires strong read-after-write consistency. If a user updates their profile and immediately views it, cache-aside may serve the stale pre-update version. Pair cache-aside with cache invalidation on writes for user-facing mutation flows.
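Two of the mistakes above, unprotected null caching and the thundering herd, can be addressed together. The sketch below is one possible approach under assumptions: `SingleFlightCache` is a hypothetical in-process class, the per-key lock stands in for whatever mutex or single-flight mechanism a real deployment would use, and the 5-second negative TTL is an arbitrary illustrative value.

```python
import threading
import time

_MISSING = object()  # sentinel cached when the database has no row for a key


class SingleFlightCache:
    """In-process cache-aside with per-key refresh locks and negative caching."""

    def __init__(self, load_from_db, ttl_seconds=60.0, negative_ttl_seconds=5.0):
        self._load = load_from_db
        self._ttl = ttl_seconds
        self._negative_ttl = negative_ttl_seconds
        self._store = {}      # key -> (expires_at, value or _MISSING)
        self._key_locks = {}  # key -> lock; never pruned in this sketch
        self._guard = threading.Lock()  # protects _store and _key_locks

    def _lookup(self, key):
        with self._guard:
            entry = self._store.get(key)
            if entry is not None and time.monotonic() < entry[0]:
                return True, entry[1]
        return False, None

    def get(self, key):
        hit, value = self._lookup(key)
        if not hit:
            with self._guard:
                lock = self._key_locks.setdefault(key, threading.Lock())
            with lock:
                # Re-check: another thread may have refreshed while we waited.
                hit, value = self._lookup(key)
                if not hit:
                    loaded = self._load(key)
                    value = _MISSING if loaded is None else loaded
                    # Null results get a short TTL so later inserts become visible.
                    ttl = self._negative_ttl if value is _MISSING else self._ttl
                    with self._guard:
                        self._store[key] = (time.monotonic() + ttl, value)
        return None if value is _MISSING else value
```

With this structure, many concurrent readers of an expired hot key produce a single database query while the rest wait on the key's lock, and a missing row is remembered only for the short negative TTL.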
Test Your Understanding

1. In the cache-aside pattern, what happens when a cache miss occurs?

2. What is the primary risk of using cache-aside without TTL-based expiration?

3. Why is cache-aside considered more resilient than write-through caching?
