Production-standard Pastebin architecture using PostgreSQL for durable storage and Redis cache-aside for hot paste reads. Achieves 90% cache hit rate, reducing database load by 10x. The go-to approach for moderate-scale text sharing services.
The RDBMS with Cache approach is the production-standard architecture for Pastebin-style services at moderate scale. It solves the fundamental bottleneck of the naive approach — every read hitting the database — by introducing a Redis cache-aside layer that absorbs 90% of read traffic. This is the architecture most interviewers expect candidates to arrive at after identifying the database bottleneck in the naive design.
The system handles the same workload as the naive variant: 1 million new pastes per day and 10 million views per day, with a 10:1 read-to-write ratio. The critical difference is how reads are served. Instead of every read querying PostgreSQL (12ms average latency), 90% of reads hit Redis first (2ms average latency). Only the remaining 10% — cold pastes not in cache — fall through to PostgreSQL. This reduces database read load from 4,500 reads/sec at peak to approximately 450 reads/sec, well within a single PostgreSQL instance's capacity.
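The load and latency arithmetic above can be checked with a few lines. This is a rough sketch that uses only the figures quoted in the text; the function names are illustrative, and the blended latency ignores the cost of the failed cache lookup on a miss.

```python
# Figures quoted in the text; the blended average is a derived estimate.
PEAK_READS_PER_SEC = 4_500
HIT_RATE = 0.90
REDIS_LATENCY_MS = 2.0
POSTGRES_LATENCY_MS = 12.0


def db_reads_per_sec(peak_reads: float, hit_rate: float) -> float:
    """Reads that miss the cache and therefore reach PostgreSQL."""
    return peak_reads * (1.0 - hit_rate)


def blended_read_latency_ms(hit_rate: float, hit_ms: float, miss_ms: float) -> float:
    """Expected storage-tier latency per read, weighting hits and misses."""
    return hit_rate * hit_ms + (1.0 - hit_rate) * miss_ms


if __name__ == "__main__":
    print(round(db_reads_per_sec(PEAK_READS_PER_SEC, HIT_RATE)))
    print(round(blended_read_latency_ms(HIT_RATE, REDIS_LATENCY_MS, POSTGRES_LATENCY_MS), 1))
```

With a 90% hit rate, about 450 reads/sec reach the database, and the average storage-tier read latency works out to roughly 3ms instead of 12ms.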
The cache-aside pattern works as follows: on paste creation, PasteService writes the paste to both PostgreSQL (durable storage) and Redis (immediate cache population). On paste read, PasteService checks Redis first. On cache hit (90% of the time), the paste is returned in approximately 2ms. On cache miss, PasteService queries PostgreSQL (approximately 12ms), writes the result back to Redis for subsequent reads, and returns the paste. Redis entries have TTL matching the paste's configured expiry, so expired pastes auto-evict from cache.
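The flow described above can be sketched as follows. This is a minimal illustration, not the template's actual implementation: `FakeCache` is a hypothetical in-memory stand-in for Redis, the `db` dictionary stands in for the PostgreSQL pastes table, and the clock is passed in explicitly so the behaviour is easy to follow.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Paste:
    paste_id: str
    content: str
    expires_at: float  # Unix timestamp in seconds


class FakeCache:
    """Hypothetical in-memory stand-in for Redis with a per-key TTL."""

    def __init__(self) -> None:
        self._entries: Dict[str, Tuple[Paste, float]] = {}

    def set(self, key: str, paste: Paste, ttl_seconds: float, now: float) -> None:
        self._entries[key] = (paste, now + ttl_seconds)

    def get(self, key: str, now: float) -> Optional[Paste]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        paste, cache_expiry = entry
        if now >= cache_expiry:  # emulate TTL eviction
            del self._entries[key]
            return None
        return paste


class PasteService:
    def __init__(self, cache: FakeCache) -> None:
        self.cache = cache
        self.db: Dict[str, Paste] = {}  # stand-in for the PostgreSQL pastes table
        self.db_reads = 0

    def create_paste(self, paste: Paste, now: float) -> None:
        self.db[paste.paste_id] = paste  # durable write first
        # Write-through: populate the cache before the first reader arrives.
        self.cache.set(f"paste:{paste.paste_id}", paste, paste.expires_at - now, now)

    def get_paste(self, paste_id: str, now: float) -> Optional[Paste]:
        key = f"paste:{paste_id}"
        cached = self.cache.get(key, now)
        if cached is not None:
            return cached  # cache hit
        self.db_reads += 1  # cache miss: fall through to the database
        paste = self.db.get(paste_id)
        if paste is None or paste.expires_at <= now:
            return None
        self.cache.set(key, paste, paste.expires_at - now, now)  # backfill
        return paste
```

The database write happens before the cache write, so a failure between the two leaves the cache empty rather than holding a paste that was never persisted.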
The 90% cache hit rate is driven by the Zipfian access pattern inherent to Pastebin-style services: recently created pastes receive the vast majority of views. When someone creates a paste and shares the URL, the recipients typically view it within minutes to hours. The write-through on creation ensures the paste is in Redis before the first read arrives. Older pastes that fall out of cache via LRU eviction are rarely accessed, so the miss penalty is infrequent.
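To build intuition for why a skewed access pattern makes a small cache effective, the sketch below replays a synthetic Zipf-distributed request trace through an LRU cache. The item count, skew, and cache sizes are arbitrary assumptions, and the model captures popularity skew only, not the recency effect described above, so the resulting hit rates should not be read as a prediction of the 90% figure.

```python
import random
from collections import OrderedDict
from itertools import accumulate
from typing import List


def zipf_trace(num_items: int, num_requests: int, skew: float, seed: int) -> List[int]:
    """Sample item indices where rank r is requested with weight 1 / r**skew."""
    rng = random.Random(seed)
    cum_weights = list(accumulate(1.0 / (rank ** skew) for rank in range(1, num_items + 1)))
    return rng.choices(range(num_items), cum_weights=cum_weights, k=num_requests)


def lru_hit_rate(trace: List[int], capacity: int) -> float:
    """Replay the trace through an LRU cache and return the fraction of hits."""
    cache: "OrderedDict[int, None]" = OrderedDict()
    hits = 0
    for item in trace:
        if item in cache:
            hits += 1
            cache.move_to_end(item)  # mark as most recently used
        else:
            cache[item] = None
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used entry
    return hits / len(trace)


if __name__ == "__main__":
    trace = zipf_trace(num_items=10_000, num_requests=50_000, skew=1.0, seed=42)
    for capacity in (100, 1_000, 5_000):
        print(capacity, round(lru_hit_rate(trace, capacity), 3))
```

Because LRU never does worse on the same trace when given more capacity, running this with increasing cache sizes shows how quickly the hit rate climbs when a small set of items dominates the traffic.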
This architecture handles up to approximately 5,000 sustained RPS — a 10x improvement over the naive approach. The database is no longer the bottleneck; instead, the ceiling is determined by Redis capacity and the remaining 10% of reads that hit PostgreSQL. For most Pastebin-style services with moderate traffic, this is sufficient. When traffic exceeds 5K RPS, or when paste sizes regularly exceed 8KB (causing PostgreSQL TOAST overhead), the Object Storage variant with S3 and DynamoDB provides the next level of scalability.
This template demonstrates the most important caching pattern in system design interviews: cache-aside with write-through. Candidates who can articulate why the 90% hit rate works (Zipfian access), how cache invalidation is handled (TTL-based), and when this approach stops being sufficient (large pastes, global scale) demonstrate strong system design fundamentals.
The RDBMS with Cache Pastebin uses six components organized in a linear pipeline with a cache branch: Client, API Gateway, Load Balancer, PasteService, Redis Cache, and PostgreSQL Database. The API Gateway handles authentication and rate limiting. The Load Balancer distributes traffic across PasteService pods. PasteService is the central coordinator that routes between Redis and PostgreSQL based on cache hit/miss.
All traffic enters through the API Gateway, which performs JWT authentication (approximately 3ms), rate limiting (8,000 RPS cap with headroom), and request validation. The API Gateway routes all valid requests to the Load Balancer, which distributes across PasteService pods using round-robin. The Load Balancer adds approximately 1.5ms of routing latency and has 15,000 RPS capacity, about 3x peak traffic.
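The text does not say how the gateway enforces its 8,000 RPS cap. One common approach is a token bucket, sketched below under that assumption; the `TokenBucket` class, its burst size, and the explicit clock argument are illustrative choices, not details from the template.

```python
class TokenBucket:
    """Token-bucket limiter: refills at `rate_per_sec`, holds at most `burst` tokens."""

    def __init__(self, rate_per_sec: float, burst: float) -> None:
        self.rate = rate_per_sec
        self.capacity = burst
        self.tokens = burst  # start full
        self.last_refill = 0.0  # assumes the caller's clock starts at 0

    def allow(self, now: float) -> bool:
        """Return True if a request arriving at time `now` (seconds) is admitted."""
        elapsed = max(0.0, now - self.last_refill)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last_refill = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


if __name__ == "__main__":
    limiter = TokenBucket(rate_per_sec=8_000, burst=8_000)
    admitted = sum(limiter.allow(now=0.0) for _ in range(10_000))
    print(admitted)  # requests admitted out of a 10,000-request burst at t=0
```

A burst equal to one second of traffic is an arbitrary choice here; a smaller burst smooths spikes more aggressively at the cost of rejecting short, legitimate bursts.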
PasteService runs on 4 pods with 100 threads each, providing 400 concurrent request capacity. The write path is straightforward: generate a UUID-based paste ID, persist to PostgreSQL (approximately 50ms INSERT), write to Redis cache (approximately 2ms SET), and return the paste URL. The key optimization is the write-through to Redis — the paste is available in cache before the first reader arrives, preventing a thundering-herd cache miss on newly created popular pastes.
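As a rough sanity check on the 400-thread capacity, Little's law (requests in flight = arrival rate × time in system) can be applied to the peak traffic figures. The sketch below assumes the end-to-end latencies from this document's operations table (about 12ms for a cache hit, 32ms for a miss, 67ms for a write), the ~500 inserts/sec peak write rate quoted for PostgreSQL, and, pessimistically, that a thread is held for the whole end-to-end duration.

```python
# Traffic and latency figures quoted elsewhere in this document.
PEAK_READS_PER_SEC = 4_500
PEAK_WRITES_PER_SEC = 500
HIT_RATE = 0.90
LATENCY_MS = {"hit": 12.0, "miss": 32.0, "write": 67.0}
TOTAL_THREADS = 4 * 100  # 4 pods x 100 threads


def required_concurrency() -> float:
    """Average number of requests in flight at peak, by Little's law."""
    rates = {
        "hit": PEAK_READS_PER_SEC * HIT_RATE,
        "miss": PEAK_READS_PER_SEC * (1.0 - HIT_RATE),
        "write": PEAK_WRITES_PER_SEC,
    }
    return sum(rates[op] * LATENCY_MS[op] / 1000.0 for op in rates)


if __name__ == "__main__":
    in_flight = required_concurrency()
    print(round(in_flight, 1), "in flight vs", TOTAL_THREADS, "threads")
```

Under these assumptions roughly 97 requests are in flight at peak, about a quarter of the 400 available threads, which leaves room for latency spikes on the database path.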
The read path implements the cache-aside pattern. PasteService first checks Redis using the paste_id as the cache key. On hit (90% of reads), the paste content and metadata are returned from Redis in approximately 2ms. On miss (10% of reads), PasteService queries PostgreSQL by paste_id (approximately 12ms indexed read), checks TTL expiry, validates password if protected, writes the result back to Redis (backfill), and returns the content. The backfill ensures subsequent reads for the same paste hit the cache.
Redis runs on 2 nodes for high availability with 13GB memory each. The cache key pattern is paste:{id} mapping to paste content plus metadata. LRU eviction handles memory pressure when the working set exceeds cache capacity. Each cache entry has a Redis TTL matching the paste's configured expiry, so expired pastes auto-evict without explicit invalidation. The expected working set is approximately 4,500MB — comfortably within the 13GB per-node capacity.
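A cache entry under the paste:{id} key pattern might be encoded as shown below. This is a sketch under stated assumptions: JSON as the serialization format and the `language` metadata field are illustrative, since the text only says the value holds "paste content plus metadata".

```python
import json
import math
from typing import Optional, Tuple


def cache_key(paste_id: str) -> str:
    """Build the cache key using the paste:{id} pattern."""
    return f"paste:{paste_id}"


def encode_entry(
    content: str, language: str, expires_at: float, now: float
) -> Optional[Tuple[str, int]]:
    """Return (JSON payload, TTL in whole seconds), or None if already expired."""
    remaining = expires_at - now
    if remaining <= 0:
        return None  # do not cache a paste that has already expired
    payload = json.dumps(
        {"content": content, "language": language, "expires_at": expires_at}
    )
    # Round up so the TTL is at least 1 second; the application-level
    # expires_at check covers the sub-second overshoot.
    return payload, math.ceil(remaining)
```

Storing `expires_at` inside the payload lets the service re-check expiry on a cache hit without a database round trip.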
PostgreSQL stores the pastes table with a B-tree index on paste_id. With 90% of reads absorbed by Redis, the database handles only approximately 450 reads/sec at peak — easily manageable for a single primary with 2 read replicas. The 2 replicas provide redundancy and additional read capacity for cache-miss scenarios. Background cleanup jobs purge expired rows nightly, as PostgreSQL has no native TTL mechanism.
Choice: Check Redis first on reads, fall through to PostgreSQL on miss, backfill cache
Rationale: The 10:1 read-to-write ratio means caching is essential. At 4,500 peak reads/sec, hitting PostgreSQL directly would require many read replicas. Redis cache-aside with 90% hit rate reduces DB reads to ~450/sec, which a single primary with 2 replicas handles easily. The Zipfian access pattern means recently created pastes (the most commonly viewed) are almost always in cache.
Choice: Write to both PostgreSQL and Redis on paste creation
Rationale: When a user creates a paste and shares the URL, recipients typically view it within minutes. Writing to Redis on creation ensures the paste is cached before the first read arrives. Without write-through, the first reader would experience a cache miss, and for popular pastes many concurrent readers would all miss simultaneously (thundering herd).
Choice: Store paste content inline in the pastes table, relying on TOAST for large pastes
Rationale: At 5KB average paste size, PostgreSQL handles content efficiently without the complexity of a separate object store. TOAST transparently compresses and stores pastes exceeding 8KB. This single-database approach is simpler than a two-system design (metadata DB + S3) for services where most pastes are small.
Choice: Redis TTL matches each paste's configured expiry
Rationale: Instead of explicit cache invalidation on paste expiry, Redis entries have a TTL matching the paste's expires_at timestamp. Expired pastes auto-evict from cache without any application logic. This eliminates the need for a cache invalidation service or pub/sub mechanism. The application-level expiry check in PasteService is a safety net for clock skew.
Choice: All reads served through Redis cache, no edge caching
Rationale: Unlike URL shorteners where a viral link gets millions of hits, most Pastebin pastes have a small audience (shared among a team or friends). The long-tail access pattern means CDN hit rate would be low, because most pastes are viewed a few times and forgotten. Redis cache-aside is more cost-effective for this access pattern. The Object Storage variant adds a CDN for the rare viral paste scenario.
Target RPS: ~5K sustained RPS
Latency (p99): <5ms cache hit (90%), 12ms cache miss (10%)
Storage: ~2 TB/year (PostgreSQL + Redis)
Availability: 99.9% (Redis HA, DB replicas)
| Operation | Time | Space | Notes |
|---|---|---|---|
| Create paste (POST /api/v1/pastes) | O(1) DB INSERT + O(1) cache SET | O(1) per paste (~5KB in DB + ~5KB in cache) | ~67ms total (50ms DB + 2ms cache + 15ms network overhead). Write-through ensures immediate cache availability. |
| Read paste — cache hit (GET /api/v1/pastes/{id}) | O(1) cache GET | O(1) per read | ~12ms total (2ms cache + 10ms network). 90% of reads follow this path. |
| Read paste — cache miss (GET /api/v1/pastes/{id}) | O(log N) B-tree seek + O(1) cache SET backfill | O(1) per read + O(1) cache entry created | ~32ms total (5ms cache miss + 12ms DB + 2ms cache backfill + 13ms network). 10% of reads follow this path. |
Durable store for all paste content with TTL-based expiry. Write-once on paste creation (~500 inserts/sec peak), read on cache miss (~450 reads/sec with 90% cache hit). B-tree index on paste_id for fast indexed lookups (~12ms). TOAST handles large pastes (>8KB) transparently.
Indexes: idx_pastes_pkey ON (paste_id), a B-tree index and the primary lookup path; idx_pastes_expires ON (expires_at), used by the background cleanup DELETE queries
Grows ~5GB/day before expiry cleanup. Background nightly job purges expired rows. 2 read replicas handle cache-miss reads.
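The table layout and nightly purge could look roughly like the sketch below. It uses Python's built-in `sqlite3` module purely as a runnable stand-in for PostgreSQL; the column types, the batch size, and the `purge_expired` helper are assumptions for illustration, and a real deployment would use PostgreSQL types such as `timestamptz`.

```python
import sqlite3

# SQLite stand-in for the PostgreSQL pastes table (column types are simplified).
SCHEMA = """
CREATE TABLE pastes (
    paste_id   TEXT PRIMARY KEY,   -- primary lookup path
    content    TEXT NOT NULL,
    expires_at REAL NOT NULL       -- Unix timestamp in seconds
);
CREATE INDEX idx_pastes_expires ON pastes (expires_at);
"""


def purge_expired(conn: sqlite3.Connection, now: float, batch_size: int = 1000) -> int:
    """Delete expired rows in batches so each transaction stays short."""
    total = 0
    while True:
        cur = conn.execute(
            "DELETE FROM pastes WHERE paste_id IN ("
            "  SELECT paste_id FROM pastes WHERE expires_at <= ? LIMIT ?)",
            (now, batch_size),
        )
        conn.commit()
        if cur.rowcount == 0:
            return total
        total += cur.rowcount


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO pastes VALUES ('a1', 'old paste', 50.0)")
    conn.execute("INSERT INTO pastes VALUES ('b2', 'live paste', 500.0)")
    conn.commit()
    print(purge_expired(conn, now=100.0))
```

Deleting in bounded batches is a design choice to avoid one long-running DELETE holding locks during the nightly run; the index on expires_at keeps each batch's row selection cheap.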
Cache-aside store for hot paste content and metadata. Written on paste creation (write-through) and on cache miss (backfill). LRU eviction with TTL matching each paste's configured expiry. Expected 90% hit rate with Zipfian access pattern.
Working set ~4,500MB across 2 nodes (13GB each). Redis TTL auto-evicts expired pastes. No explicit invalidation needed.
| Variant | Tier | Latency | Throughput | Cost | Complexity | Reliability |
|---|---|---|---|---|---|---|
| Naive (Single Service + SQL) | T1 | 12ms-200ms+ reads | ~500 sustained RPS | $200/month (single DB + service) | Low — 4 components, no cache | 99% (single DB, no failover) |
| RDBMS with Cache (Postgres + Redis) | T2 | <5ms cache hit, 12ms miss | ~5K sustained RPS | $600/month (DB + Redis + service) | Medium — cache-aside pattern | 99.9% (Redis HA, DB replicas) |
| Object Storage + NoSQL TTL | T3 | <10ms CDN hit, ~80ms full path | 10K+ sustained RPS | $1,200/month (S3 + DynamoDB + CDN + Redis) | High — 7 components, two-system storage | 99.95% (S3 durability, DynamoDB HA) |
This template is for educational and illustration purposes only. It may not represent the optimal production design for this problem. Real-world systems involve additional considerations (compliance, specific cloud provider constraints, organizational requirements) not captured here. Use this as a starting point for discussion, not as a production blueprint.
Pastebin access follows a Zipfian distribution: recently created pastes receive the vast majority of views. When someone creates a paste and shares the URL via Slack, email, or social media, recipients view it within minutes to hours. The write-through on creation ensures the paste is in Redis immediately. Older pastes that fall out of cache via LRU eviction are rarely accessed — they represent the long tail of pastes that were viewed once and forgotten.
Expired pastes are handled at three levels: (1) Redis TTL auto-evicts cache entries matching the paste's expiry timestamp, so most expired paste reads result in a cache miss. (2) PasteService checks the expires_at column on every read (both cache hit and DB fallback) as a safety net for clock skew between Redis and the application. (3) A background cleanup job runs nightly to DELETE expired rows from PostgreSQL, preventing storage bloat. If the cleanup job fails, expired pastes remain in PostgreSQL but are not served to users.
Upgrade to the Object Storage variant (S3 + DynamoDB) when: (1) paste sizes regularly exceed 8KB, causing PostgreSQL TOAST overhead and vacuum degradation; (2) traffic exceeds 5K RPS and CDN edge caching would significantly reduce origin load; (3) you need native TTL auto-deletion without relying on a background cleanup job; (4) storage costs for large pastes in PostgreSQL exceed S3 costs. For most Pastebin-style services with small-to-moderate paste sizes, the RDBMS+Cache approach is sufficient.
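Auto-increment IDs allow enumeration attacks: an attacker could scrape all pastes by incrementing the ID from 1 to N. Hash-based IDs (e.g., SHA-256 of content) collide when two users paste identical content. UUID-based IDs encoded in Base62 are non-guessable and URL-friendly. A full 128-bit UUID is collision-free for practical purposes but needs about 22 Base62 characters; truncating to an 8-character URL leaves 62^8 ≈ 218 trillion combinations, which makes collisions unlikely per paste but not impossible at this volume, so paste_id should carry a unique constraint with a retry on conflict. Random IDs also avoid hot-spot issues with sequential IDs in distributed databases.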
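A possible ID generator and a back-of-the-envelope collision estimate are sketched below. The alphabet ordering, the choice to keep the low-order 8 characters of the encoded UUID, and the helper names are assumptions; the birthday-bound approximation n(n-1)/(2N) gives the expected number of colliding pairs among n random IDs drawn from a space of size N.

```python
import uuid

ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"


def base62_encode(value: int) -> str:
    """Encode a non-negative integer using the 62-character alphabet."""
    if value == 0:
        return ALPHABET[0]
    digits = []
    while value > 0:
        value, rem = divmod(value, 62)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


def new_paste_id(length: int = 8) -> str:
    """Random UUID4 encoded in Base62, keeping the last `length` characters."""
    encoded = base62_encode(uuid.uuid4().int)
    return encoded.rjust(length, ALPHABET[0])[-length:]


def expected_collisions(num_ids: int, id_space: int) -> float:
    """Birthday-bound estimate of colliding pairs among random IDs."""
    return num_ids * (num_ids - 1) / (2 * id_space)


if __name__ == "__main__":
    print(new_paste_id())
    # One year of pastes at 1 million per day, 8-character Base62 IDs.
    print(round(expected_collisions(365_000_000, 62 ** 8)))
```

At 1 million pastes per day, a year of retained 8-character IDs gives an expected collision count in the low hundreds, which is why a unique constraint on paste_id plus a retry on conflict is a sensible companion to random short IDs.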
With Redis unavailable, all reads fall through to PostgreSQL. At 4,500 peak reads/sec hitting the database directly, the system experiences the same bottleneck as the naive approach. The 2 read replicas help absorb some load, but latency spikes and the connection pool saturates quickly. Redis HA (2 nodes with automatic failover) mitigates this — a single node failure triggers failover in approximately 30 seconds, during which the surviving node handles traffic at reduced capacity.
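The fall-through behaviour during a Redis outage can be sketched as treating a cache error like a cache miss. `CacheUnavailable` and the two callables are hypothetical names for illustration; a real client library would raise its own connection error type.

```python
from typing import Callable, Optional


class CacheUnavailable(Exception):
    """Hypothetical error raised by the cache client when Redis is unreachable."""


def read_with_fallback(
    paste_id: str,
    cache_get: Callable[[str], Optional[str]],
    db_get: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Serve from cache when possible; on a miss or cache outage, read the database."""
    try:
        cached = cache_get(paste_id)
    except CacheUnavailable:
        cached = None  # treat an outage like a miss so reads keep working
    if cached is not None:
        return cached
    return db_get(paste_id)
```

This keeps reads available during failover, but it also means every read reaches PostgreSQL for the duration of the outage, which is the load spike described above.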