The simplest Pastebin architecture: one service writing and reading every paste synchronously to a single PostgreSQL database. No cache, no CDN. Demonstrates why every-read-hits-the-DB fails at scale with a 10:1 read-to-write ratio.
Designing a Pastebin-like service is one of the most approachable system design interview questions, yet it covers fundamental concepts that apply to far more complex systems: unique ID generation, blob storage, TTL-based expiry, read-heavy traffic patterns, and the cache-versus-no-cache trade-off. The naive approach is the essential starting point because it establishes the baseline that makes the improvements in cached and object-storage architectures measurable and concrete.
The core challenge is deceptively simple: a user pastes text, the system generates a unique URL, and anyone with that URL can view the paste. Under the hood, the system must handle 1 million new pastes per day (approximately 12 writes per second average, 500 peak) and 10 million views per day (approximately 116 reads per second average, 4,500 peak). The 10:1 read-to-write ratio is the defining characteristic of this workload. Most pastes are small — 5KB average — but the system must support up to 10MB pastes for large code dumps or log files. Read latency must stay under 200ms at p99.
In the naive approach, every operation hits a single PostgreSQL database. Paste creation generates a UUID-based ID, inserts the content into a `pastes` table, and returns the URL. Paste viewing queries the same table by primary key — a B-tree indexed lookup completing in approximately 12ms under light load. There is no cache, no CDN, no background workers, and no separation between the read and write paths. The architecture has exactly four components: Client, Load Balancer, PasteService, and PasteDatabase.
The fatal flaw of this approach is the database becoming both the read and write bottleneck simultaneously. At 4,500 peak reads per second, each requiring a 12ms database query, the connection pool (100 connections) approaches saturation. Large pastes stored via PostgreSQL TOAST add vacuum overhead that further degrades performance. Without a cache to absorb the 90% of reads that are for recently created or popular pastes, every single view pays the full database round-trip cost.
This template makes the database bottleneck visible and quantifiable. Run the simulation at increasing RPS and watch PostgreSQL utilization climb past 90% while the PasteService sits idle at under 10% CPU. The comparison with the RDBMS+Cache variant demonstrates how a single Redis cache-aside layer — achieving a 90% hit rate — reduces database reads from 4,500/sec to just 450/sec, transforming an overloaded system into a comfortable one. Pastebin design appears in interviews at companies like GitHub, GitLab, Hastebin, and virtually any company that needs to reason about text storage, URL generation, and read-heavy access patterns.
The naive Pastebin system is a four-component linear architecture: Client, Load Balancer, PasteService, and PostgreSQL database. There is no cache, no CDN, no object storage, and no separation between the read and write paths. Every request follows the same path through all four components.
All traffic enters through the Load Balancer, which distributes requests across PasteService pods using round-robin. The Load Balancer adds approximately 1.5ms of routing latency and supports up to 5,000 RPS — well above the system's actual ceiling, which is determined entirely by the database. Both paste reads and paste creates flow through the same LB and service — there is no CQRS split, no API Gateway, and no authentication layer.
The PasteService is a stateless REST API running on 2 pods with 50 threads each, providing 100 concurrent request capacity. It handles two operations: (1) paste creation — validate content size (max 10MB), generate a UUID-based paste ID encoded in Base62 for URL friendliness, INSERT into the pastes table, return the paste URL; (2) paste read — SELECT from the pastes table by paste_id, check if the paste has expired (compare expires_at to current time), validate password if protected, return the content. Service processing time is approximately 8ms, but total response time is dominated by database latency.
PostgreSQL stores a single table: pastes. It contains both metadata (paste_id, syntax hint, expires_at, password_hash, created_at) and content (the paste text itself) in the same row. The paste_id column has a B-tree index enabling approximately 12ms indexed reads under light load. Large pastes exceeding 8KB are automatically handled by PostgreSQL TOAST (The Oversized Attribute Storage Technique), which compresses and stores them out-of-line. TOAST is transparent to the application but adds vacuuming overhead that degrades performance over time.
The system has no redundancy at the data layer. A single PostgreSQL primary handles all reads and writes with no read replicas. If the database fails, both reads and writes stop entirely. There is no failover, no replication, and no backup read path. The concrete scaling ceiling is approximately 500 sustained RPS — at this point, the 100-connection pool approaches saturation with 12ms read latency and 50ms write latency competing for the same connections. Beyond 500 RPS, queue depth increases, latency spikes, and the system degrades rapidly.
Choice
Every paste read hits PostgreSQL directly
Rationale
A Redis cache-aside layer with 90% hit rate would reduce DB reads from 4,500/sec to 450/sec — a 10x reduction. The naive approach skips caching for simplicity. This is the core insight: with a 10:1 read-to-write ratio, caching is not optional at moderate scale. The RDBMS+Cache variant adds this cache and immediately handles 5x more traffic.
Choice
One database for both content and metadata, no replicas
Rationale
A single PostgreSQL instance eliminates replication lag, split-brain, and failover complexity. ACID transactions ensure every paste creation is durable before returning a URL to the user. The cost is that all reads and writes compete for the same connection pool (100 connections), CPU, and disk I/O.
Choice
Paste text content stored alongside metadata in the pastes table
Rationale
At 5KB average paste size, storing content inline is simpler than a two-system approach (metadata DB + S3). PostgreSQL TOAST handles larger pastes transparently. The trade-off is that large pastes (>8KB) fragment across TOAST pages, slowing vacuum and sequential scans. The Object Storage variant separates content into S3 to avoid this.
Choice
All reads served from origin through the Load Balancer
Rationale
Most Pastebin pastes have a small audience (shared among a team or friends). The long-tail access pattern means CDN hit rate would be low for the average paste. However, viral pastes (error logs shared on social media) would benefit from edge caching — the Object Storage variant adds CloudFront for this scenario.
Choice
Base62-encoded UUIDs for URL-friendly, non-guessable paste identifiers
Rationale
Sequential IDs allow enumeration attacks — an attacker could scrape all pastes by incrementing the ID. UUIDs provide security by obscurity for unlisted pastes without requiring authentication for viewing. Base62 encoding keeps URLs short (e.g., 8 characters for 62^8 = 218 trillion unique IDs).
Target RPS
~500 sustained (DB ceiling)
Latency (p99)
12ms-200ms+ reads (varies with load)
Storage
~5 GB/day at 1M pastes/day
Availability
~99% (single DB, no redundancy)
| Operation | Time | Space | Notes |
|---|---|---|---|
| Create paste (POST /api/v1/pastes) | O(1) INSERT + O(log N) B-tree index update | O(1) per paste (~5KB avg, up to 10MB) | N = total pastes. Index update is logarithmic but fast. Write latency dominated by disk fsync (~50ms). |
| Read paste (GET /api/v1/pastes/{id}) | O(log N) B-tree index seek | O(1) per read (single row fetch) | ~12ms under light load. Degrades to 50ms+ as connection pool saturates. TOAST adds latency for pastes >8KB. |
Sole storage for all paste content and metadata. Every read and write hits this table — no cache buffer. B-tree index on paste_id for fast indexed lookups (~12ms). TOAST handles large pastes (>8KB) transparently but adds vacuuming overhead. Grows ~5GB/day before expiry cleanup.
Indexes: idx_pastes_pkey ON (paste_id) — B-tree, primary lookup path, idx_pastes_expires ON (expires_at) — for background cleanup queries
At 1M pastes/day x 5KB avg, table grows ~5GB/day. Without cleanup, this is ~1.8TB/year. TOAST pages for large pastes add ~20% overhead to vacuum operations.
| Variant | Tier | Latency | Throughput | Cost | Complexity | Reliability |
|---|---|---|---|---|---|---|
| Naive (Single Service + SQL) | T1 | 12ms-200ms+ reads | ~500 sustained RPS | $200/month (single DB + service) | Low — 4 components, no cache | 99% (single DB, no failover) |
| RDBMS with Cache (Postgres + Redis) | T2 | <5ms cache hit, 12ms miss | ~5K sustained RPS | $600/month (DB + Redis + service) | Medium — cache-aside pattern | 99.9% (Redis HA, DB replicas) |
| Object Storage + NoSQL TTL | T3 | <10ms CDN hit, ~80ms full path | 10K+ sustained RPS | $1,200/month (S3 + DynamoDB + CDN + Redis) | High — 7 components, two-system storage | 99.95% (S3 durability, DynamoDB HA) |
This template is for educational and illustration purposes only. It may not represent the optimal production design for this problem. Real-world systems involve additional considerations (compliance, specific cloud provider constraints, organizational requirements) not captured here. Use this as a starting point for discussion, not as a production blueprint.
At 500 RPS with a 10:1 read-to-write ratio, the database handles ~450 reads/sec (each taking ~12ms) and ~50 writes/sec (each taking ~50ms). With 100 database connections, the connection pool occupancy is approximately (450 x 0.012) + (50 x 0.050) = 7.9 connections — seemingly fine. But at peak traffic (4,500 reads/sec), occupancy jumps to (4,050 x 0.012) + (450 x 0.050) = 71 connections, and latency variance pushes many queries past 50ms, saturating the pool. The system degrades non-linearly as queue depth builds.
Add a Redis cache-aside layer. With a 10:1 read-to-write ratio and Zipfian access pattern (recently created pastes are viewed most), a cache achieves approximately 90% hit rate. This reduces DB reads from 4,500/sec to ~450/sec — the single biggest improvement possible. The RDBMS+Cache variant implements exactly this, and it is the standard production approach for Pastebin-scale systems.
PostgreSQL TOAST transparently compresses and stores data exceeding 8KB in out-of-line TOAST pages. For a 10MB paste, this means the data is split across ~1,250 TOAST pages (8KB each). Reading such a paste requires fetching all pages — slower than a single S3 GET. More critically, TOAST pages create vacuum overhead: when expired pastes are deleted, the TOAST pages must be vacuumed separately, consuming CPU and I/O during maintenance windows.
Read replicas help with read scaling but introduce replication lag (typically 5-50ms for PostgreSQL streaming replication). For a Pastebin where users expect to see their paste immediately after creation, even small replication lag causes confusion — the user creates a paste, shares the URL, and the recipient gets a 404 because the replica has not caught up. A cache-aside pattern avoids this by writing to the cache on creation, ensuring immediate read-after-write consistency.
In the naive approach, expired pastes are checked at read time — PasteService compares expires_at to the current timestamp and returns 404 if expired. The expired row remains in PostgreSQL consuming storage. Without a periodic cleanup job, the table grows indefinitely at ~5GB/day. This is a deliberate trade-off for simplicity. The Object Storage variant solves this with DynamoDB native TTL, which auto-deletes expired items without application intervention.
Sign in to join the discussion.
Ready to design your own Pastebin?
Open the simulator, place components on the canvas, wire them up, and run a traffic simulation to see how your architecture performs under real load.
Open Simulator