Full production URL shortener with CDN-cached redirects (95% edge hit rate), GeoDNS routing, Redis cache cluster, ZooKeeper range allocation, sharded PostgreSQL, and async Kafka analytics. Handles 100K+ RPS.
The production multi-region URL shortener is the architecture that FAANG-level interviews expect candidates to arrive at through iterative refinement. Starting from the naive approach and working through the counter-based variant, each bottleneck drives the next architectural decision. The production variant addresses every remaining scaling limitation: geographic latency (CDN), counter contention (ZooKeeper range allocation), database write scaling (sharding), and analytics without impacting the redirect path (async Kafka pipeline).
The single most impactful optimization in this architecture is the CDN. URL redirect responses are perfectly cacheable — the mapping from short code to long URL is immutable. A CDN with 24-hour TTL deflects 95% of redirect reads at the edge with ~2ms latency, reducing origin traffic by 20x. At 100K total redirect RPS, only ~5K requests reach the origin infrastructure. This means the entire service tier, cache cluster, and database only need to handle 5K RPS — well within the capacity of the counter-based (v1) architecture. The CDN effectively decouples your scale ceiling from your infrastructure complexity.
The second key innovation is ZooKeeper-based counter range allocation. In the v1 variant, every URL creation requires a Redis INCR — a single-point coordination at 2K writes/sec. ZooKeeper range allocation gives each service pod a batch of 1,000 sequential IDs. The pod allocates from its local range without any network calls until the range is exhausted, then requests a new range from ZooKeeper. This reduces coordination traffic by 1,000x: instead of 2K INCR calls/sec, ZooKeeper handles only ~2 range requests/sec across 12 pods.
Database sharding becomes necessary at 10K writes/sec. A single PostgreSQL instance saturates at ~5K durable writes/sec. Consistent hash sharding by short_code distributes writes across 4 primary shards, each handling ~2.5K writes/sec. Read replicas absorb cache-miss reads without competing with the write workload. The sharding is straightforward because URL shortener queries are exclusively point lookups by short_code — no cross-shard joins or range scans.
The async Kafka analytics pipeline tracks click events (redirect counts, referrer data, geographic distribution) without adding latency to the redirect path. Every redirect produces a Kafka event asynchronously; an analytics worker consumes events in batches and writes aggregated click statistics to a separate analytics database. If the analytics pipeline falls behind or fails entirely, redirects continue unaffected.
This template demonstrates the production system design interview in full: CDN strategy, distributed counter allocation, database sharding, async event pipelines, circuit breakers, and multi-region failover. The comparison with simpler variants quantifies the trade-offs: 11 components vs. 3, $3,000/month vs. $180/month, but 100K+ RPS vs. 500 RPS and 99.99% vs. 99% availability.
The production URL shortener uses 11 components organized into four tiers: edge layer (CDN, GeoDNS, API Gateway), application layer (API service cluster), data layer (Redis cache cluster, ZooKeeper, sharded PostgreSQL, read replicas), and async analytics (Kafka, analytics worker, analytics database).
All traffic enters through the CDN (CloudFront), which caches HTTP 301 redirect responses with a 24-hour TTL. At 95% cache hit rate, most redirects are served at the nearest edge location in ~2ms — the user never reaches the origin infrastructure. Only CDN cache misses (new URLs, cold URLs, expired TTL) flow to the GeoDNS load balancer. GeoDNS (Route 53 latency-based routing) directs traffic to the nearest regional API Gateway, providing geographic load distribution and health-check-based failover.
The API Gateway handles JWT authentication (~3ms), rate limiting (50K RPS cap on origin traffic), and request validation. It routes all /api/v1/* traffic to the API service cluster — 12 ECS Fargate pods (4 vCPU, 8 GB each) with 100 threads per pod, providing 240K sustained RPS capacity. At ~10K effective origin RPS (5K reads + 5K writes), the service tier operates at under 5% utilization.
For redirect reads (CDN miss path), the API service checks the Redis cache cluster first (3-node ElastiCache, 78 GB total). At ~95% cache hit rate of origin traffic, most CDN-miss redirects are still served from Redis in ~2ms. On cache miss, the service reads from a PostgreSQL read replica (~10ms indexed read) and populates the cache for future requests.
For URL creation, each service pod allocates counter ranges from ZooKeeper (3-node ensemble). A range of 1,000 sequential IDs allows the pod to generate 1,000 short codes locally without any network coordination. When the range is exhausted (every ~100 seconds at 10 creates/sec per pod), the pod requests a new range from ZooKeeper (~8ms). Each ID is Base62-encoded into a 7-character short code. The mapping is written first to the sharded PostgreSQL primary (4 shards by consistent hash of short_code) and then, once the insert has succeeded, to the Redis cache cluster.
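The Base62 step can be sketched as follows. This is a minimal illustration, not the template's actual encoder: the alphabet ordering and the zero-padding to 7 characters are assumptions, and `base62_decode` is added only to show the mapping is reversible.

```python
import string

# Assumed alphabet order (digits, lowercase, uppercase); any fixed ordering of 62 symbols works.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def base62_encode(n: int, width: int = 7) -> str:
    """Encode a non-negative counter as a fixed-width Base62 string."""
    if n < 0:
        raise ValueError("counter must be non-negative")
    if n >= 62 ** width:
        raise ValueError("counter does not fit in the requested width")
    digits = []
    while n:
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])
    # Left-pad with the zero symbol so every short code has the same length.
    return "".join(reversed(digits)).rjust(width, ALPHABET[0])


def base62_decode(code: str) -> int:
    """Inverse of base62_encode."""
    n = 0
    for ch in code:
        n = n * 62 + ALPHABET.index(ch)
    return n
```

Seven Base62 characters cover 62^7 = 3,521,614,606,208 distinct IDs, so the counter space is far larger than the write volumes discussed here.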
The async analytics pipeline operates independently of the redirect path. After every redirect, the API service asynchronously produces a click event to a Kafka topic (MSK, partitioned by short_code). An analytics worker (4 ECS pods) consumes events in batches using 1-minute tumbling windows and writes aggregated click counts to a dedicated analytics database. This separation ensures that click tracking adds zero latency to the redirect path — even if the analytics pipeline fails, redirects continue unaffected.
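A minimal sketch of the 1-minute tumbling-window aggregation the worker performs, assuming events arrive as `(short_code, epoch_seconds)` pairs. The function name and event shape are illustrative; a real worker would read batches from Kafka and flush each closed window to the analytics database.

```python
from collections import Counter, defaultdict

WINDOW_SECONDS = 60  # 1-minute tumbling windows, as described above


def aggregate_clicks(events):
    """Group click events into non-overlapping 60-second windows.

    events: iterable of (short_code, epoch_seconds) pairs (assumed shape).
    Returns {window_start_epoch: Counter({short_code: clicks})}.
    """
    windows = defaultdict(Counter)
    for short_code, ts in events:
        # Each timestamp belongs to exactly one window: floor to the minute.
        window_start = int(ts) - int(ts) % WINDOW_SECONDS
        windows[window_start][short_code] += 1
    return dict(windows)
```

Because each event maps to exactly one window, the worker writes one aggregated row per short code per window instead of one row per click.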
Circuit breakers, timeouts, and retries protect all downstream connections. Service-to-database edges have 500ms timeouts, 1 retry, and circuit breaker enabled. If a database shard becomes unresponsive, the circuit breaker opens within seconds, and the service returns a graceful error instead of hanging. The ZooKeeper connection has a 300ms timeout — if range allocation fails, the service can continue allocating from any remaining IDs in its current range.
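The circuit breaker behavior can be sketched as a small state machine. This is a simplified illustration under assumptions: the default of 3 failures within 10 seconds follows the failure pattern cited later in this template, while the 5-second cooldown, the class name, and the injectable `clock` parameter are hypothetical choices made for the example.

```python
import time
from collections import deque


class CircuitBreaker:
    """Opens after max_failures within window_s; allows trial calls after cooldown_s."""

    def __init__(self, max_failures=3, window_s=10.0, cooldown_s=5.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.window_s = window_s
        self.cooldown_s = cooldown_s  # assumed value, not from the template
        self.clock = clock
        self.failures = deque()       # timestamps of recent failures
        self.opened_at = None         # None means the circuit is closed

    def allow(self):
        """Return True if a call to the dependency may proceed."""
        if self.opened_at is None:
            return True
        # After the cooldown, let trial calls through (half-open).
        return self.clock() - self.opened_at >= self.cooldown_s

    def record_success(self):
        self.failures.clear()
        self.opened_at = None

    def record_failure(self):
        now = self.clock()
        self.failures.append(now)
        # Forget failures that fell out of the sliding window.
        while self.failures and now - self.failures[0] > self.window_s:
            self.failures.popleft()
        # A failed trial call re-opens immediately; otherwise open on threshold.
        if self.opened_at is not None or len(self.failures) >= self.max_failures:
            self.opened_at = now
```

A caller would check `allow()` before each database or ZooKeeper request, return a fast error when it is False, and report the outcome with `record_success()` or `record_failure()`.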
The production architecture serves redirect reads from three layers: the CDN at the edge (95% hit rate), Redis at the origin (95% of the remainder), and PostgreSQL read replicas as the final fallback. Only the first two are caches; the replicas hold the durable data. With both caches in front, only 5% × 5% = 0.25% of total redirect traffic actually reaches the database. For URL creation, ZooKeeper range allocation eliminates per-write coordination, and sharded PostgreSQL distributes writes across 4 instances. The async Kafka pipeline tracks click analytics without adding latency to the redirect path.
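The layered hit rates translate into per-tier load as in this short calculation. It is a back-of-the-envelope sketch using the hit rates stated above; the function name and the rounding to whole requests are illustrative.

```python
def tier_traffic(total_rps, cdn_hit=0.95, redis_hit=0.95):
    """Split redirect traffic across CDN, Redis, and database tiers."""
    origin = total_rps * (1 - cdn_hit)   # requests that miss the CDN
    db = origin * (1 - redis_hit)        # requests that also miss Redis
    return {
        "cdn": round(total_rps - origin),
        "redis": round(origin - db),
        "db": round(db),
    }
```

For 100,000 redirect RPS this gives 95,000 at the CDN, 4,750 at Redis, and 250 at the read replicas, which matches the 0.25% database share.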
Step-by-Step Walkthrough
Pseudocode
// REDIRECT: three-level read path
// Level 1: CDN (95% of all reads, ~2ms)
// Level 2: Redis (95% of CDN misses, ~2ms at origin)
// Level 3: Read Replica (DB fallback, ~10ms)
async function redirect(short_code):
    // CDN handles Level 1 transparently (not in app code)

    // Level 2: Redis cache
    cached = await redis_cluster.get("url:" + short_code) // ~2ms
    if cached:
        kafka.produce_async("url_clicks", short_code, click_event) // Fire-and-forget
        return 301, Location: cached

    // Level 3: Read Replica (cache miss)
    // Ring lookup returns the owning shard; a plain hash % NUM_SHARDS would
    // remap most keys whenever a shard is added.
    shard = consistent_hash(short_code)
    row = await read_replica[shard].execute(
        "SELECT original_url FROM urls WHERE short_code = $1",
        [short_code]
    ) // ~10ms
    if not row: return 404

    // Populate the cache (24-hour TTL) so the next read is served from Redis
    await redis_cluster.setex("url:" + short_code, 86400, row.original_url)
    kafka.produce_async("url_clicks", short_code, click_event)
    return 301, Location: row.original_url

// URL CREATION: range-based counter + sharded write
local_range = null // { start, end, current }, end inclusive

async function createShortUrl(long_url):
    // Fetch a fresh range only when the local one is missing or used up
    if local_range is null or local_range.current > local_range.end:
        local_range = await zookeeper.allocate_range(1000) // ~8ms, every 1000 URLs
    counter = local_range.current++ // Local, no network call
    short_code = base62_encode(counter)

    // Ring lookup returns the owning shard (same function as the read path)
    shard = consistent_hash(short_code)
    await shard_db[shard].execute(
        "INSERT INTO urls (short_code, original_url, shard_key, created_at)
         VALUES ($1, $2, $3, now())",
        [short_code, long_url, shard]
    ) // ~50ms

    // Write through to Redis after the durable insert succeeds
    await redis_cluster.setex("url:" + short_code, 86400, long_url) // ~2ms
    return BASE_URL + "/" + short_code

The production variant uses five storage systems, each optimized for a specific access pattern. The CDN stores cached HTTP responses at edge locations. Redis stores URL mappings for fast origin reads. ZooKeeper manages counter range allocation. Sharded PostgreSQL provides durable write storage. The analytics DB stores aggregated click statistics. This separation ensures that no single storage system is a bottleneck and failures in one tier do not cascade to others.
Choice: CloudFront with 24-hour TTL on 301 responses
Rationale: URL redirect mappings are immutable: the short code always resolves to the same long URL. CDN caching deflects 95% of redirect reads at the edge with ~2ms latency, reducing origin traffic by 20x. This is the single highest-impact optimization because it decouples your scale ceiling from your origin infrastructure capacity.
Choice: Batch ID ranges (1,000 IDs per allocation) instead of per-write INCR
Rationale: Redis INCR is a per-write network call, creating contention at 10K+ writes/sec. ZooKeeper range allocation amortizes the coordination cost: one range request per 1,000 URL creations. Each pod allocates IDs locally without network calls. This reduces coordination traffic by 1,000x while preserving globally unique, sequential IDs.
Choice: 4 PostgreSQL shards by consistent hash of short_code
Rationale: A single PostgreSQL instance saturates at ~5K durable writes/sec. Sharding distributes writes across 4 instances, each handling ~2.5K writes/sec. URL shortener queries are exclusively point lookups by short_code, which makes consistent hash sharding trivial: no cross-shard joins or range queries are needed.
Choice: Dedicated replicas for cache-miss reads
Rationale: Read replicas absorb the ~250 cache-miss reads/sec without competing with the write workload on primaries. During viral URL traffic spikes that overwhelm the cache, read replicas prevent write latency degradation on the primary shards.
Choice: Fire-and-forget click events to Kafka, batch processing by worker
Rationale: Click analytics adds valuable data (per-URL click counts, referrer tracking, geographic distribution) but is not latency-critical. A synchronous DB write on the redirect path would double latency. Kafka absorbs events in ~5ms asynchronously, and the analytics worker processes them in batches without impacting redirects.
Choice: Timeout + retry + circuit breaker on DB and ZK connections
Rationale: At production scale, partial failures are inevitable. A database shard going slow should not cascade to the entire service. Circuit breakers detect failure patterns (3 failures in 10 seconds) and open the circuit, failing fast instead of queuing requests behind a slow dependency.
Target RPS: 100K+ total (5K origin after CDN)
Latency (p99): ~2ms CDN hit, ~25ms cache hit, ~120ms creation
Storage: ~500 GB/year across 4 shards
Availability: 99.99% (multi-AZ, circuit breakers, CDN)
| Operation | Time | Space | Notes |
|---|---|---|---|
| Redirect (CDN hit, 95%) | O(1) — CDN edge lookup | O(1) | ~2ms. No origin infrastructure involved. CDN serves from edge cache. |
| Redirect (cache hit, ~4.75%) | O(1) Redis GET | O(1) | ~25ms end-to-end (CDN miss + network + Gateway + LB + Service + Redis). |
| Redirect (cache miss, ~0.25%) | O(log n) PostgreSQL SELECT on read replica | O(1) | ~40ms end-to-end. Cache populated on miss for subsequent reads. |
| Create short URL | O(1) local counter alloc + O(log n) DB INSERT on shard | O(1) per URL | ~120ms. ZooKeeper range request amortized over 1,000 creations. |
Primary URL mapping table, sharded across 4 PostgreSQL instances by consistent hash of short_code. Each shard holds ~25% of all URL mappings. Writes go to the shard that owns the short_code's hash range. At 10K writes/sec total, each shard handles ~2.5K writes/sec — well within a single PostgreSQL instance's capacity.
Indexes: PK B-tree on short_code, idx_shard_key ON shard_key
Consistent hash sharding with virtual nodes. Adding a 5th shard remaps ~20% of keys (roughly 1/5, the new shard's share of the ring). No cross-shard queries are needed because all access is by short_code point lookup.
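A hash ring with virtual nodes can be sketched as below, to make the remapping behavior concrete. This is an illustrative implementation, not the template's: the class name, the shard labels, the choice of MD5 truncated to 32 bits, and 200 virtual nodes per shard are all assumptions.

```python
import bisect
import hashlib


def _hash32(key: str) -> int:
    """Map a string to [0, 2**32) using the first 4 bytes of its MD5 digest."""
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:4], "big")


class HashRing:
    """Consistent hash ring; each shard owns many points (virtual nodes)."""

    def __init__(self, shards, vnodes=200):
        self._points = sorted(
            (_hash32(f"{shard}#{i}"), shard)
            for shard in shards
            for i in range(vnodes)
        )
        self._hashes = [h for h, _ in self._points]

    def get_shard(self, key: str) -> str:
        # The first ring point clockwise from the key's hash owns the key.
        idx = bisect.bisect_right(self._hashes, _hash32(key)) % len(self._hashes)
        return self._points[idx][1]


def moved_fraction(keys, old_ring, new_ring):
    """Fraction of keys whose owning shard differs between two rings."""
    moved = sum(1 for k in keys if old_ring.get_shard(k) != new_ring.get_shard(k))
    return moved / len(keys)
```

Comparing a 4-shard ring with a 5-shard ring over a large key sample should show roughly one fifth of the keys moving, all of them to the new shard. With `hash % NUM_SHARDS` instead, most keys would change owner.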
Tracks allocated counter ranges for each service pod. Each range contains 1,000 sequential IDs. When a pod exhausts its range, it requests a new one from ZooKeeper. The ZK ensemble ensures ranges never overlap, providing globally unique sequential IDs without per-write coordination.
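The allocation scheme can be sketched with an in-memory stand-in for the ZooKeeper ensemble. Everything here is hypothetical scaffolding: `RangeCoordinator` replaces the real ZooKeeper call with a lock-protected counter, and `PodIdGenerator` models one service pod. A real deployment would perform the range claim as an atomic update in ZooKeeper.

```python
import threading


class RangeCoordinator:
    """In-memory stand-in for ZooKeeper: hands out non-overlapping ID ranges."""

    def __init__(self, start=0):
        self._next = start
        self._lock = threading.Lock()
        self.requests = 0  # how many coordination calls were made

    def allocate_range(self, size):
        with self._lock:
            start = self._next
            self._next += size
            self.requests += 1
        return start, start + size - 1  # inclusive bounds


class PodIdGenerator:
    """Models one pod: serves IDs locally, refills only when the range is spent."""

    def __init__(self, coordinator, range_size=1000):
        self._coord = coordinator
        self._size = range_size
        self._current = 0
        self._end = -1  # empty range forces a refill on first use
        self._lock = threading.Lock()

    def next_id(self):
        with self._lock:
            if self._current > self._end:
                self._current, self._end = self._coord.allocate_range(self._size)
            value = self._current
            self._current += 1
            return value
```

Two pods generating 2,000 IDs between them make only three coordination calls in this sketch, and no ID is ever issued twice. IDs left in a pod's range when it restarts are simply skipped, which leaves gaps but never duplicates.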
ZooKeeper coordination: one range request per 1,000 URL creations, so ~10 range requests/sec in total across the 12 pods at 10K creations/sec. That is 1,000x less coordination than a per-write Redis INCR.
Aggregated click analytics per URL per hour. Written by the analytics worker from Kafka click events using 1-minute tumbling windows (then rolled up to hourly). Queried by analytics dashboards. Separate from the URL database to prevent analytics writes from affecting redirect performance.
Indexes: PK on (short_code, hour_bucket), idx_hour DESC
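Write volume: at most 1 row per URL per hour in which it receives clicks (vs. 1 row per click in a naive approach). If 100K URLs each receive clicks during one hourly bucket per day, that is ~100K rows/day; the upper bound, with each of those URLs clicked in every hour, is 2.4M rows/day.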
3-node Redis cluster caching URL mappings. Key: url:{short_code}, Value: original_url. 24-hour TTL with LRU eviction. 95% hit rate for origin redirect traffic. Immutable mappings mean no cache invalidation needed. The cache cluster also stores recently created URLs for immediate read-after-write consistency.
78 GB total across 3 nodes. Working set: ~10 GB for 10M active URLs at ~1KB each. Hit rate: ~95% of origin traffic (after CDN).
Produced by the API service on every redirect. Contains the short code, referrer, user agent, hashed IP, and timestamp. Consumed by the analytics worker for click count aggregation. Fire-and-forget semantics — dropped events do not affect redirect availability.
Key Schema: short_code (string), partitioned by short_code for per-URL ordering
Value Schema: { "short_code": "string", "referrer": "string", "user_agent": "string", "ip_hash": "string", "timestamp": "ISO8601" }
Scenario: A URL goes viral: 500K redirect RPS to a single short code
Impact: CDN absorbs 95% (475K RPS) at edge. The remaining 25K RPS hit Redis cache cluster. Redis can handle 100K+ ops/sec, so no origin database impact. End users experience ~2ms latency regardless of traffic volume.
Mitigation: CDN is the primary defense. If CDN TTL expires during the spike, a brief burst of origin traffic hits Redis (not DB). The CDN automatically repopulates from the origin response.
Scenario: One of 4 database shards goes down
Impact: ~25% of URL creations fail (writes to that shard). ~25% of cache-miss reads fail. Circuit breaker opens within seconds. 75% of the system continues functioning normally. Redirects for cached URLs (99.75% of reads) are unaffected.
Mitigation: RDS Multi-AZ automatic failover (30-60 seconds). Circuit breaker prevents request queuing during failover. The CDN and cache continue serving the vast majority of reads.
Scenario: ZooKeeper ensemble loses quorum (2 of 3 nodes down)
Impact: New range allocations fail. Each pod continues using its current range, which gives ~100 seconds of runway per pod. After ranges are exhausted, URL creation fails. Redirects are completely unaffected.
Mitigation: ZooKeeper nodes in different AZs. Automatic recovery when a node rejoins. Alert on ZK health. Fallback to UUID-based IDs if ZK is unavailable for more than 60 seconds.
Scenario: Kafka analytics pipeline falls behind by 10 minutes
Impact: Click analytics dashboards show stale data (10-minute lag). Zero impact on redirect performance because the analytics pipeline is completely decoupled. No data loss as Kafka retains events for 7 days.
Mitigation: Scale up analytics worker pods. Monitor consumer lag. Kafka's retention ensures events are not lost during processing delays.
Scenario: CDN configuration error causes 0% cache hit rate
Impact: All 100K redirect RPS hit the origin. The service tier (240K RPS capacity) handles it, but Redis cache cluster and database see 20x normal load. p99 latency increases from 2ms to 25ms. Cost increases due to origin bandwidth.
Mitigation: Monitor CDN hit rate and alert if it drops below 80%. CDN configuration should be immutable infrastructure (Terraform/CDK) with rollback capability. Test CDN behavior in staging before production changes.
| Component | Failure | Impact | Mitigation |
|---|---|---|---|
| CDN (CloudFront) | Edge location outage | Traffic from affected regions bypasses CDN edge and goes directly to origin. Increased origin load. Higher latency for affected users. | CloudFront automatically reroutes to the next nearest edge location. Origin infrastructure is sized to handle CDN-miss traffic spikes. |
| Redis Cache Cluster | Node failure in 3-node cluster | Cluster redistributes slots to remaining 2 nodes (~30 seconds). During redistribution, some cache reads fail and fall through to DB replicas. Cache hit rate temporarily drops. | Redis Cluster automatic failover. Read replicas absorb increased DB load during redistribution. Alert on cluster health. |
| ZooKeeper | Ensemble loses quorum | Range allocation fails. Pods use remaining IDs in current range (~100 seconds runway). URL creation fails after ranges exhausted. Redirects unaffected. | ZK nodes across 3 AZs. Automatic rejoin on recovery. Fallback to UUID generation if ZK unavailable > 60 seconds. |
| Shard DB Primary | Single shard unresponsive | 25% of writes and cache-miss reads fail. Circuit breaker opens within 3 seconds. Other 3 shards continue normally. | RDS Multi-AZ failover (30-60s). Circuit breaker prevents cascading failure. CDN and cache serve 99.75% of reads during outage. |
| Kafka (Click Stream) | Broker failure | Click events fail to produce. Analytics data has gaps. Zero impact on redirect performance. | MSK 3-broker cluster with replication factor 3. Automatic leader election. Click events are best-effort — acceptable to lose a small percentage. |
Scaling is independent per tier. CDN: scales automatically (CloudFront is a managed service). Service tier: ECS auto-scaling based on CPU (target 60%) or ALB request count. Redis: vertical scaling (larger node type) or horizontal (add cluster nodes). Database: add read replicas for read scaling, add shards for write scaling (requires remapping ~1/N keys per new shard). ZooKeeper: generally static (3-5 nodes). Kafka: add partitions for throughput, add brokers for storage. The CDN is the first line of defense — increasing CDN hit rate from 95% to 98% reduces origin traffic by 60%. For extreme scale (1M+ RPS), add regional CDN origins and multi-region database replication.
Production monitoring spans 11 components across four tiers. CDN tier: cache hit ratio (target >90%, alert <85%), origin request rate, edge error rate, bandwidth costs. Service tier: pod CPU/memory utilization, thread pool saturation, request queue depth, p99 latency per endpoint. Cache tier: Redis cluster hit rate, memory utilization per node, eviction rate, replication lag. Database tier: per-shard connection count, query latency p99, replication lag to replicas, disk I/O utilization, WAL write throughput. ZooKeeper: session count, outstanding requests, latency, quorum health. Kafka: producer error rate, consumer lag, partition count, broker disk utilization. Analytics: worker processing rate, aggregation lag, DB write throughput. The critical path dashboard correlates CDN hit rate -> origin RPS -> cache hit rate -> DB utilization to identify cascading degradation. Alert on any metric that indicates the next tier is approaching capacity.
Monthly cost at 100K RPS total: CloudFront CDN (~$500 for 260B requests/month + bandwidth), Route 53 (~$50), API Gateway (~$175 for origin traffic), 12x ECS Fargate pods (4 vCPU, 8 GB, ~$1,440), ElastiCache 3-node cluster (~$600), ZooKeeper 3-node ensemble (~$300), 4x RDS PostgreSQL primary shards (~$960), 2x RDS read replicas (~$240), MSK Kafka 3-broker cluster (~$400), 4x analytics worker pods (~$240), analytics DB (~$120). Total: approximately $3,000-3,500/month. Cost per 1M requests: ~$0.04 — 15x cheaper per request than the naive variant at its capacity ceiling. The CDN is the most cost-effective component: $500/month to serve 95% of all traffic. Without CDN, you would need 20x more origin infrastructure (~$50K/month).
Multi-layer security: CDN provides DDoS protection via AWS Shield Standard (free) and WAF integration for geo-blocking and IP reputation filtering. API Gateway enforces JWT authentication, rate limiting per API key, and request validation. Internal services communicate over private subnets with security groups, with no public internet exposure. Database credentials are stored in AWS Secrets Manager with automatic rotation. Redis runs in private subnets with encryption in transit (TLS) and at rest. ZooKeeper runs in private subnets with SASL authentication. Kafka messages are encrypted in transit. URL validation at the service layer blocks known malicious domains and open redirector patterns. Base62 short codes derived from counter ranges are still sequential within each pod's range, so neighboring codes can be enumerated and the allocation pattern inferred. Applying an XOR mask to the counter before encoding adds obscurity, though it is not a substitute for access control on sensitive links.
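The XOR mask mentioned above can be sketched as follows. The mask value and the 41-bit ID width are hypothetical choices for illustration; 41 bits is used because 2^41 is below 62^7, so a masked ID still fits in a 7-character Base62 code.

```python
ID_BITS = 41                 # 2**41 < 62**7, so masked IDs still fit in 7 Base62 chars
SECRET_MASK = 0x1555A5AA5A5  # hypothetical secret constant, must be < 2**ID_BITS


def mask_id(counter: int) -> int:
    """Scramble a counter with a fixed XOR mask before Base62 encoding."""
    if not 0 <= counter < (1 << ID_BITS):
        raise ValueError("counter out of range for the configured ID width")
    return counter ^ SECRET_MASK


def unmask_id(masked: int) -> int:
    """XOR is its own inverse, so unmasking applies the same mask again."""
    return masked ^ SECRET_MASK
```

Because XOR is a bijection, uniqueness is preserved. It only hides the raw counter value: two consecutive counters still differ in the same low bits after masking, so this is obscurity rather than cryptographic protection. A keyed permutation would be the stronger option if enumeration is a real concern.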
Blue/green deployment using ECS service with two target groups behind the ALB. New version deploys to the green target group while blue continues serving traffic. After health checks pass, the ALB switches traffic to green. Rollback: switch back to blue target group (instant, no redeployment). Database migrations use a forward-compatible strategy: new code is deployed to read both old and new schema, then the migration runs, then the old-schema code path is removed in a subsequent deployment. CDN configuration changes propagate in 5-15 minutes; use CloudFront invalidation for urgent cache busting. ZooKeeper and Kafka are infrastructure components deployed separately from the application: Kafka runs on an AWS-managed MSK cluster, and the ZooKeeper ensemble used for range allocation runs on its own dedicated 3-node cluster.
| Variant | Tier | Latency | Throughput | Cost | Complexity | Reliability |
|---|---|---|---|---|---|---|
| Naive (Single Server) | T1 | ~50ms p99 | ~500 RPS | ~$180/mo | 3 components | ~99% (single pod) |
| Counter-Based (Base62) | T2 | <100ms p99 | ~20K RPS | ~$400/mo | 6 components | ~99.9% |
| Production Multi-Region | T3 | ~2ms CDN hit | 100K+ RPS | ~$3,000/mo | 11 components | ~99.99% |
| Serverless (Lambda + DynamoDB) | T4 | <30ms warm | 10K+ RPS (auto) | $0-800/mo | 4 components | ~99.99% |
This template is for educational and illustration purposes only. It may not represent the optimal production design for this problem. Real-world systems involve additional considerations (compliance, specific cloud provider constraints, organizational requirements) not captured here. Use this as a starting point for discussion, not as a production blueprint.
URL redirect responses are perfectly cacheable: the mapping is immutable and the response is a simple HTTP 301. At 95% CDN hit rate, 100K read RPS becomes 5K origin RPS. The CDN serves the other 95K RPS at edge locations in ~2ms with no origin infrastructure involved. This is a 20x reduction in the infrastructure you need to build and operate.
Each service pod contacts ZooKeeper to claim a range of 1,000 sequential IDs (e.g., 5000000-5000999). The pod then assigns IDs from this range locally without any network calls. When the range is exhausted (~100 seconds at 10 creates/sec per pod, proportionally sooner at higher write rates), the pod requests a new range. ZooKeeper ensures ranges never overlap. This reduces coordination from 10K network calls/sec (INCR) to ~10 calls/sec (one range request per 1,000 creations, spread across 12 pods).
If ZooKeeper goes down, pods cannot allocate new ranges. However, each pod can continue generating URLs from its current range until it is exhausted. With 1,000 IDs per range and 10 creates/sec per pod, each pod has ~100 seconds of runway before URL creation fails. ZooKeeper outages longer than 2 minutes require the ZK ensemble to be recovered.
The short_code is hashed (e.g., MD5 or MurmurHash) to produce a value in [0, 2^32). This value is mapped to one of 4 shards using consistent hashing (virtual nodes on a hash ring). Each URL is stored on exactly one shard. Reads and writes for the same short_code always go to the same shard. Adding a 5th shard requires remapping ~20% of keys (about 1/5, the share the new shard takes over).
If your total traffic is under 20K RPS, the Counter-Based (v1) variant handles it without CDN, sharding, ZooKeeper, or Kafka. The production variant costs ~$3,000/month vs. ~$400/month for v1. The complexity of 11 components requires a dedicated ops team. Use the production variant when you genuinely need 100K+ RPS, multi-region availability, or async analytics at scale.
The API service produces Kafka events asynchronously (fire-and-forget with acks=1). The Kafka produce call takes ~5ms but runs in a background thread — it does not block the redirect response. If Kafka is slow or down, the click event is dropped silently, and the redirect succeeds normally. Analytics accuracy can tolerate small gaps; redirect availability cannot.
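The fire-and-forget behavior can be sketched with a bounded queue and a background thread. This is a simplified model, not a Kafka client: the `send` callable stands in for the real producer call, and the class name, `max_pending` limit, and `dropped` counter are assumptions for illustration.

```python
import queue
import threading


class FireAndForgetProducer:
    """Queues events for a background sender; never blocks or raises to the caller."""

    def __init__(self, send, max_pending=10_000):
        self._send = send  # stand-in for the real Kafka produce call
        self._queue = queue.Queue(maxsize=max_pending)
        self.dropped = 0   # approximate count, updated without locking
        worker = threading.Thread(target=self._drain, daemon=True)
        worker.start()

    def produce_async(self, event):
        """Called on the redirect path: enqueue and return immediately."""
        try:
            self._queue.put_nowait(event)
        except queue.Full:
            self.dropped += 1  # backlog too large: drop rather than block

    def _drain(self):
        while True:
            event = self._queue.get()
            try:
                self._send(event)
            except Exception:
                self.dropped += 1  # broker slow or down: drop silently
            finally:
                self._queue.task_done()

    def flush(self):
        """Wait until queued events are processed (for tests or shutdown)."""
        self._queue.join()
```

The redirect handler only ever calls `produce_async`, so a failing or slow `send` affects the drop counter and nothing else. Monitoring that counter is what tells you how large the analytics gaps are.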