Easy6 componentsInterview: High

TinyURL — Counter-Based (Base62)

Q: Why does adding a cache reduce DB utilization so dramatically?

URL shorteners have a 100:1 read-to-write ratio. At 10K redirect RPS, 95% cache hit rate means Redis serves 9,500 reads/sec while PostgreSQL handles only 500 reads/sec. The database goes from handling 100% of reads (95% utilization) to handling 5% of reads (15% utilization). The cache absorbs the vast majority of the read workload.

Q: Is the sequential counter a security risk?

Sequential IDs leak information: observing short codes 'abc' and 'def' reveals how many URLs were created between them. For most URL shorteners this is acceptable (Bitly uses sequential IDs). For privacy-sensitive applications, apply a reversible bit-shuffle: XOR the counter value with a secret key before Base62 encoding. The codes appear random but are still deterministically generated from the counter.

Q: What happens if Redis goes down?

If Redis fails, two things break: (1) new URL creation fails because the counter is unavailable, and (2) redirect reads fall back to PostgreSQL for every request. The system degrades to naive-variant behavior until Redis recovers. Mitigation: Redis Cluster with automatic failover (2-3 second switchover), or a fallback to UUID-based IDs during Redis outage.

Q: Why Base62 instead of Base64?

Base62 uses only alphanumeric characters (0-9, a-z, A-Z). Base64 adds '+' and '/' which require URL encoding (becoming '%2B' and '%2F'), making short URLs longer and less readable. Base62 produces clean URLs that work in any context — emails, SMS, social media — without encoding issues.

Q: At what scale does this architecture need upgrading?

The counter-based approach handles up to ~20K RPS comfortably. Beyond that, two bottlenecks emerge: (1) the atomic counter becomes contention-hot at 10K+ writes/sec (needs range-based allocation as in v3), and (2) geographic latency becomes a factor for global users (needs CDN as in v3). Below 20K RPS, this is the right architecture — simple, well-understood, and cost-effective.

Industry-standard URL shortener using Redis cache-aside for redirect lookups (95% hit rate), an atomic counter for collision-free ID generation, and Base62 encoding. Handles 20K RPS with 6 pods.

StorageCachingCounter-Based ID

Try in Simulator

Problem Statement

The counter-based URL shortener is the standard production answer to the TinyURL interview question. It addresses every bottleneck identified in the naive variant: a Redis cache absorbs 95% of redirect reads, an atomic counter eliminates collision risk during ID generation, a load balancer distributes traffic across multiple service pods, and Base62 encoding produces compact 7-character short codes from sequential integers.

The core insight is the 100:1 read-to-write ratio. URL shorteners are massively read-heavy — for every URL created, it is redirected through 100 or more times. The naive variant forces every one of those reads through PostgreSQL. Adding a Redis cache changes the economics dramatically: at a 95% cache hit rate, only 5% of reads reach the database. At 10K redirect RPS, that means Redis handles 9,500 reads/sec (~2ms each) while PostgreSQL handles only 500 reads/sec (~10ms each). Database utilization drops from 95% to under 15%.

ID generation uses Redis INCR to maintain a global atomic counter. Each new URL gets the next integer (e.g., 1000001), which is Base62-encoded to produce a 7-character short code (e.g., "4c92"→"4c92" in Base62). This guarantees uniqueness without collision checks — every integer maps to a unique Base62 string, and INCR is atomic. The trade-off is that sequential IDs are guessable: observing two short codes reveals the total URL count between their creation times. For most use cases this is acceptable; for privacy-sensitive applications, a bit-shuffle or XOR mask can obscure the sequence.

This template introduces the API Gateway and Load Balancer layers. The API Gateway handles JWT authentication and rate limiting, protecting the service tier from abuse. The ALB distributes traffic across 6 UrlService pods using round-robin, providing horizontal scalability and redundancy — losing one pod reduces capacity by 17% instead of 100%.

The comparison with the naive variant is instructive. Same workload, dramatically different results: p99 redirect latency drops from 50ms (DB-bound) to 12ms (cache-bound), maximum throughput increases from 500 RPS to 20K+ RPS, and database utilization drops from 95% to 15%. The only additions are a cache, a counter, a load balancer, and a gateway — each solving a specific bottleneck identified in the naive approach.

Architecture Overview

The counter-based URL shortener adds three components to the naive architecture: an API Gateway for authentication and rate limiting, a Load Balancer for traffic distribution, and a Redis Cache that serves dual duty as both the read cache and the atomic counter for ID generation.

All traffic enters through the API Gateway (Amazon API Gateway REST), which validates JWT tokens (~3ms), enforces rate limits (30K RPS cap with headroom), and routes requests to the ALB. The Gateway adds security without burdening the service tier. The ALB distributes requests across 6 UrlService pods using round-robin — stateless routing since no session affinity is needed.

The UrlService (6 pods, 100 threads each) handles both URL creation and redirect logic. For URL creation (POST /api/v1/shorten), the service calls Redis INCR on a global counter key to get the next sequential integer, Base62-encodes it into a 7-character short code, writes the mapping to PostgreSQL (durable storage), and writes it to Redis (cache population). For redirect reads (GET /api/v1/redirect/{code}), the service checks Redis first — at a 95% hit rate, most redirects are served from cache in ~2ms. On cache miss (~5% of reads), the service falls back to PostgreSQL (~10ms indexed read) and writes the result back to Redis for future requests.

Redis serves dual purpose: (1) the INCR command maintains the global atomic counter for ID generation — O(1), sub-millisecond, and guaranteed unique; (2) the cache-aside pattern stores URL mappings (key: url:{short_code}, value: original_url) with 24-hour TTL and LRU eviction. URL mappings are immutable after creation, making cache invalidation trivial — there is none. The 95% hit rate emerges naturally from the Zipfian access pattern: a small fraction of URLs (viral links) receive the vast majority of redirect traffic, and these stay hot in cache.

PostgreSQL serves as the durable store with 2 read replicas. At 95% cache hit rate, only ~500 reads/sec reach the database (5% of 10K redirect RPS), plus ~2K writes/sec for URL creation. The database operates comfortably at 15-20% utilization — a dramatic improvement from the naive variant's 95%.

The architecture handles 20K+ RPS at peak with comfortable headroom at every layer. The service tier (6 pods x 100 threads = 600 concurrent slots at 5ms processing time) sustains ~120K RPS. The ALB is rated for 50K RPS. Redis handles 100K+ operations/sec. The bottleneck at extreme scale is the atomic counter — Redis INCR is a single-point contention at 2K writes/sec, well within Redis capacity but would require counter sharding at 100K+ writes/sec.

Architecture Preview

Loading architecture preview...

Open in Simulator

Request Flow — Cache-Aside with Counter-Based ID

The cache-aside pattern fundamentally changes the read path economics. Instead of every redirect hitting PostgreSQL (naive variant), 95% of reads are served from Redis in ~2ms. The counter-based ID generation eliminates collision risk entirely — Redis INCR returns the next unique integer atomically, and Base62 encoding produces a compact short code. The write path populates both the database (durable storage) and cache (fast reads) simultaneously.

Loading diagram...

Step-by-Step Walkthrough

1URL creation: Browser sends POST /shorten through API Gateway (JWT validation, rate limiting) and ALB (round-robin to one of 6 pods)
2UrlService calls Redis INCR on the global counter — returns the next unique integer in ~1ms
3UrlService Base62-encodes the integer to produce a 7-character short code
4UrlService INSERTs the mapping into PostgreSQL (durable storage, ~50ms)
5UrlService SETs the mapping in Redis (cache population, ~2ms) — subsequent reads will hit cache
6Redirect (cache hit, 95%): UrlService GETs the mapping from Redis — returned in ~2ms. Total request: ~12ms
7Redirect (cache miss, 5%): Redis returns MISS (~5ms). UrlService falls back to PostgreSQL SELECT (~10ms), then writes the result to Redis for future reads
8The 95% cache hit rate emerges from Zipfian access: viral/popular URLs stay hot in cache, while cold/old URLs expire via LRU eviction

Pseudocode

// URL Creation — counter-based, zero collision risk
async function createShortUrl(long_url):
    counter = await redis.incr("url_counter")    // Atomic, O(1), ~1ms
    short_code = base62_encode(counter)           // Deterministic, O(1)

    await db.execute(
        "INSERT INTO urls (short_code, original_url, created_at)
         VALUES ($1, $2, now())",
        [short_code, long_url]
    )   // ~50ms durable write

    await redis.setex("url:" + short_code, 86400, long_url)  // Cache populate, ~2ms
    return BASE_URL + "/" + short_code

// Redirect — cache-aside pattern
async function redirect(short_code):
    // Check cache first (95% hit rate)
    cached = await redis.get("url:" + short_code)   // ~2ms
    if cached: return 301, Location: cached

    // Cache miss — fall back to DB
    row = await db.execute(
        "SELECT original_url FROM urls WHERE short_code = $1",
        [short_code]
    )   // ~10ms
    if not row: return 404

    // Populate cache for future reads
    await redis.setex("url:" + short_code, 86400, row.original_url)
    return 301, Location: row.original_url

Storage Architecture

The storage layer has two tiers: Redis for fast reads and counter state, PostgreSQL for durable persistence. Redis stores URL mappings as simple key-value pairs with 24-hour TTL, plus the global counter for ID generation. PostgreSQL stores the same mappings durably with B-tree indexing. The cache-aside pattern means Redis is populated on writes (cache-through) and on cache-miss reads (cache-populate).

Loading diagram...

Step-by-Step Walkthrough

1Redis counter (url_counter): a single integer that is atomically incremented via INCR for each new URL. The value is Base62-encoded to produce the short code. No collision risk by construction.
2Redis cache (url:{short_code}): stores the original_url for fast redirect lookups. 24-hour TTL with LRU eviction. Populated on URL creation (write-through) and on cache-miss reads (cache-populate).
3PostgreSQL urls table: the durable source of truth. Same schema as the naive variant. Primary + 2 read replicas. At 95% cache hit rate, replicas handle only ~500 reads/sec instead of 10K.
4The relationship between counter and urls is one-to-many: each counter increment produces exactly one URL mapping. The relationship between urls and cache is one-to-one: each URL may have a cached copy in Redis.

Key Design Decisions

ID Generation Strategy

Choice

Redis INCR + Base62 encoding

Rationale

Counter-based IDs guarantee uniqueness without collision checks. Redis INCR is atomic and O(1) — each URL gets the next integer, which is Base62-encoded into a 7-character short code. 62^7 = 3.5 trillion unique codes, sufficient for decades of operation. The trade-off is sequential IDs are guessable, leaking creation volume. For privacy, apply a bit-shuffle or XOR mask.

Cache Strategy

Choice

Redis cache-aside with 24h TTL, LRU eviction

Rationale

URL mappings are immutable — a short code always maps to the same long URL. This makes cache invalidation trivial (no invalidation needed). Cache-aside with LRU naturally keeps hot URLs in cache while cold URLs expire. 95% hit rate drops database utilization from 95% to 15%.

Dual-Use Redis

Choice

Same Redis cluster for counter + cache

Rationale

Redis INCR for the counter and GET/SET for cache lookups use the same Redis cluster, eliminating one component. Both operations are sub-millisecond. Operational simplicity outweighs the minor risk of a single Redis failure affecting both functions.

Load Balancer + Multiple Pods

Choice

ALB with 6 UrlService pods (round-robin)

Rationale

6 pods provide both horizontal throughput (120K sustained RPS) and redundancy (losing 1 pod = 17% capacity reduction). Round-robin is sufficient since the service is stateless. No sticky sessions needed because every pod can serve any request.

API Gateway for Security

Choice

Amazon API Gateway with JWT auth + rate limiting

Rationale

Centralizes authentication and rate limiting at the edge, preventing abuse (bulk URL creation, redirect scraping) before requests reach the service tier. Adds ~3ms latency but eliminates the need for auth logic in every service pod.

Scale & Performance

Target RPS

~20K sustained

Latency (p99)

<100ms p99 (12ms cache hit)

Storage

~50 GB/year at 100K URLs/day

Availability

~99.9% (6 pods, Redis replication)

Time & Space Complexity

Operation	Time	Space	Notes
Create short URL	O(1) Redis INCR + O(1) Base62 encode + O(log n) DB INSERT	O(1) per URL — one DB row + one cache entry	INCR is O(1) and atomic. Base62 encoding of a 64-bit integer is O(1). DB INSERT is dominated by B-tree index update.
Redirect (cache hit, 95%)	O(1) Redis GET	O(1)	Sub-2ms. Hash table lookup in Redis. No database involvement.
Redirect (cache miss, 5%)	O(log n) PostgreSQL SELECT + O(1) Redis SET	O(1)	B-tree index lookup + cache population. ~15ms total.

Database Schema (HLD)

urls (PostgreSQL)

Durable store for all URL mappings. Same schema as the naive variant but with 2 read replicas. At 95% cache hit rate, the database handles only 5% of reads (~500 RPS at 10K total) plus all writes (~2K RPS). Primary key on short_code with B-tree index for fast cache-miss lookups.

short_code VARCHAR(7) PKoriginal_url TEXT NOT NULLcreated_at TIMESTAMPTZ NOT NULL DEFAULT now()expires_at TIMESTAMPTZ

Indexes: PK B-tree on short_code

2 read replicas absorb cache-miss reads. Replication lag ~5ms. Write path goes to primary only.

redis cache (url:{short_code})

In-memory URL mapping cache in Redis. Key pattern: url:{short_code}. Stores the original_url with 24-hour TTL. 95% hit rate from Zipfian access pattern — viral links stay hot in cache. Also stores the global counter key for INCR-based ID generation. LRU eviction ensures the cache stays within memory limits.

key: url:{short_code}value: original_url (string)TTL: 86400 seconds (24 hours)

Working set: ~1 GB for 1M active URLs at ~1KB per entry. Cache size: 950 MB. Hit rate: ~95% from Zipfian access.

What-If Scenarios

Redis cluster fails completely

Impact

URL creation fails (counter unavailable). Redirect reads degrade to database-only (naive variant behavior). p99 latency jumps from 12ms to 50ms+. DB utilization spikes from 15% to 95%.

Mitigation

Redis Cluster with automatic failover (2-3s switchover). Fallback to UUID-based IDs for creation during outage. Circuit breaker on cache calls to fail fast.

Counter reaches 62^7 (3.5 trillion)

Impact

Base62 encoding produces 8-character codes instead of 7. Short URLs become slightly longer but still functional. No data corruption or service failure.

Mitigation

Monitor counter value. At 100K URLs/day, reaching 3.5T takes ~96,000 years. This is a non-issue in practice.

Cache eviction storm from burst traffic pattern change

Impact

If traffic shifts from viral URLs (high cache hit) to long-tail URLs (low cache hit), cache hit rate drops from 95% to 50-60%. Database utilization spikes. p99 latency increases.

Mitigation

Monitor cache hit rate. If it drops below 80%, either increase cache size or add read replicas to handle the increased DB load.

One of 6 service pods enters a crash loop

Impact

Capacity drops by 17%. ALB health checks detect the failure in ~10 seconds and stop routing to the unhealthy pod. Remaining 5 pods handle the load with reduced headroom.

Mitigation

ECS auto-restart replaces the failed task. Auto-scaling adds a replacement pod within 2 minutes. ALB health checks prevent traffic from reaching unhealthy pods.

Failure Modes & Resilience

Component	Failure	Impact	Mitigation
Redis Cache	Node failure	If the primary Redis node fails, reads fall back to PostgreSQL (increased latency). Counter INCR fails, blocking URL creation. Automatic failover to replica takes 2-3 seconds.	Redis Cluster with 2 replicas per shard. Automatic failover. Application-level fallback to UUID generation during Redis outage.
API Gateway	Rate limit misconfiguration	If rate limit is set too low, legitimate traffic is rejected with 429 errors. If too high, abuse traffic reaches the service tier.	Set rate limit at 1.5x expected peak. Monitor 429 error rate. Use adaptive rate limiting that adjusts based on service health.
PostgreSQL	Replication lag spike	Read replicas serve stale data for a few seconds. A newly created URL might not be found on a cache-miss read that hits a lagging replica.	Read-after-write consistency: after URL creation, write to cache immediately. Subsequent reads hit cache (not replica). Replication lag is only visible for cache-miss reads of very recently created URLs.

Scaling Strategy

Horizontal scaling at the service tier: ECS auto-scaling adds pods based on CPU utilization (target: 60%) or ALB request count. Redis can be scaled vertically (larger node type) or horizontally (Redis Cluster with more shards). PostgreSQL scales reads via additional read replicas and writes via vertical scaling. The scaling ceiling is approximately 20K RPS, limited by: (1) the atomic counter's single-point contention at high write rates, and (2) the single Redis cluster's memory capacity for the URL cache. Beyond 20K RPS, upgrade to the Production variant (v3) with CDN, ZooKeeper range allocation, and sharded databases.

Monitoring & Alerting

Key metrics to monitor: Redis cache hit rate (target: >90%, alert at <85%), Redis memory utilization (alert at 80%), Redis INCR counter value (informational — track growth rate), PostgreSQL active connections (alert at 160/200), DB replication lag (alert at >10ms), ALB request count and error rate, API Gateway 4xx/5xx rates, service pod CPU and memory utilization, and end-to-end redirect latency p99. The most important single metric is cache hit rate — a drop from 95% to 80% doubles database load. Set up a correlation dashboard showing cache hit rate vs. DB utilization vs. redirect latency to catch cascading degradation early.

Cost Analysis

Monthly infrastructure cost at moderate traffic (10K RPS): API Gateway (~$35 for 26B requests/month), ALB (~$20 + $8/million requests), 6x ECS Fargate pods (2 vCPU, 4 GB each, ~$360/month), ElastiCache Redis (cache.r7g.large 2-node, ~$200/month), RDS PostgreSQL primary + 2 read replicas (~$360/month). Total: approximately $400-450/month. Compared to the naive variant at $180/month, you get 40x the throughput (20K vs 500 RPS) for 2.5x the cost. Cost per 1M requests: ~$0.60 (vs $12.00 for naive). This is the sweet spot for most production URL shorteners — simple enough to operate, cost-effective up to 20K RPS.

Security Considerations

The API Gateway adds JWT authentication, preventing anonymous abuse. Rate limiting (30K RPS cap) protects against DDoS and bulk URL creation. Sequential Base62 short codes are guessable — an attacker can enumerate URLs by incrementing the counter. Mitigations: (1) apply a bit-shuffle to the counter before Base62 encoding, (2) require authentication for URL creation, (3) implement URL scanning to block malicious redirect targets (phishing, malware). Redis should be deployed in a private subnet with security group rules limiting access to the service tier only. PostgreSQL credentials should be stored in AWS Secrets Manager and rotated on a schedule.

Deployment Strategy

Rolling deployment via ECS service update. The ALB drains connections from old tasks (30-second drain period) before terminating them. With 6 pods, a rolling update replaces one pod at a time, maintaining 83% capacity throughout. Blue/green deployment is supported via ALB target group switching for zero-downtime releases. Database migrations run before code deployment using a separate migration task. Redis cache is populated on demand (cache-aside), so no pre-warming is needed after deployment — the first requests after deploy will have cache misses that resolve on subsequent access.

Real-World Examples

•Bitly's core URL shortening service (counter-based IDs, Redis cache)
•TinyURL's production architecture (similar cache-aside pattern)
•Rebrandly and Short.io (ALB + multi-pod service + Redis)
•Internal link shorteners at mid-size companies (100-10K RPS)

Solution Comparison

Variant	Tier	Latency	Throughput	Cost	Complexity	Reliability
Naive (Single Server)	T1	~50ms p99	~500 RPS	~$180/mo	3 components	~99% (single pod)
Counter-Based (Base62)	T2	<100ms p99	~20K RPS	~$400/mo	6 components	~99.9%
Production Multi-Region	T3	~2ms CDN hit	100K+ RPS	~$3,000/mo	11 components	~99.99%
Serverless (Lambda + DynamoDB)	T4	<30ms warm	10K+ RPS (auto)	$0-800/mo	4 components	~99.99%

This template is for educational and illustration purposes only. It may not represent the optimal production design for this problem. Real-world systems involve additional considerations (compliance, specific cloud provider constraints, organizational requirements) not captured here. Use this as a starting point for discussion, not as a production blueprint.

Frequently Asked Questions

Why does adding a cache reduce DB utilization so dramatically?

URL shorteners have a 100:1 read-to-write ratio. At 10K redirect RPS, 95% cache hit rate means Redis serves 9,500 reads/sec while PostgreSQL handles only 500 reads/sec. The database goes from handling 100% of reads (95% utilization) to handling 5% of reads (15% utilization). The cache absorbs the vast majority of the read workload.

Is the sequential counter a security risk?

Sequential IDs leak information: observing short codes 'abc' and 'def' reveals how many URLs were created between them. For most URL shorteners this is acceptable (Bitly uses sequential IDs). For privacy-sensitive applications, apply a reversible bit-shuffle: XOR the counter value with a secret key before Base62 encoding. The codes appear random but are still deterministically generated from the counter.

What happens if Redis goes down?

If Redis fails, two things break: (1) new URL creation fails because the counter is unavailable, and (2) redirect reads fall back to PostgreSQL for every request. The system degrades to naive-variant behavior until Redis recovers. Mitigation: Redis Cluster with automatic failover (2-3 second switchover), or a fallback to UUID-based IDs during Redis outage.

Why Base62 instead of Base64?

Base62 uses only alphanumeric characters (0-9, a-z, A-Z). Base64 adds '+' and '/' which require URL encoding (becoming '%2B' and '%2F'), making short URLs longer and less readable. Base62 produces clean URLs that work in any context — emails, SMS, social media — without encoding issues.

At what scale does this architecture need upgrading?

The counter-based approach handles up to ~20K RPS comfortably. Beyond that, two bottlenecks emerge: (1) the atomic counter becomes contention-hot at 10K+ writes/sec (needs range-based allocation as in v3), and (2) geographic latency becomes a factor for global users (needs CDN as in v3). Below 20K RPS, this is the right architecture — simple, well-understood, and cost-effective.

Related Templates

TinyURL — Naive Single-Server TinyURL — Production Multi-Region TinyURL — Serverless (Lambda + DynamoDB)

Discussion

Ready to design your own TinyURL?

Open the simulator, place components on the canvas, wire them up, and run a traffic simulation to see how your architecture performs under real load.

Open Simulator