Interview Walkthrough: URL Shortener

A complete interview walkthrough for the URL shortener problem -- one of the most commonly asked system design questions. Covers requirements gathering, scale estimation, API design, encoding strategies, caching, and collision handling.

Overview

The URL shortener is one of the most frequently asked system design interview questions because it is deceptively simple on the surface but reveals significant depth when explored. The core problem is straightforward: given a long URL, produce a short alias that redirects users to the original. But once you start quantifying scale, choosing an encoding strategy, and designing for availability, the problem expands into a rich discussion of distributed systems trade-offs.

Start by clarifying functional requirements: shorten a URL, redirect via short code, support custom aliases, set optional expiration, and provide basic click analytics. Then quantify scale. If the service creates 100 million URLs per day and the read-to-write ratio is 10:1, that means 1 billion redirects per day, or roughly 12,000 QPS for reads and 1,200 QPS for writes. Over five years, the system stores approximately 180 billion URLs. Each record is small (roughly 500 bytes for short_code, long_url, created_at, user_id, and expiry), yielding about 90 TB of total storage. These numbers immediately tell you that reads dominate, caching is essential, and the database must be horizontally partitioned.
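The arithmetic above can be reproduced in a few lines. This is a minimal sketch that uses only the figures quoted in the text (100 million writes per day, a 10:1 read-to-write ratio, 500 bytes per record, five years of retention); the variable names are illustrative, and the results are daily averages rather than peak load.

```python
# Back-of-the-envelope estimate using the figures quoted in the text.
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

writes_per_day = 100_000_000   # new short URLs per day
read_write_ratio = 10          # redirects per created URL
bytes_per_record = 500         # rough size of one mapping row
retention_years = 5

reads_per_day = writes_per_day * read_write_ratio
write_qps = writes_per_day / SECONDS_PER_DAY
read_qps = reads_per_day / SECONDS_PER_DAY

total_urls = writes_per_day * 365 * retention_years
total_storage_tb = total_urls * bytes_per_record / 1e12  # decimal terabytes

print(f"write QPS   ~ {write_qps:,.0f}")
print(f"read QPS    ~ {read_qps:,.0f}")
print(f"URLs stored ~ {total_urls / 1e9:,.1f} billion")
print(f"storage     ~ {total_storage_tb:,.2f} TB")
```

The exact values come out slightly different from the rounded figures in the text (about 1,157 and 11,574 QPS, 182.5 billion URLs, 91.25 TB), which is fine for an interview: the point is the order of magnitude.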

The API surface is minimal. A POST to /api/v1/urls with a JSON body containing the long URL (and optional custom alias and expiry) returns the short URL. A GET to /:shortCode returns an HTTP 301 (permanent redirect) or 302 (temporary redirect, better for analytics) pointing to the original URL. The 301 vs 302 decision matters: a 301 allows browsers to cache the redirect, which reduces server load but hides repeat clicks from the service. Shorteners that depend on click analytics typically use 302 or 307 so that every click reaches the server.
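The redirect decision can be sketched as a pure function, independent of any web framework. Everything here is an assumption made for illustration: the function name `redirect_response`, the dictionary standing in for the URL store, the `track_clicks` flag, and the choice of `Cache-Control: no-store` on the temporary redirect.

```python
from http import HTTPStatus
from typing import Dict, Tuple


def redirect_response(
    store: Dict[str, str], short_code: str, track_clicks: bool = True
) -> Tuple[int, Dict[str, str]]:
    """Return (status, headers) for GET /:shortCode.

    `store` is a plain dict standing in for the real URL mapping.
    """
    long_url = store.get(short_code)
    if long_url is None:
        return HTTPStatus.NOT_FOUND, {}

    if track_clicks:
        # Temporary redirect: discourage client-side caching so that
        # repeat clicks still reach the server and can be counted.
        headers = {"Location": long_url, "Cache-Control": "no-store"}
        return HTTPStatus.FOUND, headers

    # Permanent redirect: browsers may cache it and skip the server.
    return HTTPStatus.MOVED_PERMANENTLY, {"Location": long_url}


store = {"abc1234": "https://example.com/some/long/path"}
print(redirect_response(store, "abc1234"))
print(redirect_response(store, "abc1234", track_clicks=False))
print(redirect_response(store, "missing"))
```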

The encoding strategy is the heart of the design. Three common approaches exist. First, base62 encoding of an auto-incremented counter: a centralized or distributed counter generates a unique integer, which is converted to a 7-character base62 string (62^7 ≈ 3.5 trillion possible codes). This guarantees uniqueness but requires a coordinated counter, which is a potential bottleneck. Second, MD5 or SHA-256 hashing of the long URL, with the digest encoded and truncated to 7 characters: this is stateless and can run on any node, but truncation creates collision risk that must be handled by checking the database and retrying with a modified input or appended characters if needed. Third, a pre-generated key service that creates unique random keys in batch and stores them in a pool; workers claim keys from the pool, which removes both the counter bottleneck and the collision check from the write path. Each approach involves a different trade-off between coordination, collision risk, and operational complexity.
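The first approach, base62 encoding of a counter value, can be sketched as follows. The alphabet ordering (digits, then lowercase, then uppercase) and the zero-padding to a fixed width are assumptions of this sketch; any fixed ordering of 62 distinct symbols works as long as encoder and decoder agree.

```python
import string

# 62 symbols: 0-9, a-z, A-Z. The ordering is an arbitrary choice.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
BASE = len(ALPHABET)  # 62


def base62_encode(n: int, width: int = 7) -> str:
    """Encode a non-negative counter value as a fixed-width base62 code."""
    if n < 0:
        raise ValueError("counter value must be non-negative")
    if n >= BASE ** width:
        raise OverflowError(f"value does not fit in {width} base62 characters")
    chars = []
    while n:
        n, rem = divmod(n, BASE)
        chars.append(ALPHABET[rem])
    # Digits were produced least-significant first; reverse and left-pad.
    return "".join(reversed(chars)).rjust(width, ALPHABET[0])


def base62_decode(code: str) -> int:
    """Inverse of base62_encode."""
    n = 0
    for ch in code:
        n = n * BASE + ALPHABET.index(ch)
    return n


print(base62_encode(125))
print(base62_decode(base62_encode(125)))
print(f"{BASE ** 7:,} codes fit in 7 characters")
```

One consequence worth mentioning in an interview: consecutive counter values produce consecutive codes, so codes are guessable unless the counter value is shuffled or obfuscated before encoding.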

Key Points
  • A 7-character base62 code provides 3.5 trillion unique short URLs, more than sufficient for most realistic scale projections. Shorter codes are better for usability, but must be balanced against the collision probability at high write volume.
  • The read-to-write ratio (typically 10:1 to 100:1) means caching is the highest-impact optimization. A Redis cache with an 80-90% hit rate absorbs the vast majority of redirect traffic and keeps database QPS manageable.
  • Choose 302 (temporary redirect) over 301 (permanent redirect) if you need click analytics, because 301 allows browsers to cache the redirect and skip the server entirely on subsequent visits.
  • Collision handling depends on the encoding strategy. Counter-based approaches are collision-free by construction. Hash-based approaches must check for collisions in the database and retry with a modified input or appended salt.
  • Database sharding by short_code hash distributes both storage and query load evenly. Range-based sharding on short_code works but can create hotspots if codes are generated sequentially.
  • Analytics can be handled asynchronously: log each redirect event to Kafka, then consume and aggregate in a batch or streaming pipeline rather than writing to the analytics database synchronously during the redirect path.
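The caching and sharding points can be combined into one small sketch of the read path: check the cache first, and on a miss read from the shard selected by hashing the short code. Plain dictionaries stand in for Redis and for the database shards, and the names `shard_for` and `Resolver` are hypothetical. Cache TTLs and eviction are omitted.

```python
import hashlib
from typing import Dict, List, Optional


def shard_for(short_code: str, num_shards: int) -> int:
    """Pick a shard by hashing the short code (not by its range)."""
    digest = hashlib.sha256(short_code.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class Resolver:
    """Cache-aside lookup in front of hash-sharded storage."""

    def __init__(self, shards: List[Dict[str, str]], cache: Dict[str, str]):
        self.shards = shards  # each dict stands in for one database shard
        self.cache = cache    # stands in for a Redis cache
        self.cache_hits = 0
        self.db_reads = 0

    def put(self, short_code: str, long_url: str) -> None:
        shard = self.shards[shard_for(short_code, len(self.shards))]
        shard[short_code] = long_url

    def resolve(self, short_code: str) -> Optional[str]:
        if short_code in self.cache:
            self.cache_hits += 1
            return self.cache[short_code]
        # Cache miss: read from the owning shard, then populate the cache.
        self.db_reads += 1
        shard = self.shards[shard_for(short_code, len(self.shards))]
        long_url = shard.get(short_code)
        if long_url is not None:
            self.cache[short_code] = long_url
        return long_url


resolver = Resolver(shards=[{} for _ in range(4)], cache={})
resolver.put("abc1234", "https://example.com/a")
resolver.resolve("abc1234")  # miss, goes to the shard
resolver.resolve("abc1234")  # hit, served from the cache
print(resolver.cache_hits, resolver.db_reads)
```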
Simple Example

The Coat Check Analogy

A URL shortener works like a coat check at a concert venue. You hand over your bulky coat (the long URL) and receive a small numbered ticket (the short code). When you return with the ticket, the attendant retrieves your coat instantly by looking up the number. The attendant does not need to examine every coat -- the ticket maps directly to a storage slot. If the venue is very popular, they might have multiple coat check stations (database shards) each handling a range of ticket numbers, and a front desk (cache) that remembers the most recently checked coats for faster retrieval.

Real-World Examples

Bitly

Bitly has shortened over 10 billion links and handles billions of redirects per month. Their architecture uses a global CDN for edge-level caching of popular redirects, a distributed key-value store for the URL mapping, and a real-time analytics pipeline built on Kafka and Druid. Bitly's system demonstrates the importance of caching at the edge -- the majority of redirects never reach the origin database.

TinyURL

TinyURL is one of the oldest URL shorteners, launched in 2002. Despite its simplicity, it has proven remarkably durable, handling billions of redirects over two decades. TinyURL uses a straightforward counter-based approach with base62 encoding, demonstrating that a simple, well-executed design can scale effectively when paired with proper caching and database optimization.

Twitter (t.co)

Twitter's t.co service automatically wraps every URL posted on the platform in a short link. This serves dual purposes: it normalizes URL length for the character limit, and it provides click tracking and malware scanning. t.co processes hundreds of thousands of redirects per second at peak, using an in-memory cache layer that holds the most recent and most popular mappings.

Trade-Offs
Aspect | Description
Counter-Based vs Hash-Based Encoding | Counter-based encoding guarantees uniqueness without collision checks but requires a centralized or coordinated counter (ZooKeeper, database sequence, or Snowflake-style ID generator). Hash-based encoding is stateless and can run on any node but requires collision detection and resolution, adding a database read to the write path.
301 vs 302 Redirect | A 301 permanent redirect tells browsers and search engines to cache the mapping, reducing server load but sacrificing click analytics visibility. A 302 temporary redirect ensures every click passes through the server for counting and malware scanning but increases QPS on the redirect service.
Short Code Length vs Collision Rate | Shorter codes are more user-friendly and easier to share, but a smaller keyspace increases the probability of collisions and limits the total number of URLs the system can store. A 6-character base62 code supports about 56.8 billion URLs; 7 characters supports about 3.5 trillion.
Synchronous vs Asynchronous Analytics | Recording click analytics synchronously on the redirect path adds latency to every redirect. Asynchronous recording via a message queue (Kafka) decouples analytics from the redirect path, keeping redirects fast, but introduces a delay before analytics data is available.
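The code-length trade-off can be made concrete with a short calculation. This sketch assumes codes are drawn uniformly at random from the keyspace (as with random keys or a well-mixed truncated hash); counter-based codes never collide, so it does not apply to them. The function names are illustrative, and the second function uses the standard birthday approximation.

```python
import math


def keyspace(length: int, alphabet_size: int = 62) -> int:
    """Number of distinct codes of the given length."""
    return alphabet_size ** length


def p_new_code_collides(existing: int, length: int) -> float:
    """Chance that one fresh random code hits an already-used code."""
    return min(1.0, existing / keyspace(length))


def p_any_collision(n: int, length: int) -> float:
    """Birthday approximation: chance of at least one collision
    among n independently drawn random codes."""
    return 1.0 - math.exp(-n * (n - 1) / (2 * keyspace(length)))


print(f"6 chars: {keyspace(6):,} codes")
print(f"7 chars: {keyspace(7):,} codes")
# With 180 billion URLs stored, how often does a new 7-char code collide?
print(f"{p_new_code_collides(180_000_000_000, 7):.3f}")
# Chance of any collision among the first million random 7-char codes.
print(f"{p_any_collision(1_000_000, 7):.3f}")
```

Under these assumptions, a 6-character keyspace is smaller than the five-year URL count estimated earlier, and even a 7-character keyspace sees collisions early, which is why random or hash-based schemes need a retry path.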
Case Study

Bitly's Migration to Edge-Cached Redirects

Scenario

Bitly experienced exponential growth in redirect traffic, with popular links generating millions of clicks in minutes during viral events. Their origin-based redirect architecture could not scale fast enough during traffic spikes, leading to increased latency and occasional timeouts. The database layer, despite aggressive caching in Redis, was being overwhelmed by cache misses during the long tail of less-popular links.

Solution

Bitly deployed a multi-tier caching architecture. The first tier is a global CDN (edge caching) that stores redirect mappings at points of presence worldwide. The second tier is a regional Redis cluster that handles cache misses from the CDN. The third tier is the origin database, which only serves requests that miss both the CDN and Redis caches. Popular links are cached at the edge with a short TTL (5 minutes), ensuring that viral traffic never reaches the origin. A background process pre-warms the CDN cache for links that are trending based on real-time click velocity.
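The three-tier lookup described above can be illustrated with a small model. This is not Bitly's implementation: it is a sketch in which in-process dictionaries stand in for the CDN edge, the regional Redis cluster, and the origin database. The 5-minute edge TTL comes from the text; the 1-hour regional TTL and all class and function names are assumptions.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TtlCache:
    """Tiny in-memory cache with a fixed time-to-live per entry."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries: Dict[str, Tuple[str, float]] = {}

    def get(self, key: str) -> Optional[str]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self._entries[key]  # expired: drop and report a miss
            return None
        return value

    def set(self, key: str, value: str) -> None:
        self._entries[key] = (value, self.clock() + self.ttl)


def tiered_lookup(code: str, edge: TtlCache, regional: TtlCache,
                  origin: Dict[str, str]) -> Tuple[Optional[str], str]:
    """Return (long_url, tier_that_answered), filling faster tiers on the way."""
    url = edge.get(code)
    if url is not None:
        return url, "edge"
    url = regional.get(code)
    if url is not None:
        edge.set(code, url)
        return url, "regional"
    url = origin.get(code)
    if url is not None:
        regional.set(code, url)
        edge.set(code, url)
        return url, "origin"
    return None, "miss"


edge = TtlCache(ttl_seconds=300)       # 5-minute edge TTL, as in the text
regional = TtlCache(ttl_seconds=3600)  # assumed regional TTL
origin = {"abc1234": "https://example.com/a"}
print(tiered_lookup("abc1234", edge, regional, origin)[1])
print(tiered_lookup("abc1234", edge, regional, origin)[1])
```

The first call is answered by the origin and fills both caches; the second is answered at the edge until the edge entry expires, after which the regional tier answers and refills the edge.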

Outcome

Edge caching absorbed over 95% of redirect traffic, reducing origin database QPS by 20x. P99 redirect latency dropped from 120ms to under 15ms globally because most redirects were served from the nearest CDN edge location. During viral events, the system handled 10x normal traffic without any degradation because the CDN auto-scaled at the edge. Infrastructure costs decreased by 40% because fewer origin servers were needed to handle the residual traffic.

Common Mistakes
  • Jumping straight into the encoding algorithm without first clarifying requirements and estimating scale. Interviewers want to see structured thinking: start with functional requirements, non-functional requirements, and back-of-the-envelope calculations before diving into the design.
  • Using MD5 or SHA-256 without discussing collision handling. Hash truncation is not collision-free, and the interviewer expects you to explain how the system detects and resolves collisions (e.g., retry with a salt, append characters, or use a bloom filter for fast existence checks).
  • Forgetting about the read path entirely. Many candidates spend all their time on URL creation and neglect the redirect flow, which is the dominant traffic pattern. Discuss caching strategy, redirect HTTP status codes, and latency optimization for reads.
  • Ignoring expiration and cleanup. URLs with expiration dates need a background process to delete expired entries and reclaim short codes. Without this, the keyspace gradually fills up and storage grows unbounded.
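The collision handling mentioned in the second mistake can be sketched as hash, truncate, check, and retry with a salt. The function names, the salt format, the retry limit, and the dictionary standing in for the database are all assumptions. In a real system the check-then-insert step would need to be atomic (for example a unique constraint or conditional write), which this sketch does not model.

```python
import hashlib
import string
from typing import Dict

ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def hash_code(long_url: str, salt: int = 0, length: int = 7) -> str:
    """Derive a base62 code of `length` characters from a salted SHA-256 hash."""
    data = f"{salt}:{long_url}".encode("utf-8")
    n = int.from_bytes(hashlib.sha256(data).digest()[:8], "big")
    chars = []
    for _ in range(length):
        n, rem = divmod(n, 62)
        chars.append(ALPHABET[rem])
    return "".join(chars)


def shorten(store: Dict[str, str], long_url: str, max_attempts: int = 5) -> str:
    """Insert long_url under a hash-derived code, retrying on collisions."""
    for salt in range(max_attempts):
        code = hash_code(long_url, salt)
        existing = store.get(code)
        if existing is None:
            store[code] = long_url  # free slot: claim it
            return code
        if existing == long_url:
            return code             # same URL already stored: reuse its code
        # A different URL owns this code: try again with the next salt.
    raise RuntimeError("no free code found within the retry limit")


store: Dict[str, str] = {}
first = shorten(store, "https://example.com/a")
again = shorten(store, "https://example.com/a")
print(first == again, len(first))
```

A side effect of hashing the URL itself is that shortening the same URL twice returns the same code, which may or may not be desirable when links are owned by different users.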
Test Your Understanding

1. Why might a production URL shortener prefer HTTP 302 over 301 for redirects?

2. A URL shortener uses base62 encoding with 7-character codes. Approximately how many unique short URLs can it support?

3. What is the main disadvantage of using a centralized auto-increment counter for generating short codes?
