A complete interview walkthrough for the URL shortener problem -- the most commonly asked system design question. Covers requirements gathering, scale estimation, API design, encoding strategies, caching, and collision handling.
The URL shortener is the most frequently asked system design interview question because it is deceptively simple on the surface but reveals significant depth when explored. The core problem is straightforward: given a long URL, produce a short alias that redirects users to the original. But once you start quantifying scale, choosing an encoding strategy, and designing for availability, the problem expands into a rich discussion of distributed systems trade-offs.
Start by clarifying functional requirements: shorten a URL, redirect via short code, support custom aliases, set optional expiration, and provide basic click analytics. Then quantify scale. If the service creates 100 million URLs per day and the read-to-write ratio is 10:1, that means 1 billion redirects per day. Averaged over 86,400 seconds, that is about 11,600 QPS for reads and 1,160 QPS for writes, commonly rounded up to 12,000 and 1,200; peak traffic will be higher than the average. Over five years, the system stores approximately 180 billion URLs. Each record is small (roughly 500 bytes for short_code, long_url, created_at, user_id, and expiry), yielding about 90 TB of total storage before replication and indexes. These numbers immediately tell you that reads dominate, caching is essential, and the database must be horizontally partitioned.
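The estimate above is simple enough to check in a few lines. The sketch below only restates the inputs from the text (100 million writes per day, a 10:1 read-to-write ratio, 500-byte records, five years of retention); the variable names are illustrative.

```python
# Back-of-envelope capacity estimate using the inputs stated in the text.
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

writes_per_day = 100_000_000    # new short URLs per day
read_write_ratio = 10           # redirects per created URL
record_bytes = 500              # approximate size of one stored record
retention_years = 5

reads_per_day = writes_per_day * read_write_ratio
write_qps = writes_per_day / SECONDS_PER_DAY
read_qps = reads_per_day / SECONDS_PER_DAY
total_urls = writes_per_day * 365 * retention_years
total_storage_tb = total_urls * record_bytes / 1e12  # decimal terabytes

print(f"average write QPS: {write_qps:,.0f}")
print(f"average read QPS:  {read_qps:,.0f}")
print(f"URLs after {retention_years} years: {total_urls:,}")
print(f"raw storage: {total_storage_tb:.2f} TB")
```

The averages come out to about 1,157 writes and 11,574 reads per second, with 182.5 billion URLs and 91.25 TB of raw storage. In an interview these are usually rounded, since the goal is the order of magnitude rather than the exact figure.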
The API surface is minimal. A POST to /api/v1/urls with a JSON body containing the long URL (and optional custom alias and expiry) returns the short URL. A GET to /:shortCode returns a redirect to the original URL in the Location header, with status 301 (permanent redirect) or 302 (temporary redirect, better for analytics). The 301 vs 302 decision matters: a 301 is cacheable by default, so browsers can repeat the redirect without contacting the server, which reduces load but loses visibility into click counts. Many production shorteners use 302 or 307 so that every click hits the server for analytics.
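A framework-free sketch of the two endpoints can make the contract concrete. Everything here is an assumption made for illustration: the `UrlService` class, the `sho.rt` domain, the placeholder code generator, and the choice of 201, 409, and 404 status codes are not specified in the text. Expiry is omitted to keep the example short.

```python
from typing import Dict, Optional, Tuple

BASE_URL = "https://sho.rt"  # hypothetical short domain


class UrlService:
    """In-memory model of the create and redirect endpoints (illustrative only)."""

    def __init__(self) -> None:
        self._store: Dict[str, str] = {}  # short code -> long URL
        self._next_id = 1

    def create(self, long_url: str, custom_alias: Optional[str] = None) -> Tuple[int, dict]:
        """Models POST /api/v1/urls and returns (status, JSON body)."""
        if custom_alias is not None:
            if custom_alias in self._store:
                return 409, {"error": "alias already in use"}
            code = custom_alias
        else:
            code = self._generate_code()
        self._store[code] = long_url
        return 201, {"short_url": f"{BASE_URL}/{code}"}

    def _generate_code(self) -> str:
        # Placeholder generator; a real service would use one of the
        # encoding strategies discussed in the article.
        while True:
            code = f"u{self._next_id}"
            self._next_id += 1
            if code not in self._store:
                return code

    def resolve(self, code: str) -> Tuple[int, dict]:
        """Models GET /:shortCode and returns (status, response headers)."""
        long_url = self._store.get(code)
        if long_url is None:
            return 404, {}
        # 302 keeps every click visible to the server for analytics.
        return 302, {"Location": long_url}
```

Calling `create("https://example.com/a")` on a fresh service returns status 201 with a short URL, and `resolve` on the returned code yields a 302 whose Location header carries the original URL.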
The encoding strategy is the heart of the design. Three common approaches exist. First, base62 encoding of an auto-incremented counter: a centralized or distributed counter generates a unique integer, which is converted to a 7-character base62 string (62^7 is about 3.5 trillion possible codes). This guarantees uniqueness but requires a coordinated counter, which is a potential bottleneck. Second, MD5 or SHA-256 hashing of the long URL, truncated to 7 characters: this is stateless and can run on any node, but truncation creates collision risk that must be handled by checking the database and appending characters if needed. Third, a pre-generated key service that creates random keys in batch and stores them in a pool; workers claim keys from the pool, which removes the counter bottleneck and moves the uniqueness check off the write path, at the cost of running and replicating one more service. Each approach involves a different trade-off between coordination, collision risk, and operational complexity.
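The first two approaches can be sketched as follows. This is a minimal illustration, not a reference implementation: the alphabet ordering, the function names, the dictionary standing in for the database, and the specific collision rule (keep one more base62 digit of the digest on each collision) are all assumptions. The text says only that collisions are resolved by "appending characters", and this is one way to read that.

```python
import hashlib
import string

# 0-9, a-z, A-Z: the ordering is a convention chosen for this sketch.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
BASE = len(ALPHABET)  # 62
CODE_LENGTH = 7


def base62_encode(n: int) -> str:
    """Convert a non-negative integer to its base62 representation."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, rem = divmod(n, BASE)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


def code_from_counter(counter: int) -> str:
    """Counter-based code: unique by construction, padded to 7 characters."""
    if counter >= BASE ** CODE_LENGTH:
        raise OverflowError("counter exceeds the 7-character keyspace")
    return base62_encode(counter).rjust(CODE_LENGTH, ALPHABET[0])


def code_from_hash(long_url: str, store: dict) -> str:
    """Hash-based code: truncate a SHA-256 digest and lengthen on collision.

    `store` maps code -> long URL and stands in for the database lookup.
    """
    digest = int.from_bytes(hashlib.sha256(long_url.encode("utf-8")).digest(), "big")
    for length in range(CODE_LENGTH, 44):  # a 256-bit digest fits in 43 base62 digits
        code = base62_encode(digest % BASE ** length).rjust(length, ALPHABET[0])
        existing = store.get(code)
        if existing is None or existing == long_url:
            store[code] = long_url  # same URL maps to the same code (deduplication)
            return code
    raise RuntimeError("could not find a free code for this URL")
```

For example, `code_from_counter(1)` returns `"0000001"` and `base62_encode(62)` returns `"10"`. The hash variant needs a read of the store on every write, which is the extra database lookup the trade-off table describes.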
The Coat Check Analogy
A URL shortener works like a coat check at a concert venue. You hand over your bulky coat (the long URL) and receive a small numbered ticket (the short code). When you return with the ticket, the attendant retrieves your coat instantly by looking up the number. The attendant does not need to examine every coat -- the ticket maps directly to a storage slot. If the venue is very popular, they might have multiple coat check stations (database shards) each handling a range of ticket numbers, and a front desk (cache) that remembers the most recently checked coats for faster retrieval.
Bitly
Bitly has shortened over 10 billion links and handles billions of redirects per month. Their architecture uses a global CDN for edge-level caching of popular redirects, a distributed key-value store for the URL mapping, and a real-time analytics pipeline built on Kafka and Druid. Bitly's system demonstrates the importance of caching at the edge -- the majority of redirects never reach the origin database.
TinyURL
TinyURL is one of the oldest URL shorteners, launched in 2002. Despite its simplicity, it has proven remarkably durable, handling billions of redirects over two decades. TinyURL uses a straightforward counter-based approach with base62 encoding, demonstrating that a simple, well-executed design can scale effectively when paired with proper caching and database optimization.
Twitter (t.co)
Twitter's t.co service automatically wraps every URL posted on the platform in a short link. This serves dual purposes: it normalizes URL length for the character limit, and it provides click tracking and malware scanning. t.co processes hundreds of thousands of redirects per second at peak, using an in-memory cache layer that holds the most recent and most popular mappings.
| Aspect | Description |
|---|---|
| Counter-Based vs Hash-Based Encoding | Counter-based encoding guarantees uniqueness without collision checks but requires a centralized or coordinated counter (ZooKeeper, database sequence, or Snowflake-style ID generator). Hash-based encoding is stateless and can run on any node but requires collision detection and resolution, adding a database read to the write path. |
| 301 vs 302 Redirect | A 301 permanent redirect tells browsers and search engines to cache the mapping, reducing server load but sacrificing click analytics visibility. A 302 temporary redirect ensures every click passes through the server for counting and malware scanning but increases QPS on the redirect service. |
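| Short Code Length vs Collision Rate | Shorter codes are more user-friendly and easier to share, but a smaller keyspace increases the probability of collisions and limits the total number of URLs the system can store. A 6-character base62 code supports about 56.8 billion URLs (62^6); 7 characters supports about 3.5 trillion (62^7). At the roughly 180 billion URLs estimated for five years of operation, 6 characters is not enough, which is why 7 is the usual choice. |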
| Synchronous vs Asynchronous Analytics | Recording click analytics synchronously on the redirect path adds latency to every redirect. Asynchronous recording via a message queue (Kafka) decouples analytics from the redirect path, keeping redirects fast, but introduces a delay before analytics data is available. |
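The asynchronous option in the last row can be illustrated with an in-process queue. This is a sketch under stated assumptions: `queue.Queue` stands in for a Kafka topic, a background thread stands in for the analytics consumer, and the `ClickRecorder` and `redirect` names are hypothetical.

```python
import queue
import threading
from collections import Counter
from typing import Dict, Optional


class ClickRecorder:
    """Buffers click events and counts them on a background thread."""

    def __init__(self) -> None:
        self._events: "queue.Queue[str]" = queue.Queue()  # stand-in for a message queue
        self.counts: Counter = Counter()
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def record(self, code: str) -> None:
        # Enqueueing is cheap, so the redirect path does not wait on analytics.
        self._events.put(code)

    def _drain(self) -> None:
        while True:
            code = self._events.get()
            self.counts[code] += 1
            self._events.task_done()

    def flush(self) -> None:
        """Block until every queued event has been counted."""
        self._events.join()


def redirect(code: str, store: Dict[str, str], recorder: ClickRecorder) -> Optional[str]:
    """Return the long URL for a code and record the click asynchronously."""
    long_url = store.get(code)
    if long_url is not None:
        recorder.record(code)
    return long_url
```

The delay the table mentions shows up here as the gap between `redirect` returning and the count appearing in `recorder.counts`; `flush()` exists only so that a caller or test can wait for the worker to catch up.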
Bitly's Migration to Edge-Cached Redirects
Scenario
Bitly experienced exponential growth in redirect traffic, with popular links generating millions of clicks in minutes during viral events. Their origin-based redirect architecture could not scale fast enough during traffic spikes, leading to increased latency and occasional timeouts. The database layer, despite aggressive caching in Redis, was being overwhelmed by cache misses during the long tail of less-popular links.
Solution
Bitly deployed a multi-tier caching architecture. The first tier is a global CDN (edge caching) that stores redirect mappings at points of presence worldwide. The second tier is a regional Redis cluster that handles cache misses from the CDN. The third tier is the origin database, which only serves requests that miss both the CDN and Redis caches. Popular links are cached at the edge with a short TTL (5 minutes), so viral traffic is almost entirely absorbed before it reaches the origin. A background process pre-warms the CDN cache for links that are trending based on real-time click velocity.
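The lookup order described above can be sketched as a read-through chain. This is an illustration of the tiering idea only, not Bitly's implementation: the `TTLCache` class, the `resolve` function, and the 1-hour regional TTL are assumptions, while the 5-minute edge TTL comes from the text. Plain dictionaries stand in for the CDN, Redis, and the database.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """Tiny in-memory cache whose entries expire after a fixed TTL."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[str, float]] = {}  # code -> (url, expires_at)

    def get(self, code: str) -> Optional[str]:
        entry = self._entries.get(code)
        if entry is None:
            return None
        long_url, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[code]  # drop the expired entry lazily
            return None
        return long_url

    def put(self, code: str, long_url: str) -> None:
        self._entries[code] = (long_url, self._clock() + self._ttl)


def resolve(code: str, edge: TTLCache, regional: TTLCache,
            origin: Dict[str, str]) -> Tuple[Optional[str], str]:
    """Look up a code tier by tier; return (long_url, tier that answered)."""
    url = edge.get(code)
    if url is not None:
        return url, "edge"
    url = regional.get(code)
    if url is not None:
        edge.put(code, url)  # refill the tier above on the way back
        return url, "regional"
    url = origin.get(code)
    if url is not None:
        regional.put(code, url)
        edge.put(code, url)
        return url, "origin"
    return None, "miss"


edge_cache = TTLCache(ttl_seconds=300)       # 5 minutes, as in the text
regional_cache = TTLCache(ttl_seconds=3600)  # assumed value for illustration
```

With these caches, the first request for a code is answered by the origin and fills both tiers; repeat requests within five minutes are answered at the edge, and requests after that fall back to the regional tier until its own TTL expires.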
Outcome
Edge caching absorbed over 95% of redirect traffic, reducing origin database QPS by 20x. P99 redirect latency dropped from 120ms to under 15ms globally because most redirects were served from the nearest CDN edge location. During viral events, the system handled 10x normal traffic without any degradation because the CDN auto-scaled at the edge. Infrastructure costs decreased by 40% because fewer origin servers were needed to handle the residual traffic.