A Content Delivery Network caches static and dynamic content at edge Points of Presence worldwide, reducing latency by serving content close to users and reducing origin load through tiered caching, cache-key optimization, and intelligent routing.
A Content Delivery Network (CDN) is a globally distributed system of servers that caches and serves content from locations geographically close to end users. The fundamental insight is simple: the speed of light imposes a minimum latency proportional to physical distance. A user in Tokyo fetching data from a server in Virginia experiences roughly 55ms of one-way latency from propagation in fiber alone over the great-circle distance (about 10,900 km), and typically 70ms or more once real cable routes and routing overhead are included. A CDN PoP (Point of Presence) in Tokyo can serve the same content in under 5ms. This latency reduction compounds across the dozens of resources a modern web page requires, making CDNs one of the most impactful performance optimizations in system design.
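The distance argument above can be checked with a back-of-the-envelope calculation. The sketch below assumes light in fiber travels at about two thirds of its vacuum speed and uses approximate coordinates for Tokyo and Ashburn, Virginia; both are illustrative assumptions, not measurements. Under these assumptions the great-circle bound comes out at about 55 ms one way. Real cable routes are longer than the great circle, which is why observed figures are higher.

```python
import math

# Assumed constants for a rough estimate, not measured values.
SPEED_OF_LIGHT_KM_S = 299_792.458   # in vacuum
FIBER_VELOCITY_FACTOR = 0.67        # assumption: light in glass moves at ~2/3 c
EARTH_RADIUS_KM = 6_371.0


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def one_way_fiber_ms(distance_km):
    """Propagation delay only; ignores routing, queuing, and cable detours."""
    return distance_km / (SPEED_OF_LIGHT_KM_S * FIBER_VELOCITY_FACTOR) * 1000.0


# Approximate coordinates (assumptions): Tokyo and Ashburn, Virginia.
tokyo = (35.68, 139.69)
virginia = (39.04, -77.49)

distance = great_circle_km(*tokyo, *virginia)
delay = one_way_fiber_ms(distance)
print(f"{distance:.0f} km, {delay:.1f} ms one way (propagation only)")
```

A round trip doubles this figure, and a TCP plus TLS handshake needs several round trips before the first byte of content arrives, which is where a nearby PoP pays off most.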
CDN architectures use two primary models for populating edge caches. Push CDNs require the origin to upload content to the CDN proactively -- suitable for predictable content like software updates or video libraries, but operationally burdensome for dynamic websites. Pull CDNs (origin pull) are the dominant model: the edge server fetches content from the origin on the first request (cache miss), stores it, and serves subsequent requests from cache (cache hit). The cache key is typically the full URL plus selected Vary headers (e.g., Accept-Encoding, Accept-Language), determining which requests are considered equivalent. Cache hit ratio -- the percentage of requests served from cache without contacting the origin -- is the single most important CDN performance metric. Well-configured CDNs achieve 90-99% cache hit rates for static content.
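To make the origin-pull flow concrete, here is a minimal in-memory sketch of an edge cache that builds its key from the URL plus selected request headers and tracks its own hit ratio. The names `PullThroughCache`, `cache_key`, and `VARY_HEADERS` are hypothetical, and the sketch leaves out TTLs, eviction, and response-driven `Vary` handling that a real CDN would need.

```python
from typing import Callable, Dict, Mapping, Tuple

# Assumption: these request headers are the ones that take part in the key.
VARY_HEADERS = ("accept-encoding", "accept-language")

CacheKey = Tuple[str, Tuple[Tuple[str, str], ...]]


def cache_key(url: str, headers: Mapping[str, str]) -> CacheKey:
    """Full URL plus the normalized values of the selected headers."""
    lowered = {k.lower(): v.strip().lower() for k, v in headers.items()}
    varied = tuple((h, lowered.get(h, "")) for h in VARY_HEADERS)
    return (url, varied)


class PullThroughCache:
    """Origin-pull edge cache: fetch on a miss, serve from memory on a hit."""

    def __init__(self, fetch_origin: Callable[[str, Mapping[str, str]], bytes]):
        self._fetch_origin = fetch_origin
        self._store: Dict[CacheKey, bytes] = {}
        self.hits = 0
        self.misses = 0

    def get(self, url: str, headers: Mapping[str, str]) -> bytes:
        key = cache_key(url, headers)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        body = self._fetch_origin(url, headers)   # cache miss: go to origin
        self._store[key] = body
        return body

    @property
    def hit_ratio(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

Normalizing header names and values before building the key matters: without it, `gzip` and `GZIP` would be cached as separate objects and the hit ratio would drop for no benefit.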
Modern CDNs use a tiered cache hierarchy to balance hit rates against origin load. At the outermost layer, hundreds of edge PoPs serve users directly. Behind them, regional cache tiers aggregate traffic from nearby edges. At the center, an origin shield acts as a single point of contact with the actual origin server. When an edge misses, it queries the regional tier; if that misses, it queries the origin shield; only if the shield misses does the request reach the origin. This hierarchy ensures that even content with moderate popularity has a high probability of being cached somewhere in the hierarchy, and the origin sees dramatically reduced request volume. For example, if 100 edge PoPs all miss on the same URL simultaneously, the origin shield collapses these into a single origin fetch (request coalescing).
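The tiered lookup and request coalescing described above can be sketched with one small class reused at every tier. This is an illustrative model only: `CoalescingCache`, the tier names, and the four-region layout are assumptions, and real CDNs add TTLs, eviction, and network transport between tiers.

```python
import threading
import time
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Dict


class CoalescingCache:
    """One cache tier. Concurrent misses on a key share a single upstream fetch."""

    def __init__(self, name: str, upstream: Callable[[str], bytes]):
        self.name = name
        self._upstream = upstream            # next tier's get(), or the origin
        self._store: Dict[str, bytes] = {}
        self._inflight: Dict[str, Future] = {}
        self._lock = threading.Lock()
        self.upstream_fetches = 0

    def get(self, url: str) -> bytes:
        with self._lock:
            if url in self._store:           # cache hit
                return self._store[url]
            fut = self._inflight.get(url)
            leader = fut is None
            if leader:                       # first miss becomes the leader
                fut = Future()
                self._inflight[url] = fut
                self.upstream_fetches += 1
        if not leader:
            return fut.result()              # followers wait for the leader
        try:
            body = self._upstream(url)
            with self._lock:
                self._store[url] = body
            fut.set_result(body)
            return body
        except Exception as exc:
            fut.set_exception(exc)           # propagate the failure to followers
            raise
        finally:
            with self._lock:
                self._inflight.pop(url, None)


def demo():
    origin_calls = []

    def origin(url: str) -> bytes:
        origin_calls.append(url)
        time.sleep(0.05)                     # slow origin so that misses overlap
        return b"payload for " + url.encode()

    shield = CoalescingCache("shield", origin)
    regionals = [CoalescingCache(f"regional-{r}", shield.get) for r in range(4)]
    edges = [CoalescingCache(f"edge-{i}", regionals[i % 4].get) for i in range(100)]

    with ThreadPoolExecutor(max_workers=100) as pool:
        bodies = list(pool.map(lambda e: e.get("/video/intro.mp4"), edges))
    return len(origin_calls), shield.upstream_fetches, bodies
```

Calling `demo()` sends the same URL through 100 edges at once. Each regional tier forwards at most one request to the shield, and the shield forwards exactly one to the origin, whether the requests overlap in time (coalescing) or arrive after the first fetch has completed (an ordinary cache hit).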
Cache invalidation is one of the hardest problems in CDN operations. TTL-based expiry (Cache-Control: max-age=3600) is simple and predictable, but a cached copy can stay stale for up to the full TTL after the origin changes. Purge APIs allow immediate invalidation of specific URLs or patterns, but a global purge still has to propagate across hundreds of PoPs, which typically takes from a few seconds to tens of seconds depending on the provider. Versioned URLs (main.abc123.js) are the most reliable approach: deploying new content under a new filename ensures browsers and CDNs fetch the new version immediately, while old versions remain cached and valid for anyone still referencing them.
Beyond caching, modern CDNs provide dynamic content acceleration through TCP optimization (larger initial congestion windows, persistent connections to origin), TLS termination at the edge (so handshake round trips cover a short distance), DDoS mitigation, WAF (Web Application Firewall), and edge compute (Cloudflare Workers, Lambda@Edge) for running application logic at the edge.
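The versioned-URL approach mentioned above is usually implemented by deriving the version from the file's content at build time. The sketch below is one hypothetical way to do that. The helper names, the 8-character digest length, and the one-year lifetime are assumptions, while `max-age=3600` is the TTL example used earlier in this article.

```python
import hashlib
from pathlib import PurePosixPath


def versioned_name(path: str, content: bytes, digest_len: int = 8) -> str:
    """Return e.g. 'static/main.<hash>.js', with <hash> derived from the content."""
    p = PurePosixPath(path)
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))


def cache_headers(is_versioned: bool) -> dict:
    """Long-lived caching for immutable, versioned assets; short TTL otherwise."""
    if is_versioned:
        # Safe to cache for a long time: new content always gets a new URL.
        return {"Cache-Control": "public, max-age=31536000, immutable"}
    # Unversioned entry points (e.g. index.html) must expire so that they
    # pick up references to newly deployed asset names.
    return {"Cache-Control": "public, max-age=3600"}


old = versioned_name("static/main.js", b"console.log('v1');")
new = versioned_name("static/main.js", b"console.log('v2');")
print(old, new, cache_headers(True))
```

Because the name changes only when the bytes change, redeploying an unchanged file keeps its URL and its cached copies, while any edit produces a URL that no cache has seen before.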
The Library Branch System Analogy
A CDN works like a library system with a central collection (origin) and neighborhood branches (edge PoPs). When you request a book (web resource), your local branch checks its shelves (edge cache). If it has the book (cache hit), you get it immediately. If not (cache miss), the branch requests it from the regional distribution center (mid-tier cache), which might have it from a nearby branch's request. Only if nobody in the region has it does the request go to the central library (origin). Once any branch gets the book, it keeps a copy on its shelf for a set period (TTL). The most popular books are available at every branch instantly; rare books require a trip to the central library but are then stocked locally for the next reader.
Netflix Open Connect
Netflix built its own CDN, Open Connect, which places custom hardware appliances (Open Connect Appliances, or OCAs) directly inside ISP networks. During off-peak hours, Netflix pre-positions popular content on these appliances based on predicted viewing patterns. During peak hours (evening), up to 95% of Netflix traffic is served from OCAs within the ISP's own network, never crossing the internet backbone. This reduces Netflix's bandwidth costs and provides users with consistent, high-quality streaming regardless of internet congestion.
Cloudflare
Cloudflare operates a CDN with PoPs in 310+ cities and, by its own figures, sits in front of roughly 20% of websites. Cloudflare's architecture uses Anycast: every PoP announces the same IP addresses, so BGP routing automatically directs each user to the topologically nearest PoP. Beyond static caching, Cloudflare Workers enables edge compute -- running JavaScript/WebAssembly at the edge for dynamic content, A/B testing, header manipulation, and API gateway logic with very low cold-start latency.
Akamai
Akamai, the pioneer of CDN technology, operates 350,000+ servers across 135+ countries. By its own estimates, Akamai's edge platform has carried up to 30% of all web traffic, and it can absorb DDoS attacks exceeding 1 Tbps. Their Intelligent Platform uses real-time network measurements to route around congestion and outages, choosing optimal paths between edge PoPs and origin servers. Akamai's SureRoute technology tests multiple paths to origin every few seconds and selects the fastest one.
| Aspect | Description |
|---|---|
| Cache Hit Rate vs Content Freshness | Higher cache hit rates (longer TTLs, broader cache keys) mean more requests served from fast edge caches, but stale content may be served. Lower TTLs ensure freshness but reduce hit rates and increase origin load. Versioned URLs achieve both high hit rates and perfect freshness by treating each version as a new, immutable resource. |
| Edge Proximity vs Cache Fragmentation | More edge PoPs mean shorter distances to users, but each PoP has its own cache. Content must be requested at each PoP independently, reducing the hit rate for low-traffic PoPs. Regional cache tiers mitigate this by aggregating cache misses, but at the cost of an extra network hop on initial misses. |
| Static Caching Simplicity vs Dynamic Acceleration Complexity | Caching static assets (images, CSS, JS) is straightforward and provides the highest benefit. Accelerating dynamic content requires more complex features: TLS termination, connection pooling to origin, smart routing, and edge compute. These features add cost and configuration complexity. |
| CDN Cost vs Origin Bandwidth Cost | CDN services charge for bandwidth (egress), requests, and features. For low-traffic sites, CDN costs may exceed origin bandwidth costs. At scale (terabytes per month), CDN egress is cheaper than cloud provider egress because CDNs negotiate peering agreements. The break-even point depends on traffic volume, content cacheability, and cloud provider pricing. |
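The cost trade-off in the last row can be explored with a simple model. The sketch below assumes a flat per-GB price for both CDN and origin egress plus an optional fixed monthly CDN fee, and it ignores per-request charges and tiered pricing. The function names and all prices in the usage lines are placeholders chosen for illustration, not real provider rates.

```python
from typing import Optional


def monthly_cost_origin_only(traffic_gb: float, origin_price: float) -> float:
    """All traffic leaves the origin directly."""
    return traffic_gb * origin_price


def monthly_cost_with_cdn(traffic_gb: float, hit_ratio: float, cdn_price: float,
                          origin_price: float, fixed_fee: float = 0.0) -> float:
    """CDN egress for every byte served, origin egress only for cache misses."""
    cdn_egress = traffic_gb * cdn_price
    origin_egress = traffic_gb * (1.0 - hit_ratio) * origin_price
    return fixed_fee + cdn_egress + origin_egress


def break_even_gb(hit_ratio: float, cdn_price: float, origin_price: float,
                  fixed_fee: float) -> Optional[float]:
    """Traffic volume at which both options cost the same, or None if the
    CDN never pays for itself under this model."""
    saving_per_gb = hit_ratio * origin_price - cdn_price
    if saving_per_gb <= 0:
        return None
    return fixed_fee / saving_per_gb


# Placeholder prices in dollars per GB and dollars per month.
volume = break_even_gb(hit_ratio=0.95, cdn_price=0.02, origin_price=0.09, fixed_fee=200.0)
print(volume)
```

The model makes the dependence on cacheability explicit: the CDN saves money per GB only when `hit_ratio * origin_price` exceeds `cdn_price`, so poorly cacheable content can make a CDN a net cost at any volume.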
Netflix Open Connect -- Building a Purpose-Built CDN
Scenario
By 2012, Netflix accounted for over 30% of peak downstream internet traffic in North America. Using third-party CDNs (Akamai, Limelight) was extremely expensive at this volume, and quality was inconsistent because CDN PoPs were shared with other customers. Netflix needed consistent, high-bitrate video delivery to a subscriber base growing from tens of millions toward hundreds of millions worldwide while controlling costs. The key characteristics of the workload were that video files are large (2-7 GB per movie at HD quality), do not change once published, and have predictable popularity patterns, which makes them well suited to being pre-positioned rather than pulled on demand.
Solution
Netflix built Open Connect, a custom CDN with two components: (1) Open Connect Appliances (OCAs) -- custom hardware with large storage (100+ TB) placed inside ISP data centers at no cost to the ISP. Netflix pre-fills OCAs with content predicted to be popular using viewing pattern analysis. (2) A control plane that directs each user's player to the optimal OCA based on ISP, location, and server load. Netflix fills OCAs during off-peak hours (2-6 AM local time) when network capacity is available, avoiding competition with peak traffic.
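The steering decision made by the control plane can be illustrated with a small ranking function. This is not Netflix's actual algorithm. It is a hypothetical sketch that uses only the three inputs named above (ISP, location, and server load), plus an assumed flag for whether the title has been pre-positioned on the appliance. The `Appliance` fields, `pick_appliance`, and the 0.9 load cutoff are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Appliance:
    name: str
    isp: str
    region: str
    load: float        # 0.0 (idle) to 1.0 (saturated)
    has_title: bool    # assumption: was the title pre-filled on this appliance?


def pick_appliance(candidates: List[Appliance], client_isp: str,
                   client_region: str, max_load: float = 0.9) -> Optional[Appliance]:
    """Prefer an appliance in the client's ISP, then its region, then low load."""
    usable = [a for a in candidates if a.has_title and a.load < max_load]
    if not usable:
        return None    # caller would fall back to another delivery path

    def rank(a: Appliance):
        # False sorts before True, so matching ISP and region rank first.
        return (a.isp != client_isp, a.region != client_region, a.load)

    return min(usable, key=rank)


fleet = [
    Appliance("oca-1", "isp-a", "tokyo", load=0.95, has_title=True),
    Appliance("oca-2", "isp-a", "osaka", load=0.40, has_title=True),
    Appliance("oca-3", "isp-b", "tokyo", load=0.10, has_title=True),
    Appliance("oca-4", "isp-a", "tokyo", load=0.20, has_title=False),
]
choice = pick_appliance(fleet, client_isp="isp-a", client_region="tokyo")
print(choice.name if choice else "no appliance available")
```

In this example `oca-1` is skipped as overloaded and `oca-4` lacks the title, so the function picks `oca-2`: staying inside the client's ISP is ranked above being in the same city, which reflects the goal of keeping traffic off the internet backbone.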
Outcome
Open Connect serves roughly 95% of Netflix traffic directly from within ISP networks, meaning video streams rarely cross the internet backbone. This reduced Netflix's bandwidth costs by an estimated 60-70% compared to third-party CDNs. Streaming quality improved because OCA-to-user paths are short and within the ISP's own network, avoiding internet congestion. ISPs benefit because Netflix traffic stays local, reducing their peering and transit costs. Open Connect handles over 100 Tbps of peak traffic globally.
1. What is the purpose of an origin shield in a CDN architecture?
2. Why are versioned URLs (e.g., main.abc123.js) preferred over query-string cache busting (e.g., main.js?v=123)?