Object Storage (S3, GCS)

Object storage provides a flat namespace of buckets and keys for storing unstructured data (images, videos, backups, logs) at virtually unlimited scale. Amazon S3 and Google Cloud Storage offer 11 nines of durability, storage tiering for cost optimization, and event-driven integrations that make them the backbone of modern data architectures.

Overview

Object storage is the dominant storage paradigm for unstructured data in cloud-native architectures. Unlike file systems (which organize data in hierarchical directories) or block storage (which provides raw disk volumes), object storage uses a flat namespace: each object is identified by a bucket (container) and a key (unique identifier within the bucket), with no real directories or hierarchy. An object consists of the data itself (a blob of bytes, up to 5 TB in S3) and metadata (key-value pairs describing the object), addressed by its bucket and key. This simplicity lets object storage systems scale to exabytes of data across trillions of objects without the metadata overhead of hierarchical file systems.
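To make the flat namespace concrete, here is a minimal in-memory sketch. It is not how S3 or GCS is implemented; the `bucket` dictionary and the `list_objects` helper are hypothetical names used only to show that "directories" can be simulated purely from key strings with a prefix and a delimiter.

```python
from typing import Dict, List, Tuple

# Toy "bucket": a flat mapping from key string to bytes. There are no directories.
bucket: Dict[str, bytes] = {
    "images/2024/photo.jpg": b"...",
    "images/2024/cat.png": b"...",
    "images/2023/old.jpg": b"...",
    "readme.txt": b"...",
}


def list_objects(store: Dict[str, bytes], prefix: str = "",
                 delimiter: str = "") -> Tuple[List[str], List[str]]:
    """Return (keys, common_prefixes) for keys that start with `prefix`.

    Keys that contain `delimiter` after the prefix are rolled up into a
    "common prefix", which is what makes a listing look like a folder view.
    """
    keys: List[str] = []
    common = set()
    for key in sorted(store):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter and delimiter in rest:
            common.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            keys.append(key)
    return keys, sorted(common)


print(list_objects(bucket, delimiter="/"))
print(list_objects(bucket, prefix="images/", delimiter="/"))
print(list_objects(bucket, prefix="images/2024/", delimiter="/"))
```

The "folders" in the output exist only as shared key prefixes; deleting every object under `images/2023/` would make that prefix disappear from the listing.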

Amazon S3 (Simple Storage Service), launched in 2006, pioneered cloud object storage and remains the industry standard. S3 is designed for 99.999999999% (11 nines) annual durability, which it achieves by automatically storing each object redundantly across a minimum of three Availability Zones within a region (the One Zone storage classes are the exception). Put differently, if you store 10 million objects, you can expect on average to lose one object every 10,000 years. S3 has provided strong read-after-write consistency since December 2020; previously, it offered only eventual consistency for overwrite PUTs and DELETEs, which caused subtle bugs in data pipelines. Google Cloud Storage is likewise designed for 11 nines of annual durability, offers strong consistency, and exposes an S3-interoperable XML API at competitive pricing.
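The durability arithmetic can be checked with a few lines of Python. This sketch assumes that 11 nines is read as an average annual loss probability of 10^-11 per object; the function names are illustrative, not part of any SDK.

```python
from fractions import Fraction

# Assumption: 99.999999999% annual durability means each object has an
# average annual loss probability of 1 in 10^11.
ANNUAL_LOSS_PROBABILITY = Fraction(1, 10**11)


def expected_losses_per_year(num_objects: int) -> Fraction:
    """Expected number of objects lost per year for a given object count."""
    return num_objects * ANNUAL_LOSS_PROBABILITY


def years_per_expected_loss(num_objects: int) -> Fraction:
    """Average number of years between single-object losses."""
    return 1 / expected_losses_per_year(num_objects)


print(years_per_expected_loss(10_000_000))        # 10000
print(expected_losses_per_year(100_000_000_000))  # 1
```

Exact fractions are used instead of floats so the results are not distorted by rounding at this many decimal places.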

Storage classes (or tiers) are a defining feature of object storage, enabling dramatic cost optimization based on access patterns. The main S3 storage classes are: Standard (frequent access, lowest latency), Intelligent-Tiering (automatic movement between tiers based on access patterns), Standard-IA (Infrequent Access, lower storage cost, higher retrieval cost), One Zone-IA (single-AZ, lower cost), Glacier Instant Retrieval (archive with millisecond retrieval), Glacier Flexible Retrieval (archive with minutes-to-hours retrieval), and Glacier Deep Archive (lowest cost, 12-48 hour retrieval). Lifecycle policies automatically transition objects between tiers based on age -- for example, moving access logs from Standard to Standard-IA after 30 days and to Glacier after 90 days. This tiered approach can reduce storage costs by 60-90% for data with declining access frequency.
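The access-log example above might be expressed as follows. The dictionary is modeled on the shape of an S3 lifecycle configuration (as accepted by SDK calls such as boto3's `put_bucket_lifecycle_configuration`), but the rule ID and the `logs/` prefix are hypothetical, and `storage_class_for` is only a local simulation of how the rule plays out over time, not something S3 provides.

```python
# Lifecycle rule for the access-log example: Standard-IA after 30 days,
# Glacier Flexible Retrieval (class name "GLACIER") after 90 days.
# The rule ID and prefix are made up for illustration.
lifecycle_configuration = {
    "Rules": [
        {
            "ID": "archive-access-logs",
            "Filter": {"Prefix": "logs/"},
            "Status": "Enabled",
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
        }
    ]
}


def storage_class_for(key: str, age_days: int,
                      config: dict = lifecycle_configuration) -> str:
    """Simulate which storage class an object would be in at a given age."""
    current = "STANDARD"
    for rule in config["Rules"]:
        if rule["Status"] != "Enabled":
            continue
        if not key.startswith(rule["Filter"]["Prefix"]):
            continue
        # Apply transitions in order of age; the last one reached wins.
        for transition in sorted(rule["Transitions"], key=lambda t: t["Days"]):
            if age_days >= transition["Days"]:
                current = transition["StorageClass"]
    return current


for age in (10, 45, 120):
    print(age, storage_class_for("logs/2024-01-01.log", age))
```

Objects outside the `logs/` prefix never match the rule and stay in Standard, which is how a policy can be scoped to one kind of data.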

Object storage has become the foundation of modern data architectures beyond simple blob storage. Data lakes store structured and semi-structured data (Parquet, ORC, JSON) in S3, queried directly by engines like Athena, Spark, and Trino without loading into a database. The lakehouse pattern (Delta Lake, Apache Iceberg) adds ACID transactions and schema enforcement on top of object storage, enabling warehouse-like capabilities at data lake costs. Event notifications (S3 Event Notifications, Pub/Sub notifications for Cloud Storage) trigger serverless functions when objects are created, deleted, or modified, enabling event-driven processing pipelines. Presigned URLs provide time-limited access to specific objects without sharing credentials, enabling secure direct uploads from browsers or mobile apps. Multipart upload enables parallel upload of large files by splitting them into parts, uploading parts independently (and retrying failed parts), and assembling them server-side.
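The idea behind a presigned URL can be illustrated with a deliberately simplified toy scheme. This is not the Signature Version 4 algorithm that S3 actually uses; the secret, the `storage.example` host, and both function names are hypothetical. It only shows the principle: the server signs the object location plus an expiry time, and whoever holds the URL can use it until it expires without ever seeing the secret.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlparse

SECRET_KEY = b"server-side-secret"  # hypothetical; stays on the server


def _signature(bucket: str, key: str, expires_at: int) -> str:
    message = f"GET\n{bucket}\n{key}\n{expires_at}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def sign_url(bucket: str, key: str, expires_at: int) -> str:
    """Build a URL that embeds an expiry timestamp and a signature over it."""
    query = urlencode({"expires": expires_at,
                       "signature": _signature(bucket, key, expires_at)})
    return f"https://{bucket}.storage.example/{key}?{query}"


def verify_url(url: str, now: int) -> bool:
    """Accept the URL only if it is unexpired and the signature matches."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    bucket = parsed.netloc.split(".", 1)[0]
    key = parsed.path.lstrip("/")
    expires_at = int(params["expires"][0])
    expected = _signature(bucket, key, expires_at)
    return now <= expires_at and hmac.compare_digest(expected, params["signature"][0])


issued_at = 1_700_000_000
url = sign_url("photos", "images/photo.jpg", expires_at=issued_at + 900)
print(verify_url(url, now=issued_at + 100))    # True: within 15 minutes
print(verify_url(url, now=issued_at + 1000))   # False: expired
print(verify_url(url.replace("photo.jpg", "other.jpg"), now=issued_at + 100))  # False
```

Because the object key is part of the signed message, a URL issued for one object cannot be edited to reach another.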

Key Points
  • 1. The flat namespace (bucket + key) eliminates hierarchical metadata overhead, enabling object storage to scale to exabytes. What appears to be a directory structure (s3://bucket/images/2024/photo.jpg) is actually a single key string -- there is no 'images' directory, just an object whose key contains slashes.
  • 2. 11 nines of durability (99.999999999%) is achieved by storing each object redundantly across a minimum of 3 Availability Zones. This durability level corresponds to losing, on average, about 1 object per 100 billion objects per year -- far beyond what a typical on-premises storage system achieves.
  • 3. Storage classes enable 60-90% cost savings by matching access patterns to pricing tiers. Lifecycle policies automate transitions: 'move to IA after 30 days, Glacier after 90 days, Deep Archive after 365 days' runs without manual intervention and can be configured per key prefix.
  • 4. Strong read-after-write consistency (S3 since December 2020) guarantees that a GET immediately after a PUT returns the latest version, and that list operations reflect the latest writes. This eliminated a major source of data pipeline bugs where list operations or reads returned stale data.
  • 5. Presigned URLs provide time-limited access to specific objects without sharing credentials. This enables patterns like direct browser uploads to S3 (bypassing the application server for large files) and temporary download links shared via email or messaging -- all without exposing AWS credentials.
  • 6. Multipart upload splits large objects into parts (minimum 5 MB each except the last, up to 10,000 parts) that are uploaded independently and in parallel. Failed parts can be retried without re-uploading the entire object. This is essential for reliable upload of files larger than 100 MB over unreliable networks.
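
The multipart limits in the last point translate into a simple planning step before any bytes are sent. The sketch below only computes part boundaries; it does not talk to S3. It assumes the 5 MB minimum is 5 MiB (5 × 1024 × 1024 bytes) and uses a hypothetical default part size of 8 MiB.

```python
from typing import List, Tuple

MIN_PART_SIZE = 5 * 1024 * 1024  # assumed 5 MiB minimum for every part but the last
MAX_PARTS = 10_000


def plan_parts(object_size: int,
               part_size: int = 8 * 1024 * 1024) -> List[Tuple[int, int, int]]:
    """Return (part_number, start_offset, length) tuples covering the object."""
    if object_size <= 0:
        raise ValueError("object_size must be positive")
    if part_size < MIN_PART_SIZE:
        raise ValueError("part_size is below the minimum part size")
    # Grow the part size if the object would otherwise exceed MAX_PARTS parts.
    smallest_allowed = -(-object_size // MAX_PARTS)  # ceiling division
    part_size = max(part_size, smallest_allowed)

    parts = []
    offset = 0
    number = 1
    while offset < object_size:
        length = min(part_size, object_size - offset)
        parts.append((number, offset, length))
        offset += length
        number += 1
    return parts


for part in plan_parts(20 * 1024 * 1024):
    print(part)
```

Each tuple can be uploaded and retried independently, which is what makes a failed part cheap to recover from; only the final part is allowed to be shorter than the minimum.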
Simple Example

The Airport Luggage System Analogy

Object storage works like an airport luggage storage service. You hand over your bag (object) and receive a claim ticket with a number (key) and the terminal name (bucket). To retrieve your bag, you present the terminal name and ticket number -- the system does not care about the bag's contents, size, or type. You cannot open the bag and search inside it from the storage counter (no query capability on the blob). You pay based on how long the bag is stored and how large it is (storage cost), plus a fee each time you check or retrieve it (API cost). If you know you will not need the bag for months, you can store it in a cheaper back warehouse (Glacier) -- it takes longer to retrieve, but costs a fraction of the front counter storage.

Real-World Examples

Netflix

Netflix stores all media assets -- source masters, transcoded video files, artwork, and subtitles -- in Amazon S3. Each movie or TV show has hundreds of encoded variants (different resolutions, codecs, and audio tracks) totaling petabytes of data. S3's durability guarantees reduce the need for Netflix to maintain separate backup copies of master content. Netflix's own CDN, Open Connect, caches popular content at edge locations, but S3 serves as the authoritative source for all media, with lifecycle policies archiving older, less-accessed content to lower-cost storage tiers.

Dropbox

Dropbox originally stored all user files in Amazon S3, but as they grew to exabyte scale, they built Magic Pocket -- a custom S3-compatible object storage system running on their own hardware. Magic Pocket replicates data across multiple data centers with erasure coding (instead of full replication) for better storage efficiency. The S3-compatible API means Dropbox's application code did not need to change. This migration reduced Dropbox's storage costs by approximately 50% compared to S3 pricing, demonstrating the economics of building custom infrastructure at extreme scale.

Airbnb

Airbnb stores over 10 petabytes of images in Amazon S3, including listing photos, user profile images, and user-generated content. Images are uploaded via presigned URLs directly from the mobile app to S3 (bypassing Airbnb's application servers), then processed by a serverless pipeline triggered by S3 event notifications -- resizing, optimizing, and generating thumbnails. Lifecycle policies move rarely-accessed older listing images to Standard-IA storage class, reducing storage costs by 40% without affecting retrieval performance for active listings.

Trade-Offs
Aspect | Description
Durability vs Latency | Object storage provides extreme durability (11 nines) through multi-AZ replication, but first-byte latency is typically 50-200ms -- orders of magnitude slower than local SSD (0.1ms) or block storage (1-5ms). Object storage is not suitable for low-latency random access patterns. Use a CDN or caching layer for latency-sensitive access to frequently-read objects.
Scalability vs Query Capability | Object storage scales to exabytes with no capacity planning, but you can only access objects by exact key. There is no ability to query object contents, filter by metadata efficiently at scale, or perform JOINs. Query engines (Athena, Spark) provide SQL-like access to structured data stored in S3, but at higher latency than a database query.
Cost Optimization vs Retrieval Speed | Storage tiering (Standard -> IA -> Glacier -> Deep Archive) can reduce costs by 90%, but lower tiers have higher retrieval costs and longer retrieval times (minutes to hours for Glacier, up to 48 hours for Deep Archive). Lifecycle policies must be carefully designed to avoid moving frequently-accessed data to cold tiers, which would increase total cost due to retrieval fees.
Immutability vs Update Patterns | Objects in S3 are immutable -- you cannot update part of an object, only replace it entirely. This is ideal for write-once data (images, videos, log archives) but awkward for data that changes frequently. Workloads requiring frequent small updates are better served by a database or block storage. The lakehouse pattern (Delta Lake, Iceberg) works around this by treating S3 objects as immutable data files and managing updates through metadata and compaction.
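
The cost-versus-retrieval trade-off can be made concrete with a small model. The per-GB prices below are illustrative assumptions (only the $0.023 Standard storage price appears in this article), and the model ignores request fees, minimum storage durations, and minimum billable object sizes, all of which matter in practice.

```python
# Illustrative per-GB monthly prices in USD. These are assumptions for the
# sketch; check current pricing for your region before relying on them.
PRICES = {
    "STANDARD":    {"storage": 0.023,  "retrieval": 0.0},
    "STANDARD_IA": {"storage": 0.0125, "retrieval": 0.01},
}


def monthly_cost(storage_class: str, stored_gb: float, retrieved_gb: float) -> float:
    """Storage cost plus per-GB retrieval fees for one month."""
    price = PRICES[storage_class]
    return stored_gb * price["storage"] + retrieved_gb * price["retrieval"]


def cheaper_class(stored_gb: float, retrieved_gb: float) -> str:
    """Pick the class with the lower modeled monthly cost."""
    return min(PRICES, key=lambda c: monthly_cost(c, stored_gb, retrieved_gb))


# 1 TB stored, 10% of it read back each month: the colder tier is cheaper.
print(cheaper_class(stored_gb=1000, retrieved_gb=100))
# 1 TB stored, every byte read twice a month: retrieval fees erase the savings.
print(cheaper_class(stored_gb=1000, retrieved_gb=2000))
```

Under these assumed prices the break-even point is reading back roughly 1.05 times the stored volume per month; data accessed more often than that costs more in the infrequent-access tier than in Standard.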
Case Study

Airbnb's Image Pipeline -- 10PB+ on S3 with Lifecycle Optimization

Scenario

Airbnb hosts millions of property listings, each with dozens of high-resolution photos uploaded by hosts. The image storage requirements grew to over 10 petabytes, with storage costs becoming a significant infrastructure expense. New listing photos are accessed frequently during the first few months (hosts checking their listings, guests browsing), but access drops dramatically after the listing's initial activity period. The upload process also created a bottleneck: routing large image files through Airbnb's application servers consumed bandwidth and compute resources.

Solution

Airbnb implemented a three-part architecture on S3: (1) Presigned URLs enable direct upload from mobile apps and browsers to S3, eliminating the application server as a bottleneck. The app requests a presigned URL from the API, then uploads the image directly to S3. (2) S3 Event Notifications trigger an AWS Lambda function when a new image is uploaded, which processes the image (resize, optimize, generate thumbnails) and stores the variants back in S3. (3) Lifecycle policies transition images from Standard to Standard-IA after 60 days and to Glacier Instant Retrieval after 180 days, matching the access frequency decline.
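
Step (2) could look roughly like the handler below. This is a sketch under stated assumptions, not Airbnb's code: the `thumbnails/` naming scheme, the widths, and the helper names are hypothetical, and the actual image resizing is left out. It relies on the S3 notification event layout (a `Records` list with `eventName`, `s3.bucket.name`, and a URL-encoded `s3.object.key`).

```python
from typing import Dict, List, Tuple
from urllib.parse import unquote_plus


def uploaded_objects(event: Dict) -> List[Tuple[str, str]]:
    """Extract (bucket, key) pairs for newly created objects from an event."""
    results = []
    for record in event.get("Records", []):
        if not record.get("eventName", "").startswith("ObjectCreated:"):
            continue
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in the notification (spaces become "+").
        key = unquote_plus(record["s3"]["object"]["key"])
        results.append((bucket, key))
    return results


def variant_key(key: str, width: int) -> str:
    """Hypothetical naming scheme for resized variants."""
    return f"thumbnails/{width}/{key}"


def handler(event: Dict, context: object) -> List[Dict]:
    """Lambda-style entry point: plan which variants to generate per upload."""
    return [
        {"bucket": bucket, "source": key,
         "variants": [variant_key(key, width) for width in (200, 800)]}
        for bucket, key in uploaded_objects(event)
    ]


sample_event = {
    "Records": [
        {"eventName": "ObjectCreated:Put",
         "s3": {"bucket": {"name": "listing-photos"},
                "object": {"key": "listings/42/photo+1.jpg"}}},
        {"eventName": "ObjectRemoved:Delete",
         "s3": {"bucket": {"name": "listing-photos"},
                "object": {"key": "listings/42/old.jpg"}}},
    ]
}
print(handler(sample_event, None))
```

Writing variants under a separate prefix (or to a separate bucket) that the notification does not cover avoids the function re-triggering itself on its own output.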

Outcome

Storage costs decreased by 40% through lifecycle optimization, saving millions of dollars annually. Direct-to-S3 uploads reduced application server CPU utilization by 30% and eliminated upload timeout errors for large images over slow connections. The event-driven processing pipeline processed images within seconds of upload, with automatic scaling during peak listing creation periods. S3's 11-nines durability eliminated the need for a separate backup system, simplifying the infrastructure and reducing operational overhead.

Common Mistakes
  • Treating S3 keys as a file system hierarchy. S3 has a flat namespace -- what looks like directories (photos/2024/january/) is just a key prefix. The ListObjects API with prefix filtering simulates directory listing, but there are no actual directories. Do not design systems that depend on hierarchical directory operations like rename or move: "renaming" a prefix means copying and then deleting every object under it, one by one.
  • Not enabling versioning for critical data. Without versioning, a DELETE or overwrite permanently destroys the previous object. Enabling S3 versioning preserves all versions, protecting against accidental deletion and enabling point-in-time recovery. Combine with lifecycle policies to expire old versions and control costs.
  • Using S3 for latency-sensitive random access. S3's first-byte latency (50-200ms) is acceptable for batch processing and large file downloads but unacceptable for database-like access patterns. Use ElastiCache, DynamoDB, or a CDN for low-latency access to frequently-read data.
  • Ignoring S3 request pricing for small objects. While storage costs are low ($0.023/GB/month for Standard), each PUT, GET, and LIST API call incurs a cost ($0.005 per 1,000 PUTs). Storing millions of tiny objects (under 1 KB) can result in API costs exceeding storage costs. Batch small objects into larger ones where possible.
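
The small-object pricing trap in the last point is easy to quantify using the two prices quoted above ($0.023/GB/month and $0.005 per 1,000 PUTs). The function name and the 64 MiB batch size are illustrative choices, and GET and LIST charges are left out.

```python
import math

STORAGE_PRICE_PER_GB_MONTH = 0.023  # Standard storage price quoted above
PUT_PRICE_PER_1000 = 0.005          # PUT request price quoted above


def put_and_storage_cost(num_objects: int, object_size_bytes: int):
    """Return (one-time PUT cost, monthly storage cost) in USD."""
    put_cost = num_objects / 1000 * PUT_PRICE_PER_1000
    stored_gb = num_objects * object_size_bytes / 1024 ** 3
    return put_cost, stored_gb * STORAGE_PRICE_PER_GB_MONTH


# 100 million objects of 1 KB each, written individually.
tiny_put, tiny_storage = put_and_storage_cost(100_000_000, 1024)

# The same bytes batched into 64 MiB objects (batch size is an arbitrary choice).
batch_size = 64 * 1024 * 1024
num_batches = math.ceil(100_000_000 * 1024 / batch_size)
batch_put, batch_storage = put_and_storage_cost(num_batches, batch_size)

print(f"tiny objects:    PUT ${tiny_put:.2f}, storage ${tiny_storage:.2f}/month")
print(f"batched objects: PUT ${batch_put:.4f}, storage ${batch_storage:.2f}/month")
```

With individual 1 KB objects the upload requests alone cost far more than a month of storage; batching shrinks the request bill to a rounding error while the storage bill stays essentially the same.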
Test Your Understanding

1. What does 99.999999999% (11 nines) durability mean in the context of Amazon S3?

2. Why would you use presigned URLs for file uploads to S3 instead of routing uploads through your application server?

3. When would moving objects to S3 Glacier Deep Archive be a poor cost optimization strategy?
