Serverless & FaaS

Serverless computing abstracts away server management entirely, letting developers deploy individual functions that scale automatically from zero to millions of invocations. Function-as-a-Service (FaaS) platforms like AWS Lambda charge per invocation and bill compute time at millisecond granularity.

Overview

Serverless computing represents a shift in the cloud abstraction ladder: from managing servers (IaaS) to managing containers (CaaS) to managing nothing but code (FaaS). In a serverless model, the cloud provider handles all infrastructure concerns -- provisioning, patching, scaling, and high availability. Developers write functions (small units of code with a single entry point), configure event triggers, and deploy. The platform instantiates the function on demand, scales it horizontally to match incoming request volume, and tears it down when idle. There are no idle servers consuming budget.

Function-as-a-Service is the most common serverless execution model. AWS Lambda (launched 2014) popularized the approach: you upload a function handler, define triggers (API Gateway, S3 events, SQS messages, CloudWatch schedules), and Lambda handles everything else. Functions run in isolated micro-VMs (Firecracker). Warm invocations start within a few milliseconds, while cold starts typically add 100-500ms for lightweight runtimes. Concurrency scales automatically up to the account quota: 1,000 concurrent executions per region by default, which can be raised through a quota increase (reserved concurrency sets aside part of that quota for one function rather than extending it). Google Cloud Functions, Azure Functions, and Cloudflare Workers offer similar models with varying runtime support, execution limits, and pricing.
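The handler-plus-trigger model can be sketched as below. This is a minimal illustration, assuming a Python runtime and an S3 "object created" trigger; the function name `handler` and the returned dictionary shape are choices made for this example, not requirements of the platform beyond the `(event, context)` signature.

```python
from urllib.parse import unquote_plus


def handler(event, context):
    """Entry point the platform calls once per S3 event notification."""
    processed = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications
        # (for example, spaces become "+").
        key = unquote_plus(record["s3"]["object"]["key"])
        processed.append(f"s3://{bucket}/{key}")
    # The return value matters for synchronous callers; asynchronous
    # triggers such as S3 ignore it.
    return {"processed": processed}
```

Because the handler is a plain function, it can be exercised locally by passing a hand-built event dictionary and `None` for the context.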

Beyond FaaS, the serverless ecosystem includes managed databases (DynamoDB, Aurora Serverless, Neon), messaging (EventBridge, SNS), storage (S3), and orchestration (Step Functions, Durable Functions). A fully serverless architecture composes these managed services: an API Gateway triggers Lambda functions, which read/write to DynamoDB, publish events to EventBridge, and orchestrate multi-step workflows via Step Functions. This composition eliminates operational overhead but creates deep vendor coupling.

Serverless is not universally optimal. Cold starts add latency (1-10 seconds for JVM or .NET runtimes, 100-300ms for Node.js/Python). Functions have execution time limits (15 minutes on Lambda). Statelessness means no invocation can count on state left behind by an earlier one -- connection pooling, caching, and session state must be externalized. Debugging distributed function chains is harder than debugging a monolith or even microservices. For latency-sensitive, long-running, or high-throughput steady-state workloads, containers on Kubernetes are typically more performant and cost-effective.
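Although nothing is guaranteed to persist between invocations, platforms such as Lambda commonly reuse an execution environment for later requests, so module-level objects can serve as a best-effort cache. The sketch below assumes a Python runtime; `_get_connection` and the dictionary standing in for a real client are hypothetical placeholders.

```python
# Module-level code runs once per execution environment (during a cold
# start), not once per invocation.
_connection = None   # hypothetical expensive resource, created lazily
_invocations = 0     # counts invocations served by this environment


def _get_connection():
    """Create the stand-in connection once and reuse it while warm."""
    global _connection
    if _connection is None:
        _connection = {"kind": "placeholder-client"}  # stand-in for a real client
    return _connection


def handler(event, context):
    global _invocations
    _invocations += 1
    conn = _get_connection()
    return {
        # True only for the first invocation in this environment.
        "cold_start": _invocations == 1,
        "connection_id": id(conn),
    }
```

Anything that must survive an environment being recycled still belongs in an external store; the module-level cache only saves repeated setup work on warm invocations.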

Key Points
  • FaaS functions are stateless, event-driven, and ephemeral. Each invocation is independent -- state must be stored in external services (DynamoDB, Redis, S3). This statelessness lets the platform scale horizontally up to the account's concurrency limits.
  • Cold starts occur when the platform provisions a new execution environment (download code, init runtime, load dependencies). Latency ranges from ~100ms (Node.js, Python) to 1-10s (Java, .NET). Provisioned concurrency pre-warms instances to eliminate cold starts for latency-sensitive paths.
  • Pricing is per-invocation ($0.20 per 1M requests on Lambda) plus per-GB-second of compute ($0.0000166 per GB-s). At low-to-moderate traffic, serverless can cost 70-90% less than equivalent always-on infrastructure. At high steady-state throughput, containers become cheaper.
  • Execution limits constrain workloads: 15-minute max duration (Lambda), 10 GB memory, 10 GB ephemeral storage. Long-running tasks must be decomposed into chains via Step Functions or queue-based fan-out.
  • Event sources include HTTP (API Gateway), messaging (SQS, SNS, Kafka), storage (S3 events), database (DynamoDB Streams), and scheduling (EventBridge Scheduler). The event-driven model naturally fits asynchronous, reactive architectures.
  • Vendor lock-in is the primary strategic concern. A Lambda function calling DynamoDB, Step Functions, and EventBridge is deeply coupled to AWS. Abstraction layers (Serverless Framework, SST, Terraform) reduce but do not eliminate lock-in.
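The pricing arithmetic in the list above can be worked through in a few lines. This is a rough sketch using only the two Lambda rates quoted there; it ignores any free tier, data transfer, and charges from other services, and the function name `monthly_lambda_cost` is made up for this example.

```python
REQUEST_PRICE_PER_MILLION = 0.20         # USD per 1M requests (figure quoted above)
COMPUTE_PRICE_PER_GB_SECOND = 0.0000166  # USD per GB-second (figure quoted above)


def monthly_lambda_cost(requests, avg_duration_ms, memory_mb):
    """Estimate request plus compute charges for one month of traffic."""
    request_cost = requests / 1_000_000 * REQUEST_PRICE_PER_MILLION
    # GB-seconds = invocations * seconds per invocation * configured memory in GB
    gb_seconds = requests * (avg_duration_ms / 1000) * (memory_mb / 1024)
    return request_cost + gb_seconds * COMPUTE_PRICE_PER_GB_SECOND


# 5M requests at 200 ms and 512 MB: $1.00 for requests + $8.30 for compute.
print(round(monthly_lambda_cost(5_000_000, 200, 512), 2))
```

With these assumed inputs the estimate comes to about $9.30 per month, which illustrates why low-traffic workloads are inexpensive under per-invocation billing.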
Simple Example

Vending Machine Analogy

A serverless function is like a vending machine: it sits idle (costing nothing) until someone inserts money and selects an item (an event trigger). The machine dispenses the product (executes the function) and returns to idle. You do not pay rent on a store, hire a cashier, or manage inventory shelves -- the machine handles everything. If 100 people arrive simultaneously, imagine 100 machines instantly appearing to serve them (auto-scaling), then disappearing when the crowd leaves (scale to zero). The trade-off: you cannot customize the machine's interior (runtime environment), items must fit in standard slots (execution limits), and the first vend after a long idle period takes a few extra seconds while the machine warms up (cold start).

Real-World Examples

Netflix

Netflix uses AWS Lambda for media processing pipelines: when a new title is uploaded to S3, Lambda functions trigger a cascade of tasks -- video transcoding (via Step Functions coordinating hundreds of parallel functions), thumbnail generation, subtitle processing, and metadata extraction. This event-driven pipeline processes thousands of titles daily with zero capacity planning, scaling from idle overnight to thousands of concurrent functions during content ingestion peaks.

iRobot

iRobot processes 1 billion+ IoT events per month from Roomba robots using a fully serverless architecture on AWS. MQTT messages from robots flow through IoT Core to Lambda functions that process telemetry, update robot state in DynamoDB, and trigger cleaning map generation. The architecture scales from near-zero traffic at 3 AM to peak load at 6 PM when users arrive home and start their robots, with costs directly proportional to actual usage.

Coca-Cola

Coca-Cola migrated their vending machine payment processing from EC2 instances to AWS Lambda, reducing operational costs by 65%. Each vending machine transaction triggers a Lambda function that processes payment, updates inventory, and sends telemetry. The serverless architecture eliminated the need for a 24/7 ops team managing servers for a workload that peaks during lunch hours and is near-zero at night.

Trade-Offs
Aspect: Description

Cost at Low Volume vs. High Volume: Serverless is dramatically cheaper for spiky, low-traffic workloads (scale-to-zero means zero cost when idle). At sustained high throughput (>1M requests/hour), containers on reserved instances become 3-5x cheaper because you are paying per-invocation overhead and cannot amortize fixed costs. The crossover point depends on traffic pattern, not just volume.

Developer Velocity vs. Operational Control: Serverless eliminates infrastructure management, letting developers ship features faster. But it removes control over runtime environment, connection pooling, warm-up behavior, and resource placement. Debugging production issues requires cloud-native observability (X-Ray, CloudWatch) rather than familiar tools (strace, tcpdump, SSH).

Cold Start Latency vs. Cost: Cold starts add 100ms-10s of latency for the first invocation after idle. Provisioned concurrency eliminates cold starts but re-introduces always-on costs (you are paying for pre-warmed instances). Choosing the right runtime (Node.js/Python for low cold starts vs. Java/C# for CPU performance) is a critical design decision.

Simplicity vs. Vendor Lock-in: A fully serverless architecture (Lambda + DynamoDB + Step Functions + EventBridge) is operationally simple but deeply coupled to one cloud provider. Multi-cloud portability requires abstraction layers that negate some simplicity benefits. Pragmatically, most organizations accept single-cloud lock-in for the operational advantages.
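The cost crossover can be estimated by asking how many requests a fixed always-on budget would buy on Lambda. The sketch below is a simplification under stated assumptions: it uses the Lambda rates quoted earlier in this article as defaults, and the $300 monthly container cost is a hypothetical figure, not a measured one. It also assumes steady traffic, whereas the real crossover depends on the traffic pattern.

```python
def cost_per_request(avg_duration_ms, memory_mb,
                     request_price_per_million=0.20,
                     gb_second_price=0.0000166):
    """Lambda charge for one invocation: request fee plus compute time."""
    gb_seconds = (avg_duration_ms / 1000) * (memory_mb / 1024)
    return request_price_per_million / 1_000_000 + gb_seconds * gb_second_price


def break_even_requests_per_month(fixed_monthly_cost, avg_duration_ms, memory_mb):
    """Monthly request volume at which Lambda spend equals a fixed budget."""
    return fixed_monthly_cost / cost_per_request(avg_duration_ms, memory_mb)


# Hypothetical: a $300/month always-on deployment vs. 100 ms, 1024 MB functions.
print(f"{break_even_requests_per_month(300, 100, 1024):,.0f}")
```

Under these assumptions the break-even sits at roughly 161 million requests per month. Longer durations or larger memory settings pull that threshold down quickly.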
Case Study

Zalando's Serverless Transformation

Scenario

Zalando, Europe's largest online fashion platform, operated 500+ microservices on Kubernetes. While K8s worked well for steady-state workloads, event-driven tasks (order notifications, inventory updates, image processing) required teams to maintain always-on services that were idle 80% of the time. Each team spent 20-30% of engineering time on infrastructure management rather than business logic.

Solution

Zalando adopted AWS Lambda for event-driven workloads while keeping latency-sensitive APIs on Kubernetes. They built an internal serverless framework (Zappa-based) that standardized function packaging, deployment, and observability. Event-driven workloads (order processing, email notifications, image thumbnailing) were migrated to Lambda triggered by SQS queues and S3 events. Step Functions orchestrated multi-step order fulfillment workflows. Teams used DynamoDB for event state and kept PostgreSQL on RDS for transactional data.

Outcome

Infrastructure costs for event-driven workloads dropped by 70% due to scale-to-zero economics. Engineering teams reclaimed 20-30% of their time previously spent on infrastructure management. Deployment frequency increased by 40% because serverless functions are simpler to deploy than containerized services. The hybrid architecture (K8s for APIs, Lambda for events) gave teams the right tool for each workload pattern.

Common Mistakes
  • Using serverless for latency-sensitive synchronous APIs without provisioned concurrency. Cold starts of 1-10 seconds are unacceptable for user-facing APIs at the P99 level. Either use provisioned concurrency (which adds cost) or keep these workloads on containers.
  • Building long function chains without orchestration. Chaining Lambda functions via direct invocation creates brittle, hard-to-debug pipelines. Use Step Functions or event-driven choreography (SQS between functions) for multi-step workflows -- they handle retries, error handling, and state tracking.
  • Ignoring the 15-minute execution limit. Functions that process large files or run complex computations are terminated when they hit the limit, often partway through their work. Design for chunk-based processing: split large inputs into smaller messages and fan out across parallel function invocations.
  • Over-relying on API Gateway + Lambda for all traffic. API Gateway REST APIs have a default 29-second integration timeout and cost $3.50 per million requests. For high-throughput internal APIs, an Application Load Balancer (billed hourly plus capacity units, roughly $0.40 per million requests) or direct SDK invocation is significantly cheaper.
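The chunk-based processing mentioned in the list above might look like the following sketch. It assumes the large input is a single stored object that workers can read by byte range; `byte_ranges`, `fan_out_messages`, and the message fields are hypothetical names for this example, and the step that actually sends each message to a queue is left out.

```python
def byte_ranges(total_size, chunk_size):
    """Split [0, total_size) into inclusive (start, end) byte ranges."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        (start, min(start + chunk_size, total_size) - 1)
        for start in range(0, total_size, chunk_size)
    ]


def fan_out_messages(bucket, key, total_size, chunk_size):
    """Build one work message per chunk; each can go to a separate invocation."""
    return [
        {
            "bucket": bucket,
            "key": key,
            # HTTP Range header syntax, which ranged object reads accept.
            "range": f"bytes={start}-{end}",
        }
        for start, end in byte_ranges(total_size, chunk_size)
    ]
```

Each message describes a bounded unit of work, so a worker function that reads only its own range should finish well inside the execution limit, and a failed chunk can be retried without redoing the rest.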
Test Your Understanding

1. What is a cold start in the context of serverless functions?

2. When does a container-based architecture become more cost-effective than serverless?
