Learn how designing services without local state enables effortless horizontal scaling, simplifies deployments, and improves fault tolerance in distributed systems.
A stateless service is one that does not store any client-specific data between requests. Every request contains all the information needed for the service to process it, and the service treats each request independently. Any instance of the service can handle any request, which means instances are completely interchangeable and can be added, removed, or replaced without affecting system behavior.
Stateless design is a prerequisite for effective horizontal scaling. When a service stores state in local memory -- user sessions, shopping carts, in-progress transactions, cached computation results -- you cannot simply add more instances behind a load balancer because each instance has a different view of the world. A user who starts a checkout on instance A cannot continue on instance B because their cart data exists only on instance A. This creates 'sticky session' requirements that complicate load balancing, reduce fault tolerance, and prevent uniform request distribution.
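The failure mode described above can be reproduced in a few lines. This is a minimal sketch, assuming a hypothetical `StatefulInstance` class and a plain round-robin balancer built from `itertools.cycle`; neither comes from a particular framework.

```python
from itertools import cycle


class StatefulInstance:
    """Hypothetical service instance that keeps carts in its own memory."""

    def __init__(self, name):
        self.name = name
        self.carts = {}  # local state: invisible to every other instance

    def add_to_cart(self, user_id, item):
        self.carts.setdefault(user_id, []).append(item)

    def get_cart(self, user_id):
        return self.carts.get(user_id, [])


# Two instances behind a round-robin load balancer (no session affinity).
instance_a = StatefulInstance("A")
instance_b = StatefulInstance("B")
balancer = cycle([instance_a, instance_b])

next(balancer).add_to_cart("user-1", "book")   # lands on instance A
cart_seen = next(balancer).get_cart("user-1")  # lands on instance B

print(cart_seen)  # [] because instance B has never seen this user's cart
```

The same mechanism explains "random" logouts when sessions live in process memory: whether a request succeeds depends on which instance the balancer happens to pick.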
The key insight of stateless design is the separation of compute from state. Business logic runs on stateless service instances that can be freely scaled. All persistent and session state is externalized to dedicated state stores: databases for persistent data, Redis or Memcached for session data and caches, message queues for in-progress work. These state stores have their own scaling strategies (replication, sharding, clustering) that can evolve independently of the compute tier.
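One way to picture the separation of compute from state is the sketch below. It assumes a hypothetical `SessionStore` class that stands in for an external store such as Redis (an in-memory dictionary is used here only so the example runs on its own) and a hypothetical `StatelessInstance` that keeps no per-user data.

```python
class SessionStore:
    """Stand-in for an external store such as Redis (hypothetical interface)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


class StatelessInstance:
    """Holds a reference to the shared store but no per-user data of its own."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def add_to_cart(self, user_id, item):
        cart = self.store.get(f"cart:{user_id}") or []
        self.store.set(f"cart:{user_id}", cart + [item])

    def get_cart(self, user_id):
        return self.store.get(f"cart:{user_id}") or []


store = SessionStore()
instance_a = StatelessInstance("A", store)
instance_b = StatelessInstance("B", store)

instance_a.add_to_cart("user-1", "book")
instance_b.add_to_cart("user-1", "lamp")

print(instance_a.get_cart("user-1"))  # ['book', 'lamp']
print(instance_b.get_cart("user-1"))  # ['book', 'lamp']
```

Either instance can serve either request because the cart lives in the store, not in the instance. Note that the read-then-write in `add_to_cart` is not atomic; against a real store, concurrent writers would need an atomic operation or a version check.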
Stateless services also simplify deployments significantly. Rolling updates can replace instances one at a time without draining sessions or migrating state. Blue-green deployments can switch all traffic to a new fleet instantly because no state needs to be transferred. Canary deployments can route a percentage of traffic to new instances without worrying about session affinity. In a containerized environment like Kubernetes, stateless services are the natural fit for Deployments with horizontal pod autoscaling, while stateful workloads require the more complex StatefulSet abstraction.
The Bank Teller Analogy
Consider a bank with multiple teller windows. In a stateless model, you bring all your documents (ID, account number, transaction details) to whichever window is available. Any teller can serve you because you carry everything they need. If a teller goes on break, you simply go to another window without losing anything. In a stateful model, you start a complex transaction at window 3 and the teller keeps your documents in a pile on their desk. If that teller goes on break, your transaction is stuck until they return. The stateless model lets the bank add or remove tellers freely based on the current queue length -- exactly how auto-scaling works for stateless services.
Netflix
Netflix's streaming API is composed of hundreds of stateless microservices running on AWS. Each service instance is ephemeral and can be replaced at any moment. Session state (user preferences, playback position, watch history) is stored in Cassandra and EVCache (Netflix's memcached-based distributed cache). This architecture allows Netflix to perform rolling deployments of new code to thousands of instances without service interruption.
Stripe
Stripe's payment processing API is designed as a set of stateless services. Every API request includes authentication credentials and all necessary payment data. Idempotency keys ensure that retried requests (after network failures) do not create duplicate charges. The stateless design enables Stripe to handle massive transaction volumes by simply adding more API server instances behind their load balancers during peak periods.
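The idea behind idempotency keys can be illustrated with a short sketch. This is not Stripe's implementation; `ChargeService`, `create_charge`, and the shared `results` dictionary (standing in for an external store) are hypothetical names chosen for the example.

```python
import uuid


class ChargeService:
    """Hypothetical payment handler; `results` stands in for an external store."""

    def __init__(self, results):
        self.results = results  # shared store keyed by idempotency key

    def create_charge(self, idempotency_key, amount_cents):
        # A retried request with the same key returns the stored result
        # instead of creating a second charge.
        if idempotency_key in self.results:
            return self.results[idempotency_key]
        charge = {"id": str(uuid.uuid4()), "amount_cents": amount_cents}
        self.results[idempotency_key] = charge
        return charge


shared_results = {}
server_1 = ChargeService(shared_results)
server_2 = ChargeService(shared_results)

key = "order-1234-attempt"  # chosen by the client, reused on every retry
first = server_1.create_charge(key, 5000)
retry = server_2.create_charge(key, 5000)  # retry lands on another instance

print(first["id"] == retry["id"])  # True
print(len(shared_results))         # 1
```

Because the key-to-result mapping lives outside the instances, the retry is recognized no matter which server receives it. In a real system the check and the write would need to be a single atomic "set if absent" operation on the store.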
Vercel / Next.js
Vercel's serverless platform is a clear example of stateless design. Any incoming request may be handled by a newly started function instance, and even when an instance is reused, code cannot rely on in-memory data from previous requests being present. This strict statelessness lets Vercel scale from zero to thousands of concurrent executions on demand and charge only for actual compute usage. Application state is stored in external services like Vercel KV (Redis), Vercel Postgres, or third-party databases.
| Aspect | Description |
|---|---|
| Latency from External State | Every request that needs state must fetch it from an external store, adding network round-trip latency. An in-memory lookup on the same machine takes microseconds or less; a Redis lookup is typically 0.5-2ms; a database query is typically 1-10ms. Keeping frequently accessed state in a fast external cache (Redis) rather than querying the database each time mitigates this overhead. |
| Operational Complexity | While stateless services are simple to scale and deploy, the external state stores they depend on introduce their own operational complexity. Managing Redis clusters, database replication, and cache invalidation requires specialized expertise and monitoring. |
| Data Consistency | When multiple stateless instances read and write shared state in external stores, race conditions can occur. Techniques like optimistic concurrency control, distributed locks, or compare-and-swap operations are needed to maintain consistency without the implicit serialization that comes with single-instance stateful processing. |
| Token Size and Security | Carrying state in client tokens (JWTs) avoids server-side session storage but increases request payload size and requires careful security management. Tokens cannot be revoked before they expire without reintroducing server-side state (a token blocklist), and large tokens with many claims consume bandwidth on every request. |
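The optimistic concurrency control mentioned under Data Consistency can be sketched as a versioned compare-and-swap. The `VersionedStore` class and `increment` helper below are hypothetical; the in-memory dictionary stands in for an external store, where the comparison and the write would have to happen atomically on the store side (for example with Redis `WATCH`/`MULTI` or a SQL `UPDATE ... WHERE version = ?`).

```python
class VersionedStore:
    """Stand-in for an external store that supports compare-and-swap."""

    def __init__(self):
        self._data = {}  # key -> (version, value)

    def get(self, key):
        return self._data.get(key, (0, None))

    def compare_and_set(self, key, expected_version, value):
        current_version, _ = self._data.get(key, (0, None))
        if current_version != expected_version:
            return False  # someone else wrote first
        self._data[key] = (current_version + 1, value)
        return True


def increment(store, key, max_retries=5):
    """Read-modify-write loop that retries when the version has moved."""
    for _ in range(max_retries):
        version, value = store.get(key)
        if store.compare_and_set(key, version, (value or 0) + 1):
            return True
    return False


store = VersionedStore()
version, value = store.get("stock")  # instance A reads version 0
increment(store, "stock")            # instance B writes first
stale_write = store.compare_and_set("stock", version, 99)  # A's write is rejected

print(stale_write)         # False
print(store.get("stock"))  # (1, 1)
```

The stale write is refused instead of silently overwriting the newer value; the caller is expected to re-read and retry, as `increment` does.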
Airbnb's Transition from Stateful Rails to Stateless SOA
Scenario
Airbnb's initial Ruby on Rails monolith stored session data in server-side memory and used sticky sessions to route users to the same application server. This made it difficult to scale horizontally, perform rolling deployments (active sessions would be lost), and survive server failures (users would be logged out). During peak booking seasons, the limited number of sticky-session-capable servers became a bottleneck.
Solution
Airbnb migrated session storage from in-memory Rails sessions to a centralized Redis cluster. They introduced JWT-based authentication for API services, allowing any service instance to verify user identity without accessing a session store. The monolith was decomposed into stateless microservices, each maintaining no local state. Service discovery and load balancing were handled by an API gateway with round-robin distribution, since session affinity was no longer required.
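The reason a signed token lets any instance verify identity without a session store can be shown with the standard library alone. This is a simplified HMAC-signed token, not a spec-compliant JWT and not Airbnb's implementation; `SECRET`, `issue_token`, and `verify_token` are hypothetical names, and the sketch assumes every instance receives the same secret through configuration.

```python
import base64
import hashlib
import hmac
import json
import time

SECRET = b"demo-secret-shared-by-all-instances"  # assumption: distributed via config


def issue_token(user_id, ttl_seconds=3600, now=None):
    """Serialize claims and append an HMAC-SHA256 signature."""
    now = time.time() if now is None else now
    claims = {"sub": user_id, "exp": now + ttl_seconds}
    payload = base64.urlsafe_b64encode(json.dumps(claims).encode())
    signature = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return payload.decode() + "." + signature


def verify_token(token, now=None):
    """Return the user id if the signature is valid and the token is unexpired."""
    now = time.time() if now is None else now
    try:
        payload, signature = token.rsplit(".", 1)
    except ValueError:
        return None
    expected = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return None
    claims = json.loads(base64.urlsafe_b64decode(payload))
    if claims["exp"] < now:
        return None
    return claims["sub"]


token = issue_token("user-42")
print(verify_token(token))                          # user-42
print(verify_token(token + "x"))                    # None (signature mismatch)
print(verify_token(token, now=time.time() + 7200))  # None (expired)
```

Verification needs only the token and the shared secret, so no lookup against a session store is required. A production service would normally use a maintained JWT library rather than a hand-rolled format.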
Outcome
The stateless architecture enabled Airbnb to horizontally scale their platform from handling thousands to millions of bookings per day. Deployments became seamless -- new code could be rolled out across the fleet without draining sessions or coordinating instance replacement. Individual instance failures became invisible to users because requests were automatically routed to healthy instances. The Redis session store added approximately 1ms of latency per authenticated request, which was negligible compared to the operational and scalability benefits gained.