1An e-commerce platform has a product catalog, a shopping cart, and a payment service. Which consistency assignment is most appropriate?
The consistency vs availability trade-off is the most consequential decision in distributed system design. This concept provides a practical decision framework for choosing where on the consistency-availability spectrum each component of your system should sit, with concrete criteria for financial, e-commerce, social, and real-time systems.
While the CAP theorem establishes the theoretical impossibility of having both strong consistency and high availability during network partitions, the practical engineering challenge is deciding where each component of your system should sit on the consistency-availability spectrum. This is not a single global decision -- a well-designed system uses different consistency levels for different data and operations. The bank balance needs linearizability; the user's display name can be eventually consistent; the analytics pipeline can tolerate hours of lag.
The key insight is that consistency and availability are not binary states but continuous spectra. On the consistency side, the spectrum runs from linearizability (strongest) through sequential consistency, causal consistency, read-your-writes, monotonic reads, to eventual consistency (weakest). On the availability side, systems range from 99.0% (87 hours downtime/year) through 99.99% (52 minutes/year) to 99.999% (5 minutes/year). Moving along either axis has cost implications: higher consistency requires more coordination (latency, throughput overhead), while higher availability requires more replicas and more sophisticated failover.
A practical decision framework for choosing the right point on the spectrum involves four questions. First: what is the business cost of serving stale data? For a stock trading platform, stale prices cause financial loss -- strong consistency is mandatory. For a news feed, showing a post 2 seconds late is imperceptible. Second: what is the business cost of unavailability? For an e-commerce checkout, every second of downtime is lost revenue. For an internal reporting dashboard, an hour of downtime is tolerable. Third: what is the acceptable inconsistency window? Some applications can tolerate seconds of staleness; others need sub-millisecond recency. Fourth: can you detect and resolve conflicts after the fact? If yes (e.g., shopping carts, collaborative editing), eventual consistency with conflict resolution is viable.
In system design interviews, demonstrating this nuanced understanding -- that you would use strong consistency for the payment service, eventual consistency for the recommendation engine, and causal consistency for the messaging layer -- shows architectural maturity. The best answers decompose the system into components and assign each one an appropriate consistency level with clear reasoning.
E-Commerce Platform: Three Components, Three Consistency Levels
Consider an e-commerce platform with three components. (1) Inventory count: needs strong consistency -- if two customers try to buy the last item, only one should succeed. Use a CP store (e.g., PostgreSQL with serializable isolation) and accept that during a database failover, checkout is briefly unavailable. (2) Product reviews: can be eventually consistent -- a new review appearing 5 seconds late is fine. Use an AP store (e.g., DynamoDB with eventually consistent reads) for high availability and low latency. (3) User's shopping cart: needs availability (never lose an item a user added) but can tolerate brief staleness across devices. Use an AP store with merge-on-read conflict resolution (union of items). This decomposition gives each component the right trade-off rather than forcing a single global choice.
Netflix
Netflix uses different consistency levels across its architecture. The billing and subscription service uses strongly consistent databases (CP) because incorrect billing causes customer complaints and legal issues. The video recommendation engine uses eventually consistent data -- stale viewing history for a few minutes has negligible impact on recommendation quality. The streaming metadata (which episode you are on, watch progress) uses read-your-writes consistency via sticky sessions to the same Cassandra replica, ensuring users see their own progress updates immediately.
Stripe
Stripe's payment processing pipeline demands strong consistency for the core ledger -- double-charges or lost payments are unacceptable. They use serializable transactions in PostgreSQL for balance mutations. However, the merchant dashboard (displaying transaction history, analytics) uses read replicas with eventual consistency, tolerating up to a few seconds of lag. The webhook delivery system is designed for availability: webhooks are fired at-least-once with idempotency keys, accepting potential duplicates (an AP behavior) rather than risking missed notifications.
Slack
Slack's message delivery uses a middle ground: causal consistency within each channel. Messages in a channel are causally ordered (if message B is a reply to message A, B always appears after A on all clients). However, messages across different channels may be seen in different orders by different users -- full linearizability across all channels would be prohibitively expensive. The workspace membership and permissions service uses stronger consistency (CP) to prevent unauthorized access, accepting brief unavailability during leader failovers.
| Aspect | Description |
|---|---|
| Stale Data Cost vs Downtime Cost | The fundamental decision criterion: which is more expensive for your business? For financial services, stale data (incorrect balances) causes regulatory and financial risk -- choose consistency. For user-facing content platforms, downtime causes user churn and revenue loss -- choose availability. Quantify both costs: if stale data costs $X per incident and downtime costs $Y per minute, the math drives the decision. |
| Latency vs Recency | Strongly consistent reads require a round-trip to the leader or quorum confirmation, adding 5-200ms depending on topology. For P99-sensitive APIs (search, feed rendering), this latency is unacceptable. For transactional APIs (payment, booking), the latency is acceptable because correctness outweighs speed. Measure your latency budget and choose the strongest consistency that fits within it. |
| Operational Complexity vs Consistency Guarantees | Strong consistency (CP) simplifies application logic -- developers do not need to handle stale reads or conflicts. But it complicates operations: leader elections, quorum management, and partition handling require expertise. Eventual consistency (AP) simplifies operations (any node can serve requests) but pushes complexity to the application layer (conflict resolution, idempotency, compensating transactions). |
| Horizontal Scalability vs Strong Consistency | AP systems scale almost linearly by adding nodes -- any node can serve any request. CP systems have scaling limits: all writes (and linearizable reads) must go through the leader, creating a throughput ceiling. Sharding helps but adds cross-shard coordination costs for multi-key operations. For read-heavy workloads at massive scale, eventual consistency with local reads is often the only practical option. |
Google Docs: Availability-First Collaborative Editing
Scenario
Google Docs needed to support real-time collaborative editing by multiple users simultaneously, including when users have intermittent connectivity (e.g., editing on a plane). A traditional CP approach -- locking the document during edits -- would make the system unusable for collaboration. But without coordination, concurrent edits could conflict and lose data.
Solution
Google Docs uses an availability-first design with Operational Transformation (OT), later evolving toward CRDT-like approaches. Each client applies edits locally and immediately (optimistic, available), then sends operations to the server for ordering and transformation. The server applies a total order and transforms concurrent operations so they converge to the same state. Clients see their own edits instantly (read-your-writes) while other users' edits arrive and are merged within milliseconds. Offline edits are queued and reconciled upon reconnection.
Outcome
Google Docs achieves near-perfect availability for editing -- users can always type, even offline. The consistency model is eventual convergence: all clients eventually see the same document state, but during concurrent editing there are brief windows where different clients see different intermediate states. This trade-off is invisible to users because the convergence window is typically under 100ms on a good connection. The approach enabled Google Docs to scale to billions of documents and become the dominant collaborative editing platform, demonstrating that availability-first design with smart conflict resolution outperforms strong consistency for collaboration workloads.
See Consistency vs Availability in action
Explore system design templates that use consistency vs availability and run traffic simulations to see how these concepts perform under real load.
Browse Templates1An e-commerce platform has a product catalog, a shopping cart, and a payment service. Which consistency assignment is most appropriate?
2Why is read-your-writes consistency often considered the minimum acceptable consistency level for user-facing applications?