Consistency vs Availability

The consistency vs availability trade-off is one of the most consequential decisions in distributed system design. This guide provides a practical framework for deciding where each component of your system should sit on the consistency-availability spectrum, with concrete criteria for financial, e-commerce, social, and real-time systems.

Overview

While the CAP theorem establishes the theoretical impossibility of having both strong consistency and high availability during network partitions, the practical engineering challenge is deciding where each component of your system should sit on the consistency-availability spectrum. This is not a single global decision -- a well-designed system uses different consistency levels for different data and operations. The bank balance needs linearizability; the user's display name can be eventually consistent; the analytics pipeline can tolerate hours of lag.

The key insight is that consistency and availability are not binary states but continuous spectra. On the consistency side, the spectrum runs from linearizability (strongest) through sequential consistency, causal consistency, read-your-writes, and monotonic reads to eventual consistency (weakest). On the availability side, systems range from 99.0% (about 87.6 hours of downtime per year) through 99.99% (about 53 minutes per year) to 99.999% (about 5 minutes per year). Moving along either axis has a cost: stronger consistency requires more coordination (added latency, reduced throughput), while higher availability requires more replicas and more sophisticated failover.
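The downtime figures above follow directly from the availability percentage. A minimal sketch of the arithmetic, assuming a 365-day year (the function name is illustrative):

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def downtime_hours_per_year(availability_pct: float) -> float:
    """Return the yearly downtime budget, in hours, for an availability target."""
    return (1 - availability_pct / 100) * HOURS_PER_YEAR


for target in (99.0, 99.9, 99.99, 99.999):
    hours = downtime_hours_per_year(target)
    print(f"{target}% -> {hours:.2f} hours/year ({hours * 60:.1f} minutes)")
```

Each additional "nine" cuts the downtime budget by a factor of ten, which is why the cost of moving up the availability axis grows so quickly.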

A practical decision framework for choosing the right point on the spectrum involves four questions. First: what is the business cost of serving stale data? For a stock trading platform, stale prices cause financial loss -- strong consistency is mandatory. For a news feed, showing a post 2 seconds late is imperceptible. Second: what is the business cost of unavailability? For an e-commerce checkout, every second of downtime is lost revenue. For an internal reporting dashboard, an hour of downtime is tolerable. Third: what is the acceptable inconsistency window? Some applications can tolerate seconds of staleness; others need sub-millisecond recency. Fourth: can you detect and resolve conflicts after the fact? If yes (e.g., shopping carts, collaborative editing), eventual consistency with conflict resolution is viable.
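The four questions can be sketched as a small decision helper. This is only an illustration of the reasoning, not a definitive rule set: the `ComponentProfile` fields, the one-second threshold, and the returned labels are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class ComponentProfile:
    stale_data_is_costly: bool    # Q1: does stale data cause real business loss?
    downtime_is_costly: bool      # Q2: does unavailability cause real business loss?
    max_staleness_seconds: float  # Q3: acceptable inconsistency window
    conflicts_resolvable: bool    # Q4: can conflicts be detected and fixed later?


def suggest_consistency(p: ComponentProfile) -> str:
    """Map the four answers to a starting point on the consistency spectrum."""
    if p.stale_data_is_costly and not p.conflicts_resolvable:
        return "strong"
    if p.downtime_is_costly and p.conflicts_resolvable:
        return "eventual + conflict resolution"
    if p.max_staleness_seconds < 1:  # hypothetical threshold
        return "read-your-writes or causal"
    return "eventual"


trading = ComponentProfile(True, True, 0.001, False)
cart = ComponentProfile(False, True, 5, True)
dashboard = ComponentProfile(False, False, 3600, False)
for name, profile in [("trading", trading), ("cart", cart), ("dashboard", dashboard)]:
    print(name, "->", suggest_consistency(profile))
```

In practice the answers are rarely clean booleans, so treat the output as a first guess to argue about rather than a verdict.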

In system design interviews, demonstrating this nuanced understanding -- that you would use strong consistency for the payment service, eventual consistency for the recommendation engine, and causal consistency for the messaging layer -- shows architectural maturity. The best answers decompose the system into components and assign each one an appropriate consistency level with clear reasoning.

Key Points
  • 1. Consistency and availability are spectra, not binary choices. The consistency spectrum ranges from linearizability to eventual consistency, with useful intermediate points like causal consistency and read-your-writes. Choose the weakest consistency level that satisfies your domain requirements.
  • 2. Different components in the same system should use different consistency levels. A payment ledger needs linearizability; a product catalog can be eventually consistent; a user session store needs read-your-writes consistency. Decompose your system and decide per-component.
  • 3. The cost of inconsistency is domain-specific. Stale data in a stock ticker causes financial loss. Stale data in a social feed causes a mildly delayed post. Stale data in a DNS cache is expected behavior. Quantify the business cost of stale data for each component.
  • 4. Availability requirements compound across dependencies. If service A (99.9%) depends synchronously on service B (99.9%) and their failures are independent, the combined availability is about 99.8% (0.999 × 0.999 ≈ 0.998). Critical paths should minimize synchronous dependencies on strongly consistent stores, which have lower availability during partitions.
  • 5. Conflict resolution strategies enable AP designs. Last-write-wins (LWW) is simple but loses data. Merge functions (union for sets, per-replica max for grow-only counters) preserve data. CRDTs provide automatic, mathematically guaranteed conflict resolution for specific data types. Application-level resolution (e.g., showing conflicts to the user) is always an option.
  • 6. Read-your-writes consistency is often the practical minimum. Users tolerate seeing other users' stale data but are confused when their own writes disappear. Sticky sessions or session-scoped read-after-write guarantees provide this without full linearizability.
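The compounding effect in the fourth point is plain multiplication. A minimal sketch, assuming every dependency is called synchronously on the request path and that failures are independent (both are simplifying assumptions):

```python
from math import prod


def serial_availability(*availabilities: float) -> float:
    """Availability of a request path that needs every listed service to be up."""
    return prod(availabilities)


two_services = serial_availability(0.999, 0.999)
five_services = serial_availability(*[0.999] * 5)
print(f"2 services at 99.9%: {two_services:.4%}")
print(f"5 services at 99.9%: {five_services:.4%}")
```

Correlated failures (a shared network, a shared deploy pipeline) break the independence assumption, so real systems can land above or below this estimate.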
Simple Example

E-Commerce Platform: Three Components, Three Consistency Levels

Consider an e-commerce platform with three components. (1) Inventory count: needs strong consistency -- if two customers try to buy the last item, only one should succeed. Use a CP store (e.g., PostgreSQL with serializable isolation) and accept that during a database failover, checkout is briefly unavailable. (2) Product reviews: can be eventually consistent -- a new review appearing 5 seconds late is fine. Use an AP store (e.g., DynamoDB with eventually consistent reads) for high availability and low latency. (3) User's shopping cart: needs availability (never lose an item a user added) but can tolerate brief staleness across devices. Use an AP store with merge-on-read conflict resolution (union of items). This decomposition gives each component the right trade-off rather than forcing a single global choice.
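The merge-on-read rule for the shopping cart can be sketched as a set union over whatever replica versions the read finds. The function name and the use of plain item-ID strings are assumptions made for the example.

```python
def merge_carts(*replica_carts: set[str]) -> set[str]:
    """Merge divergent cart replicas by taking the union of their items."""
    merged: set[str] = set()
    for cart in replica_carts:
        merged |= cart
    return merged


# Two devices added different items while their replicas were out of sync.
phone_cart = {"book", "lamp"}
laptop_cart = {"book", "headphones"}
print(sorted(merge_carts(phone_cart, laptop_cart)))
```

A plain union never loses an added item, but it cannot express removals: an item deleted on one replica reappears after the merge. Designs that need deletes typically track removals explicitly (for example with tombstones), which this sketch leaves out.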

Real-World Examples

Netflix

Netflix uses different consistency levels across its architecture. The billing and subscription service uses strongly consistent databases (CP) because incorrect billing causes customer complaints and legal issues. The video recommendation engine uses eventually consistent data -- stale viewing history for a few minutes has negligible impact on recommendation quality. The streaming metadata (which episode you are on, watch progress) uses read-your-writes consistency via sticky sessions to the same Cassandra replica, ensuring users see their own progress updates immediately.
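Read-your-writes can also be provided without pinning a user to one replica. The sketch below uses a per-session version token instead of sticky sessions; it is a generic illustration, not Netflix's implementation, and the `Replica`, `Session`, `write`, and `read` names are hypothetical.

```python
class Replica:
    """A store that applies writes in version order."""

    def __init__(self):
        self.version = 0
        self.data = {}

    def apply(self, version, key, value):
        self.data[key] = value
        self.version = version


class Session:
    """Remembers the newest version this user has written."""

    def __init__(self):
        self.last_written_version = 0


def write(leader, session, key, value):
    leader.apply(leader.version + 1, key, value)
    session.last_written_version = leader.version


def read(leader, follower, session, key):
    # Serve from the follower only if it has caught up with this session's writes.
    caught_up = follower.version >= session.last_written_version
    source = follower if caught_up else leader
    return source.data.get(key)


leader, follower = Replica(), Replica()
me, someone_else = Session(), Session()
write(leader, me, "progress", "episode 3")
print(read(leader, follower, me, "progress"))            # falls back to the leader
print(read(leader, follower, someone_else, "progress"))  # may be stale on the follower
```

The writer always sees their own update, while other sessions are still allowed to read the lagging follower.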

Stripe

Stripe's payment processing pipeline demands strong consistency for the core ledger -- double-charges or lost payments are unacceptable. They use serializable transactions in PostgreSQL for balance mutations. However, the merchant dashboard (displaying transaction history, analytics) uses read replicas with eventual consistency, tolerating up to a few seconds of lag. The webhook delivery system is designed for availability: webhooks are fired at-least-once with idempotency keys, accepting potential duplicates (an AP behavior) rather than risking missed notifications.
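At-least-once delivery only works if the receiver tolerates duplicates. A minimal sketch of an idempotent consumer, assuming each event carries a unique ID; the class and field names are hypothetical and this is not Stripe's API.

```python
class WebhookConsumer:
    """Processes each event at most once, even if it is delivered repeatedly."""

    def __init__(self):
        self.seen_event_ids = set()
        self.processed = []

    def handle(self, event_id, payload):
        """Return True if the event was processed, False if it was a duplicate."""
        if event_id in self.seen_event_ids:
            return False
        self.seen_event_ids.add(event_id)
        self.processed.append(payload)
        return True


consumer = WebhookConsumer()
consumer.handle("evt_1", {"type": "payment.succeeded"})
consumer.handle("evt_1", {"type": "payment.succeeded"})  # retry of the same event
print(len(consumer.processed))
```

A production consumer would keep the seen IDs in durable storage and record them in the same transaction as the side effect; the in-memory set here is only for illustration.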

Slack

Slack's message delivery uses a middle ground: causal consistency within each channel. Messages in a channel are causally ordered (if message B is a reply to message A, B always appears after A on all clients). However, messages across different channels may be seen in different orders by different users -- full linearizability across all channels would be prohibitively expensive. The workspace membership and permissions service uses stronger consistency (CP) to prevent unauthorized access, accepting brief unavailability during leader failovers.
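The reply-after-parent guarantee can be sketched as a client-side buffer that holds a reply until the message it answers has been displayed. This is a generic illustration of causal delivery under that single dependency rule, not Slack's implementation; `ChannelView` and its methods are hypothetical names.

```python
class ChannelView:
    """Displays messages so that a reply never appears before its parent."""

    def __init__(self):
        self.delivered = []  # message IDs in display order
        self.pending = {}    # parent ID -> replies waiting for that parent

    def receive(self, msg_id, reply_to=None):
        if reply_to is not None and reply_to not in self.delivered:
            # Parent not shown yet: hold the reply back.
            self.pending.setdefault(reply_to, []).append(msg_id)
            return
        self._deliver(msg_id)

    def _deliver(self, msg_id):
        self.delivered.append(msg_id)
        # Release any replies that were waiting on this message.
        for child in self.pending.pop(msg_id, []):
            self._deliver(child)


view = ChannelView()
view.receive("B", reply_to="A")  # arrives first over the network
view.receive("A")
print(view.delivered)
```

Messages with no causal relationship are displayed in arrival order, which is why two clients can legitimately show unrelated messages in different orders.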

Trade-Offs
  • Stale Data Cost vs Downtime Cost. The fundamental decision criterion: which is more expensive for your business? For financial services, stale data (incorrect balances) causes regulatory and financial risk -- choose consistency. For user-facing content platforms, downtime causes user churn and revenue loss -- choose availability. Quantify both costs: if stale data costs $X per incident and downtime costs $Y per minute, the math drives the decision.
  • Latency vs Recency. Strongly consistent reads require a round-trip to the leader or quorum confirmation, adding 5-200ms depending on topology. For P99-sensitive APIs (search, feed rendering), this latency is unacceptable. For transactional APIs (payment, booking), the latency is acceptable because correctness outweighs speed. Measure your latency budget and choose the strongest consistency that fits within it.
  • Operational Complexity vs Consistency Guarantees. Strong consistency (CP) simplifies application logic -- developers do not need to handle stale reads or conflicts. But it complicates operations: leader elections, quorum management, and partition handling require expertise. Eventual consistency (AP) simplifies operations (any node can serve requests) but pushes complexity to the application layer (conflict resolution, idempotency, compensating transactions).
  • Horizontal Scalability vs Strong Consistency. AP systems scale almost linearly by adding nodes -- any node can serve any request. CP systems have scaling limits: all writes (and linearizable reads) must go through the leader, creating a throughput ceiling. Sharding helps but adds cross-shard coordination costs for multi-key operations. For read-heavy workloads at massive scale, eventual consistency with local reads is often the only practical option.
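The stale-data-versus-downtime comparison can be written out as a simple expected-cost calculation. Every number below is a made-up placeholder to show the shape of the math, not a measurement; substitute your own incident rates and costs.

```python
def expected_monthly_cost(stale_incidents, cost_per_stale_incident,
                          downtime_minutes, cost_per_downtime_minute):
    """Combine the two cost sources into one comparable figure."""
    return (stale_incidents * cost_per_stale_incident
            + downtime_minutes * cost_per_downtime_minute)


# Hypothetical inputs: the AP design serves some stale reads but is rarely down;
# the CP design serves no stale reads but is unavailable during failovers.
ap_cost = expected_monthly_cost(stale_incidents=20, cost_per_stale_incident=50,
                                downtime_minutes=1, cost_per_downtime_minute=1000)
cp_cost = expected_monthly_cost(stale_incidents=0, cost_per_stale_incident=50,
                                downtime_minutes=10, cost_per_downtime_minute=1000)
print("AP:", ap_cost, "CP:", cp_cost)
```

With these placeholder values the AP design is cheaper; raising the cost per stale incident far enough flips the result, which is the situation a payment ledger is in.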
Case Study

Google Docs: Availability-First Collaborative Editing

Scenario

Google Docs needed to support real-time collaborative editing by multiple users simultaneously, including when users have intermittent connectivity (e.g., editing on a plane). A traditional CP approach -- locking the document during edits -- would make the system unusable for collaboration. But without coordination, concurrent edits could conflict and lose data.

Solution

Google Docs uses an availability-first design with Operational Transformation (OT), later evolving toward CRDT-like approaches. Each client applies edits locally and immediately (optimistic, available), then sends operations to the server for ordering and transformation. The server applies a total order and transforms concurrent operations so they converge to the same state. Clients see their own edits instantly (read-your-writes) while other users' edits arrive and are merged within milliseconds. Offline edits are queued and reconciled upon reconnection.
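The core idea of transforming concurrent operations can be shown with the simplest case: two concurrent text insertions. This is a toy sketch that handles inserts only and breaks position ties by a site ID; it is not Google's implementation, and the `Insert`, `apply_op`, and `transform` names are assumptions for the example.

```python
from typing import NamedTuple


class Insert(NamedTuple):
    pos: int   # index at which to insert
    text: str  # text to insert
    site: int  # ID of the originating client, used to break ties


def apply_op(doc: str, op: Insert) -> str:
    return doc[:op.pos] + op.text + doc[op.pos:]


def transform(op: Insert, against: Insert) -> Insert:
    """Adjust `op` so it can be applied after the concurrent op `against`."""
    earlier = against.pos < op.pos
    tie_lost = against.pos == op.pos and against.site < op.site
    if earlier or tie_lost:
        return op._replace(pos=op.pos + len(against.text))
    return op


doc = "abc"
a = Insert(pos=1, text="X", site=1)
b = Insert(pos=2, text="Y", site=2)

# Client 1 applies a, then the transformed b; client 2 does the reverse.
client_1 = apply_op(apply_op(doc, a), transform(b, a))
client_2 = apply_op(apply_op(doc, b), transform(a, b))
print(client_1, client_2)
```

Both clients apply the operations in a different order yet end with the same text, which is the convergence property the server-side ordering relies on. Real systems also have to transform deletions, formatting changes, and longer chains of concurrent operations.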

Outcome

Google Docs achieves near-perfect availability for editing -- users can always type, even offline. The consistency model is eventual convergence: all clients eventually see the same document state, but during concurrent editing there are brief windows where different clients see different intermediate states. This trade-off is invisible to users because the convergence window is typically under 100ms on a good connection. The approach enabled Google Docs to scale to billions of documents and become the dominant collaborative editing platform, demonstrating that availability-first design with smart conflict resolution outperforms strong consistency for collaboration workloads.

Common Mistakes
  • Applying a single consistency level to the entire system. A common junior mistake is declaring 'we will use strong consistency everywhere' (over-engineering, latency problems) or 'eventual consistency everywhere' (correctness bugs in financial flows). Decompose the system and assign consistency per-component based on domain requirements.
  • Ignoring the read-your-writes requirement. Users accept seeing other users' data with a delay, but they do not accept their own writes disappearing. Forgetting to provide at least read-your-writes consistency (via sticky sessions or session tokens) leads to confused users who think their actions did not save.
  • Assuming eventual consistency means 'consistency later.' Eventual consistency only guarantees convergence if no new writes occur. In practice, with continuous writes, replicas may never fully converge. What matters is the inconsistency window -- how long reads may be stale -- and whether it is acceptable for your use case.
  • Not considering conflict resolution strategy when choosing AP. Choosing availability without a conflict resolution plan leads to data loss or corruption. Before choosing AP, decide: will you use LWW (simple but lossy), merge functions (preserving but domain-specific), CRDTs (automatic but limited data types), or application-level resolution (flexible but complex)?
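The difference between a lossy and a data-preserving strategy shows up clearly on a counter. The sketch below contrasts last-write-wins with a grow-only counter (G-counter), one of the simplest CRDTs; the function names and the replica IDs `r1` and `r2` are illustrative.

```python
def lww_merge(a, b):
    """a and b are (timestamp, value) pairs; keep the one written later."""
    return a if a[0] >= b[0] else b


def gcounter_merge(a, b):
    """a and b map replica ID -> increments made at that replica."""
    return {r: max(a.get(r, 0), b.get(r, 0)) for r in a.keys() | b.keys()}


def gcounter_value(counter):
    return sum(counter.values())


# Starting from 10 likes, two replicas each record one new like concurrently.
lww_result = lww_merge((100, 11), (101, 11))
print("LWW:", lww_result[1])  # one of the two likes is lost

replica_1 = {"r1": 6, "r2": 5}  # r1 incremented its own slot
replica_2 = {"r1": 5, "r2": 6}  # r2 incremented its own slot
merged = gcounter_merge(replica_1, replica_2)
print("G-counter:", gcounter_value(merged))
```

The G-counter keeps both increments because each replica only ever raises its own slot, so taking the per-slot maximum is safe in any merge order. The price is the "limited data types" caveat: this structure cannot decrement without further machinery.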
Test Your Understanding

1. An e-commerce platform has a product catalog, a shopping cart, and a payment service. Which consistency assignment is most appropriate?

2. Why is read-your-writes consistency often considered the minimum acceptable consistency level for user-facing applications?
