Vetora logo
🔺Foundations

CAP Theorem

The CAP theorem states that a distributed data store can guarantee at most two of three properties -- Consistency, Availability, and Partition Tolerance -- during a network partition. Understanding CAP is essential for choosing the right trade-offs when designing distributed systems.

Overview

The CAP theorem, first conjectured by Eric Brewer in his 2000 PODC keynote and formally proven by Seth Gilbert and Nancy Lynch in 2002, establishes a fundamental impossibility result for distributed data stores. It states that any networked shared-data system can provide at most two of three guarantees simultaneously: Consistency (C), Availability (A), and Partition Tolerance (P). Because network partitions are inevitable in any distributed system, the practical choice during a partition is between consistency and availability.

Consistency in CAP means linearizability -- every read receives the most recent write or an error. This is stronger than the 'C' in ACID, which refers to database invariants. Availability means every request to a non-failing node receives a response, without guaranteeing it contains the latest write. Partition Tolerance means the system continues to operate despite arbitrary message loss or delay between nodes. A partition is not a theoretical edge case; it is a routine event in any system spanning multiple machines, racks, or data centers. Network cables are cut, switches fail, and cloud availability zones lose connectivity to each other.

The most common misconception about CAP is treating it as a permanent architectural choice -- 'pick 2 of 3.' In reality, partitions are rare events. When the network is healthy, a well-designed system can provide both consistency and availability. The theorem only forces a choice during an active partition. A CP system (like ZooKeeper) will reject writes or reads it cannot guarantee are consistent during a partition, sacrificing availability. An AP system (like DynamoDB in its default configuration) will continue serving requests during a partition, but different nodes may return different values for the same key, sacrificing consistency. After the partition heals, AP systems must reconcile divergent state, typically through conflict resolution strategies like last-writer-wins, vector clocks, or application-level merge functions.

The PACELC extension (proposed by Daniel Abadi in 2012) refines CAP by addressing system behavior when there is no partition. PACELC states: if there is a Partition, choose Availability or Consistency; Else, choose Latency or Consistency. This captures the reality that even in normal operation, replicating data synchronously (for consistency) adds latency compared to asynchronous replication (for lower latency). PACELC explains why systems like DynamoDB (PA/EL) and Cassandra (PA/EL) optimize for availability and latency, while Spanner (PC/EC) pays a latency cost for global consistency using TrueTime. Understanding PACELC helps designers make informed trade-offs for the 99.99% of the time when there is no partition, not just the rare failure scenarios CAP addresses.

Key Points
  • 1Consistency in CAP means linearizability: every read returns the most recent write or an error. This is different from ACID consistency, which refers to maintaining database invariants across transactions.
  • 2Availability means every request to a non-failing node receives a non-error response, though it may not reflect the most recent write. A node that returns stale data is available; one that returns an error is not.
  • 3Partition Tolerance is not optional in any real distributed system. Network partitions happen due to switch failures, cable cuts, cloud zone outages, and GC pauses. The practical choice is always between C and A during a partition.
  • 4CAP applies only during partitions. When the network is healthy, systems can and should provide both consistency and availability. Treating CAP as a permanent 2-of-3 choice is the most common misunderstanding.
  • 5CP systems (ZooKeeper, etcd, Spanner) refuse to serve requests they cannot guarantee are consistent during a partition. AP systems (DynamoDB, Cassandra) continue serving requests but may return stale or conflicting data.
  • 6The PACELC extension captures the latency-consistency trade-off during normal (non-partition) operation, making it more useful for day-to-day system design decisions than CAP alone.
Simple Example

The Two ATMs Problem

Imagine a bank with two ATMs connected by a network, both accessing the same account with a $500 balance. The network cable between them is cut (partition). A customer tries to withdraw $400 from ATM-A. The bank has two choices: (1) Allow the withdrawal -- the customer gets their money (available), but ATM-B still thinks the balance is $500, and a second $400 withdrawal there would overdraft the account (inconsistent). (2) Reject the withdrawal until the network is restored -- the account stays consistent, but the customer is denied service (unavailable). There is no option that gives both correct balances and uninterrupted service during the partition.

Real-World Examples

Amazon DynamoDB

DynamoDB is an AP system by default, prioritizing availability and partition tolerance. During a partition, DynamoDB continues accepting reads and writes on all reachable nodes using eventual consistency. Conflicts are resolved via last-writer-wins using vector clocks. DynamoDB also offers a strongly consistent read option that routes to the leader node, effectively providing CP behavior per-request at the cost of higher latency and reduced availability during partitions.

Google Spanner

Spanner is a CP system that provides globally consistent reads and writes using TrueTime, a GPS- and atomic-clock-synchronized time API. During a partition, Spanner will make affected data unavailable rather than serve inconsistent results. The trade-off is higher write latency (commits require cross-region Paxos consensus), but Spanner achieves 99.999% availability through redundant infrastructure, making the CP trade-off less painful in practice.

Apache ZooKeeper

ZooKeeper is a CP coordination service used for leader election, configuration management, and distributed locking. During a partition, if a ZooKeeper node is separated from the quorum (majority of nodes), it stops serving reads and writes to prevent returning stale coordination data. This CP choice is critical because stale leader-election data could cause split-brain, where two nodes both believe they are the leader.

Trade-Offs
AspectDescription
Consistency vs AvailabilityThe core CAP trade-off. CP systems reject requests during partitions to maintain correctness, which can mean downtime for affected users. AP systems continue serving requests but may return stale or conflicting data, requiring conflict resolution mechanisms and application-level compensation logic.
Latency Impact of CPMaintaining strong consistency in a distributed system typically requires synchronous replication or consensus protocols (Paxos, Raft). These add round-trip latency to every write and sometimes to reads. Cross-region consensus can add 50-200ms per operation, which may be unacceptable for latency-sensitive workloads.
Operational ComplexityAP systems require conflict resolution strategies (last-writer-wins, CRDTs, application-level merge) that add design and operational complexity. CP systems require quorum management, leader election, and careful partition detection. Neither choice eliminates complexity -- it shifts it to a different part of the system.
Application-Level CompensationAP systems push consistency responsibility to the application layer. Developers must handle read-repair, anti-entropy processes, and business-logic-level conflict resolution. For example, Amazon's shopping cart merges conflicting cart states by taking the union of items, preferring to show extra items over losing additions.
Case Study

Amazon's Dynamo Paper (2007) -- Shopping Cart Availability

Scenario

Amazon's e-commerce platform required that customers could always add items to their shopping cart, even during infrastructure failures and network partitions. A lost 'add to cart' operation directly translated to lost revenue. The existing relational database approach sacrificed availability for consistency, causing cart errors during peak traffic events like holiday sales.

Solution

Amazon built Dynamo, an AP key-value store that prioritized availability and partition tolerance over strong consistency. Dynamo uses consistent hashing for data partitioning, vector clocks for conflict detection, and a 'sloppy quorum' with hinted handoff to remain writable even when some nodes are unreachable. When conflicting versions of a shopping cart are detected (because writes occurred on different sides of a partition), the application merges them by taking the union of all items -- it is better to show an item the customer removed than to lose an item they added.

Outcome

Dynamo achieved 99.9th percentile read latency under 300ms and provided 'always-writable' availability during multiple infrastructure failures. The shopping cart error rate dropped by orders of magnitude. The Dynamo paper (published at SOSP 2007) became one of the most influential papers in distributed systems, directly inspiring open-source systems like Cassandra, Riak, and Voldemort, and influencing the design of DynamoDB.

Common Mistakes
  • Treating CAP as a permanent 2-of-3 choice. CAP only constrains behavior during an active network partition. During normal operation, systems can provide both consistency and availability. Design for the common case, not just the failure case.
  • Ignoring the PACELC extension. CAP says nothing about system behavior when there is no partition, which is the vast majority of the time. PACELC captures the latency-consistency trade-off that dominates day-to-day performance decisions.
  • Applying CAP to single-node systems. The CAP theorem is about distributed data stores with replication across network-connected nodes. A single PostgreSQL instance is not subject to CAP because there is no network partition to consider between replicas.
  • Confusing CAP consistency with other consistency models. CAP's 'C' means linearizability, which is the strongest consistency model. Weaker models like causal consistency, read-your-writes consistency, and eventual consistency each offer different guarantees and are not directly addressed by the CAP theorem.
Related Concepts

See CAP Theorem in action

Explore system design templates that use cap theorem and run traffic simulations to see how these concepts perform under real load.

Browse Templates

See how partition tolerance affects cache consistency

Metrics to watch
consistency_violationsavailability_pct
Run Simulation
Test Your Understanding

1What does the CAP theorem state about distributed data stores during a network partition?

2A distributed cache continues serving stale data when some nodes cannot communicate. Which CAP trade-off has it made?

3Which real-world system is classified as CP, using synchronized clocks for global consistency?

Deeper Reading