PACELC Theorem

PACELC extends the CAP theorem to cover system behavior when there is no partition: during a Partition choose Availability or Consistency; Else choose Latency or Consistency. This model captures the trade-offs that dominate everyday distributed system design.

Overview

The PACELC theorem, proposed by Daniel Abadi in 2012, extends the CAP theorem to address a critical gap: what trade-offs does a distributed system face when there is no partition? CAP tells us that during a partition we must choose between availability and consistency, but partitions are rare events. For the vast majority of operating time, the system is healthy -- and during that time, the dominant trade-off is between latency and consistency. PACELC captures both scenarios in a single framework: if Partition, choose A or C; Else, choose L or C.

The 'Else' clause is where PACELC becomes genuinely useful for system design. Synchronous replication guarantees that all replicas have the same data before acknowledging a write, providing strong consistency (EC) but adding network round-trip latency to every operation. For a system replicating across regions, this can mean 50-200ms added to every write. Asynchronous replication acknowledges writes immediately on the primary and propagates changes in the background, providing low latency (EL) but allowing replicas to serve stale data during the replication window. This latency-consistency trade-off affects every single request, not just requests during partitions, making it far more impactful on user experience and system performance.
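As a rough illustration of this trade-off, the sketch below models when a client receives its write acknowledgement under each replication mode. The function name `write_ack_latency_ms` and all timing figures are hypothetical, chosen only to fall in the cross-region range mentioned above; it is a back-of-the-envelope model, not a measurement of any real system.

```python
def write_ack_latency_ms(local_commit_ms, replica_rtts_ms, synchronous):
    """Estimate when the client receives an acknowledgement for one write.

    local_commit_ms: time to persist the write on the primary.
    replica_rtts_ms: round-trip time from the primary to each replica.
    synchronous:     True  -> wait for every replica (EC),
                     False -> acknowledge after the local commit (EL).
    """
    if synchronous and replica_rtts_ms:
        # Replicas are contacted in parallel, so the slowest one sets the pace.
        return local_commit_ms + max(replica_rtts_ms)
    return local_commit_ms


# Hypothetical numbers: 2 ms local commit, replicas 70 ms and 140 ms away.
rtts = [70.0, 140.0]
ec_latency = write_ack_latency_ms(2.0, rtts, synchronous=True)
el_latency = write_ack_latency_ms(2.0, rtts, synchronous=False)
print(f"EC (synchronous):  {ec_latency} ms")
print(f"EL (asynchronous): {el_latency} ms")
```

In this model the EC cost is paid on every write, while the EL write returns after the local commit and leaves the replicas to catch up in the background.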

PACELC classifies systems into four categories based on their choices in both scenarios. A PA/EL system (like DynamoDB or Cassandra in default configuration) chooses availability during partitions and low latency during normal operation -- it optimizes for responsiveness at the cost of consistency in both cases. A PC/EC system (like Google Spanner) chooses consistency in both scenarios -- it sacrifices availability during partitions and accepts higher latency during normal operation to guarantee linearizable reads and writes. A PA/EC system (like MongoDB with majority write concern) stays available during partitions but enforces consistency during normal operation. The fourth category, PC/EL, gives up availability during partitions but favors low latency otherwise; Abadi's paper cites Yahoo's PNUTS as an example. These classifications are more descriptive than CAP's binary CP/AP labels because they also describe how a system behaves when no partition is present.
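One way to make the classification concrete is to treat it as two independent choices. The sketch below is illustrative only: the type names (`PartitionChoice`, `ElseChoice`, `Pacelc`) are invented for this example, and the `EXAMPLES` table simply restates the default-configuration classifications given in the text.

```python
from dataclasses import dataclass
from enum import Enum


class PartitionChoice(Enum):
    AVAILABILITY = "A"
    CONSISTENCY = "C"


class ElseChoice(Enum):
    LATENCY = "L"
    CONSISTENCY = "C"


@dataclass(frozen=True)
class Pacelc:
    on_partition: PartitionChoice  # the "PA" or "PC" half
    otherwise: ElseChoice          # the "EL" or "EC" half

    def label(self) -> str:
        return f"P{self.on_partition.value}/E{self.otherwise.value}"


# Classifications as stated in the text (default or named configuration).
EXAMPLES = {
    "DynamoDB": Pacelc(PartitionChoice.AVAILABILITY, ElseChoice.LATENCY),
    "Cassandra": Pacelc(PartitionChoice.AVAILABILITY, ElseChoice.LATENCY),
    "Google Spanner": Pacelc(PartitionChoice.CONSISTENCY, ElseChoice.CONSISTENCY),
    "MongoDB (majority write concern)": Pacelc(
        PartitionChoice.AVAILABILITY, ElseChoice.CONSISTENCY
    ),
}

# Two binary choices give the four possible categories.
ALL_LABELS = [Pacelc(p, e).label() for p in PartitionChoice for e in ElseChoice]

for system, classification in EXAMPLES.items():
    print(f"{system}: {classification.label()}")
```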

Understanding PACELC changes how you approach system design conversations. Instead of asking 'is this system CP or AP?' -- which only matters during rare failure events -- you ask 'what latency am I paying for consistency during normal operation?' For a payment processing system, paying 50ms extra per transaction for global consistency (PC/EC) is worthwhile because incorrect balances are unacceptable. For a social media feed, serving slightly stale content with sub-10ms reads (PA/EL) is the right choice because users will not notice a post appearing 500ms late, but they will notice a 200ms page load. PACELC makes these everyday design decisions explicit.

Key Points
  • PACELC extends CAP by adding the Else clause: when there is no partition, the trade-off shifts to Latency versus Consistency. This covers the vast majority of operating time, which CAP does not address.
  • PA/EL systems (DynamoDB, Cassandra) optimize for availability during partitions and low latency during normal operation. They accept eventual consistency as the cost of responsiveness.
  • PC/EC systems (Spanner, CockroachDB) enforce consistency in all scenarios. They sacrifice availability during partitions and accept higher latency during normal operation as the price of synchronous replication.
  • PA/EC systems (MongoDB with majority write concern) blend approaches: available during partitions but consistent during normal operation. This provides a useful middle ground for many workloads.
  • The 'EL' choice is not about being sloppy with data -- it is a deliberate architectural decision that low latency matters more than perfect consistency for a specific workload, with mechanisms to converge to consistency eventually.
  • PACELC classifications are not fixed properties of a database product. Many systems (Cassandra, DynamoDB, MongoDB) allow per-query or per-table consistency tuning, letting developers choose different PACELC trade-offs for different operations.
Simple Example

The News Wire Analogy

Consider a global news wire service with offices in New York and London. During normal operation (Else clause), the editors face a choice: publish stories instantly to the local office and sync later (EL -- readers see news faster but the two offices may briefly show different versions), or wait for both offices to confirm they have the same version before publishing (EC -- slower but globally consistent). During a transatlantic cable outage (Partition), they choose between: keep publishing locally and merge later (PA -- available but inconsistent), or halt publication until the link is restored (PC -- consistent but unavailable). PACELC captures both decisions: the news wire is PA/EL if it prioritizes speed in both scenarios.
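The analogy can be turned into a toy model. The sketch below is a deliberately simplified two-office simulation; the class names `Office` and `NewsWire` and the story text are invented for illustration, and no real networking or conflict handling is modeled.

```python
class Office:
    def __init__(self, name):
        self.name = name
        self.stories = {}  # story id -> text


class NewsWire:
    """Two offices; mode is "EL" (publish locally, sync later) or "EC"."""

    def __init__(self, mode):
        self.mode = mode
        self.new_york = Office("New York")
        self.london = Office("London")
        self.pending = []  # updates not yet delivered to the other office

    def publish(self, origin, story_id, text):
        other = self.london if origin is self.new_york else self.new_york
        origin.stories[story_id] = text
        if self.mode == "EC":
            # Publication completes only once the remote office has the copy.
            other.stories[story_id] = text
        else:
            # Publish locally now; the remote office catches up on sync().
            self.pending.append((other, story_id, text))

    def sync(self):
        for office, story_id, text in self.pending:
            office.stories[story_id] = text
        self.pending.clear()


el_wire = NewsWire("EL")
el_wire.publish(el_wire.new_york, "s1", "Markets rally")
before_sync = el_wire.london.stories.get("s1")  # London has not seen it yet
el_wire.sync()
after_sync = el_wire.london.stories.get("s1")

ec_wire = NewsWire("EC")
ec_wire.publish(ec_wire.new_york, "s1", "Markets rally")
ec_immediate = ec_wire.london.stories.get("s1")
```

In EL mode London briefly lacks the story until `sync()` runs; in EC mode both offices agree as soon as `publish()` returns, at the cost of waiting on the remote office.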

Real-World Examples

Amazon DynamoDB

DynamoDB is commonly classified as PA/EL. With multi-region global tables, each region keeps accepting reads and writes during a partition (PA) and reconciles concurrent updates via last-writer-wins. During normal operation, reads are eventually consistent by default, which keeps read latency in the single-digit milliseconds (EL). The optional strongly consistent read mode shifts individual reads to EC at the cost of higher latency and twice the read capacity, demonstrating per-request PACELC tuning.
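The per-request switch is a single parameter on the read. The sketch below only builds the arguments for a DynamoDB `GetItem` call rather than contacting AWS, so it runs without credentials. The helper names, the table name `Accounts`, and the key are hypothetical. The capacity calculation follows the documented rule of one read capacity unit per 4 KB for a strongly consistent read and half that for an eventually consistent one; check current AWS documentation before relying on it.

```python
import math


def build_get_item_request(table_name, key, strongly_consistent=False):
    """Build keyword arguments for the DynamoDB GetItem operation.

    With boto3 these could be passed as
    boto3.client("dynamodb").get_item(**request).
    """
    return {
        "TableName": table_name,
        "Key": key,
        # False (the default) is an eventually consistent read (EL);
        # True requests a strongly consistent read (EC).
        "ConsistentRead": strongly_consistent,
    }


def read_capacity_units(item_size_bytes, strongly_consistent):
    """Capacity consumed by one read, rounded up to 4 KB blocks."""
    blocks = max(1, math.ceil(item_size_bytes / 4096))
    return blocks * (1.0 if strongly_consistent else 0.5)


# Hypothetical table and key, for illustration only.
request = build_get_item_request(
    "Accounts", {"account_id": {"S": "acct-123"}}, strongly_consistent=True
)
print(request)
print(read_capacity_units(6000, strongly_consistent=True))
print(read_capacity_units(6000, strongly_consistent=False))
```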

Google Spanner

Spanner is classified as PC/EC. During partitions, affected Paxos groups become unavailable until quorum is restored (PC). During normal operation, every write requires Paxos consensus across replicas and is timestamped using TrueTime, adding measurable latency (EC). Spanner accepts this latency cost because its target workloads -- financial transactions, inventory management -- require globally consistent reads. The TrueTime infrastructure keeps this latency manageable (typically 10-15ms for cross-region commits).
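Part of the EC cost comes from the commit-wait rule described in the Spanner paper: a transaction's timestamp is taken from the upper bound of the clock's uncertainty interval, and the commit is not made visible until that timestamp is guaranteed to be in the past. The sketch below is a toy model of that rule only. `ToyTrueTime` is a fake clock advanced by hand, the 4 ms uncertainty is an arbitrary assumption, and nothing here reflects Spanner's actual API or its Paxos machinery.

```python
from dataclasses import dataclass


@dataclass
class Interval:
    earliest: float
    latest: float


class ToyTrueTime:
    """A fake clock whose reading is uncertain by +/- epsilon_ms."""

    def __init__(self, epsilon_ms):
        self.epsilon_ms = epsilon_ms
        self.real_ms = 0.0  # advanced manually; no real sleeping

    def now(self):
        return Interval(self.real_ms - self.epsilon_ms,
                        self.real_ms + self.epsilon_ms)

    def advance(self, ms):
        self.real_ms += ms


def commit_with_wait(clock, tick_ms=1.0):
    """Pick a commit timestamp, then wait until it is definitely in the past.

    Returns (commit_timestamp, waited_ms).
    """
    start = clock.real_ms
    commit_ts = clock.now().latest
    # Commit wait: hold the commit until even the earliest possible
    # current time is later than the chosen timestamp.
    while clock.now().earliest <= commit_ts:
        clock.advance(tick_ms)
    return commit_ts, clock.real_ms - start


clock = ToyTrueTime(epsilon_ms=4.0)
commit_ts, waited_ms = commit_with_wait(clock)
print(f"commit timestamp {commit_ts} ms, waited {waited_ms} ms")
```

In this model the wait is roughly twice the clock uncertainty, which is why tight clock bounds matter for keeping EC latency low.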

Apache Cassandra

Cassandra is PA/EL by default but offers tunable consistency per query. With consistency level ONE, reads return data from the nearest replica (PA/EL -- fast but potentially stale). With consistency level QUORUM, reads wait for a majority of replicas to respond (moving toward EC behavior). With ALL, every replica must respond (fully EC). A read is guaranteed to see the latest acknowledged write only when the read and write levels overlap, that is, when the number of replicas read plus the number written exceeds the replication factor. This per-query tuning lets developers choose different PACELC trade-offs for different operations within the same cluster.
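Whether a given pair of read and write levels behaves like EC comes down to the overlap rule R + W > N, where N is the replication factor. The sketch below checks that rule for the three levels named above. The function names are invented for this example, and the model ignores real-world complications such as node failures and timestamp-based conflict resolution.

```python
def quorum(replication_factor):
    """Smallest number of replicas that forms a majority."""
    return replication_factor // 2 + 1


# Replicas that must respond for each consistency level.
LEVELS = {
    "ONE": lambda n: 1,
    "QUORUM": quorum,
    "ALL": lambda n: n,
}


def replicas_required(level, replication_factor):
    return LEVELS[level](replication_factor)


def reads_see_latest_write(read_level, write_level, replication_factor):
    """True when every read set must overlap every write set (R + W > N)."""
    r = replicas_required(read_level, replication_factor)
    w = replicas_required(write_level, replication_factor)
    return r + w > replication_factor


for read_level, write_level in [("ONE", "ONE"), ("QUORUM", "QUORUM"), ("ONE", "ALL")]:
    ok = reads_see_latest_write(read_level, write_level, replication_factor=3)
    print(f"read={read_level:<6} write={write_level:<6} overlap guaranteed: {ok}")
```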

Trade-Offs
| Aspect | Description |
| --- | --- |
| Latency vs Consistency (Normal Operation) | The central PACELC trade-off during non-partition operation. Synchronous replication (EC) adds one or more network round trips to every write. For cross-region replication, this can be 50-200ms per operation. Asynchronous replication (EL) completes writes in under 5ms but allows a consistency window where replicas may serve stale data. |
| Per-Query Tuning Complexity | Systems that allow per-query consistency levels (Cassandra, DynamoDB) provide flexibility but shift the consistency decision to application developers. Each query must be annotated with the correct consistency level, and mistakes can cause subtle bugs -- reading with weak consistency right after a strong-consistency write may still return stale data if the read hits a different replica. |
| Replication Topology Choices | The EL/EC choice influences replication topology. EL systems often use multi-leader or leaderless replication for lower latency. EC systems typically use single-leader replication with synchronous followers. Multi-leader replication provides lower write latency but requires conflict resolution, adding application complexity. |
| Tail Latency Impact | EC systems pay a consistency tax on every operation, which directly impacts tail latency (p99, p99.9). When a synchronous replica is slow due to GC pauses or disk I/O, the write latency is dominated by the slowest replica. EL systems avoid this because writes do not wait for replica acknowledgment, resulting in more predictable latency distributions. |
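The tail-latency effect can be seen in a small simulation. The sketch below is a toy model under stated assumptions: each node normally takes 1-3 ms, 0.5% of operations hit a hypothetical 100 ms stall (standing in for a GC pause or slow disk), network time is ignored, and all function names are invented for this example.

```python
import math
import random


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(len(ordered) * pct / 100))
    return ordered[rank - 1]


def sample_node_latency_ms(rng):
    """Mostly fast, with an occasional hypothetical 100 ms stall."""
    base = rng.uniform(1.0, 3.0)
    return base + (100.0 if rng.random() < 0.005 else 0.0)


def simulate(num_writes=10_000, num_replicas=2, seed=7):
    rng = random.Random(seed)
    el, ec = [], []
    for _ in range(num_writes):
        primary = sample_node_latency_ms(rng)
        replicas = [sample_node_latency_ms(rng) for _ in range(num_replicas)]
        el.append(primary)                    # EL: acknowledge after the primary
        ec.append(max([primary] + replicas))  # EC: wait for the slowest node
    return el, ec


el, ec = simulate()
for name, data in (("EL", el), ("EC", ec)):
    print(name,
          "p50 =", round(percentile(data, 50), 1),
          "p99 =", round(percentile(data, 99), 1))
```

Because an EC write waits for the slowest of three nodes, it is roughly three times as likely to include a stall, so with these assumed parameters the stall would be expected to show up in the EC p99 well before it appears in the EL p99.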
Case Study

Cassandra Tunable Consistency at Netflix

Scenario

Netflix operates one of the largest Cassandra deployments in the world, storing hundreds of terabytes across multiple AWS regions. Different data types have drastically different consistency requirements: a user's viewing history needs to be available across regions immediately (for personalization), while billing records require strong consistency to avoid duplicate charges. A single PA/EL or PC/EC classification could not serve both workloads.

Solution

Netflix leverages Cassandra's tunable consistency to apply different PACELC trade-offs per data type. Viewing history and personalization data use consistency level LOCAL_ONE (PA/EL), reading from the nearest replica for sub-5ms latency. Critical account data uses LOCAL_QUORUM (moving toward EC), ensuring a majority of local replicas agree before returning. Cross-region replication runs asynchronously for all data, but a custom reconciliation service detects and resolves conflicts for sensitive data within seconds.
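A per-data-type policy of this kind can be expressed as a simple lookup. The sketch below is not Netflix's code: the `DataClass` names and the mapping are illustrative assumptions based on the description above. Only the consistency level names (`LOCAL_ONE`, `LOCAL_QUORUM`) are real Cassandra levels.

```python
from enum import Enum


class DataClass(Enum):
    VIEWING_HISTORY = "viewing_history"
    PERSONALIZATION = "personalization"
    ACCOUNT = "account"
    BILLING = "billing"


# Illustrative mapping from data type to Cassandra consistency level.
CONSISTENCY_POLICY = {
    DataClass.VIEWING_HISTORY: "LOCAL_ONE",   # favor latency (EL)
    DataClass.PERSONALIZATION: "LOCAL_ONE",   # favor latency (EL)
    DataClass.ACCOUNT: "LOCAL_QUORUM",        # move toward EC
}

# Anything not explicitly listed gets the stricter level.
STRICT_DEFAULT = "LOCAL_QUORUM"


def consistency_for(data_class):
    """Look up the consistency level, failing safe to the stricter one."""
    return CONSISTENCY_POLICY.get(data_class, STRICT_DEFAULT)


for data_class in DataClass:
    print(f"{data_class.value}: {consistency_for(data_class)}")
```

Defaulting to the stricter level means a newly added data type pays extra latency until someone deliberately relaxes it, rather than silently serving stale data.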

Outcome

The tunable approach allowed Netflix to serve personalization recommendations with sub-10ms p99 latency while maintaining strong consistency for billing and account operations. During AWS region failures (partitions), the PA behavior ensured that streaming and recommendations continued operating from the surviving region, while billing operations were briefly paused (PC behavior applied selectively) until the partition healed. This hybrid PACELC strategy avoided the false choice of a single consistency model for the entire platform.

Common Mistakes
  • Ignoring the Else clause and only reasoning about partition behavior. Partitions are rare; the latency-consistency trade-off during normal operation affects every request and dominates system performance. Design for the Else case first.
  • Assuming a database's PACELC classification is fixed. Many systems offer tunable consistency per query, per table, or per keyspace. DynamoDB, Cassandra, and MongoDB all allow developers to choose different trade-offs for different operations.
  • Choosing EC (strong consistency) everywhere out of caution. The latency cost of synchronous replication is real and compounding -- a request that calls five services in sequence, each adding 20ms for EC, adds 100ms to total request latency. Apply EC only where correctness requires it.
  • Conflating replication lag with inconsistency. A 50ms replication lag in an EL system does not mean the system is 'inconsistent' -- it means there is a brief window where replicas may serve slightly stale data. For most workloads, this window is imperceptible to users.
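The compounding arithmetic in the third point depends on how the services are called. The sketch below uses the same hypothetical figures (five services, 20 ms of EC overhead each) and an invented helper name to contrast sequential calls with fully parallel ones.

```python
def added_latency_ms(per_service_overhead_ms, sequential):
    """Extra request latency contributed by per-service consistency overhead.

    Sequential calls add up; fully parallel calls cost only the slowest one.
    """
    if not per_service_overhead_ms:
        return 0.0
    if sequential:
        return float(sum(per_service_overhead_ms))
    return float(max(per_service_overhead_ms))


overheads = [20.0] * 5  # five services, 20 ms of EC overhead each
sequential_cost = added_latency_ms(overheads, sequential=True)
parallel_cost = added_latency_ms(overheads, sequential=False)
print(f"sequential: +{sequential_cost} ms, parallel: +{parallel_cost} ms")
```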
Test Your Understanding

1. What does the 'EL' in a PA/EL system classification mean?

2. Google Spanner is classified as PC/EC. What does this mean in practice?

3. Why is PACELC considered more useful than CAP for day-to-day system design?