What is important about NewSQL (CockroachDB, Spanner, YugabyteDB) regarding "Raft/Paxos consensus replication ensures that committed writ..."?

Raft/Paxos consensus replication ensures that committed writes survive any minority of node failures and that all replicas agree on operation order. This provides serializable isolation in a distributed system, something traditional async replication cannot guarantee.

What is important about NewSQL (CockroachDB, Spanner, YugabyteDB) regarding "Range-based sharding divides data into contiguous key ranges..."?

Range-based sharding divides data into contiguous key ranges, each managed by a consensus group. Unlike hash-based sharding, this preserves key ordering for efficient range scans and ordered queries. Ranges automatically split and rebalance as data grows.

What is important about NewSQL (CockroachDB, Spanner, YugabyteDB) regarding "Distributed two-phase commit (2PC) enables ACID transactions..."?

Distributed two-phase commit (2PC) enables ACID transactions that span multiple ranges (shards). The transaction record is itself replicated via consensus, so coordinator failure does not cause stuck transactions -- a critical improvement over traditional 2PC.

What is important about NewSQL (CockroachDB, Spanner, YugabyteDB) regarding "Google Spanner's TrueTime uses GPS and atomic clock synchron..."?

Google Spanner's TrueTime uses GPS and atomic clock synchronization to assign globally unique timestamps, enabling externally consistent reads without explicit coordination. CockroachDB's Hybrid Logical Clocks (HLC) achieve a similar result without specialized hardware.

What is important about NewSQL (CockroachDB, Spanner, YugabyteDB) regarding "Write latency is the primary trade-off: consensus requires m..."?

Write latency is the primary trade-off: consensus requires majority acknowledgment, adding 5-15ms per write within a region and 50-200ms across regions. This is acceptable for transactional workloads but makes NewSQL a poor fit for write-heavy, latency-sensitive workloads like real-time analytics or IoT ingestion.

What is important about NewSQL (CockroachDB, Spanner, YugabyteDB) regarding "NewSQL databases are PostgreSQL-compatible (CockroachDB and ..."?

NewSQL databases are PostgreSQL-compatible (CockroachDB and YugabyteDB use the PostgreSQL wire protocol), enabling migration from PostgreSQL without rewriting application SQL. This compatibility dramatically reduces the barrier to adoption for teams already using PostgreSQL.

Vetora

🆕Database Families

NewSQL (CockroachDB, Spanner, YugabyteDB)

NewSQL databases provide the SQL interface and ACID transactions of relational databases with the horizontal scalability of NoSQL systems. Using distributed consensus protocols like Raft and Paxos, they achieve serializable isolation across distributed nodes, bridging the gap between consistency and scale.

Overview

NewSQL emerged as a response to a false dichotomy that dominated database architecture from roughly 2008 to 2015: the belief that you must sacrifice ACID transactions to achieve horizontal scalability. During the NoSQL era, systems like Cassandra, DynamoDB, and MongoDB gained adoption by trading consistency for availability and scalability. But many critical workloads -- financial systems, inventory management, multi-tenant SaaS -- genuinely need both strong consistency and horizontal scale. NewSQL databases prove that this combination is achievable by applying distributed consensus protocols to the relational model.

The foundational innovation in NewSQL is the use of consensus protocols (Raft in CockroachDB and YugabyteDB, Paxos in Spanner) for data replication instead of the traditional leader-follower asynchronous replication used by PostgreSQL and MySQL. In a NewSQL database, data is divided into ranges (or tablets), and each range is replicated across multiple nodes using a consensus group. A write must be acknowledged by a majority of replicas in the consensus group before it is considered committed. This guarantees that committed data survives any minority of node failures and that all replicas agree on the order of operations -- providing strong consistency (serializable isolation) even in a distributed environment.

Data distribution in NewSQL databases uses range-based sharding: the key space is divided into contiguous ranges, and each range is assigned to a consensus group of nodes. Unlike hash-based sharding (used by Cassandra and DynamoDB), range-based sharding preserves key ordering, enabling efficient range scans and ordered queries within and across ranges. Ranges automatically split when they grow too large and are rebalanced across nodes to maintain even distribution. CockroachDB and YugabyteDB add automatic rebalancing based on load, moving hot ranges to less-loaded nodes. This automatic sharding and rebalancing means that developers interact with what appears to be a single logical SQL database, while the system transparently distributes data across dozens or hundreds of nodes.

Distributed transactions across multiple ranges use a variant of two-phase commit (2PC) coordinated by the transaction record. The key insight that makes NewSQL's 2PC reliable (unlike traditional 2PC which is blocking and fragile) is that the transaction record itself is replicated via consensus -- if the coordinator fails, another node can take over and complete or abort the transaction. Google Spanner adds TrueTime, a globally synchronized clock based on GPS and atomic clocks, to provide external consistency (stronger than serializable isolation) without requiring explicit coordination for read-only transactions. CockroachDB achieves a similar guarantee using Hybrid Logical Clocks (HLC) without specialized hardware, at the cost of occasional clock-uncertainty wait times.

The primary trade-off of NewSQL is write latency. Every write must achieve consensus across multiple replicas, adding network round-trip time. For a CockroachDB cluster in a single region, this adds 5-15ms per write. For a Spanner deployment spanning continents, cross-region consensus can add 50-200ms. This is significantly higher than a single-node PostgreSQL write (sub-1ms). However, for workloads that need both strong consistency and horizontal scale, this latency cost is the price of correctness -- and it is often acceptable when the alternative is building application-level consistency on top of an eventually-consistent database, which is both complex and error-prone.

Key Points

1Raft/Paxos consensus replication ensures that committed writes survive any minority of node failures and that all replicas agree on operation order. This provides serializable isolation in a distributed system, something traditional async replication cannot guarantee.
2Range-based sharding divides data into contiguous key ranges, each managed by a consensus group. Unlike hash-based sharding, this preserves key ordering for efficient range scans and ordered queries. Ranges automatically split and rebalance as data grows.
3Distributed two-phase commit (2PC) enables ACID transactions that span multiple ranges (shards). The transaction record is itself replicated via consensus, so coordinator failure does not cause stuck transactions -- a critical improvement over traditional 2PC.
4Google Spanner's TrueTime uses GPS and atomic clock synchronization to assign globally unique timestamps, enabling externally consistent reads without explicit coordination. CockroachDB's Hybrid Logical Clocks (HLC) achieve a similar result without specialized hardware.
5Write latency is the primary trade-off: consensus requires majority acknowledgment, adding 5-15ms per write within a region and 50-200ms across regions. This is acceptable for transactional workloads but makes NewSQL a poor fit for write-heavy, latency-sensitive workloads like real-time analytics or IoT ingestion.
6NewSQL databases are PostgreSQL-compatible (CockroachDB and YugabyteDB use the PostgreSQL wire protocol), enabling migration from PostgreSQL without rewriting application SQL. This compatibility dramatically reduces the barrier to adoption for teams already using PostgreSQL.

Simple Example

The Notarized Multi-Branch Bank Analogy

Imagine a bank with branches in New York, London, and Tokyo that all share the same account system. In a traditional bank, each branch has its own copy of the account ledger, and they sync overnight -- if you deposit in New York and check your balance in London before the sync, you see the old balance (eventual consistency). A NewSQL bank works differently: every transaction is notarized by a majority of branches before it is confirmed. When you deposit $100 in New York, the deposit is not confirmed until at least two of three branches (New York + London, or New York + Tokyo) acknowledge it. This means your balance is correct at every branch at all times, but the deposit takes a few extra seconds for the cross-branch confirmation. That delay is the consensus latency -- the price of global consistency.

Real-World Examples

Google Spanner

Spanner is Google's globally distributed NewSQL database, powering Google Ads, Google Play, and core infrastructure. Its defining feature is TrueTime, which uses GPS receivers and atomic clocks in every data center to provide globally synchronized timestamps with bounded uncertainty (typically less than 7ms). This enables externally consistent reads without requiring explicit distributed locking. Spanner provides 99.999% availability (five nines) through its Paxos-replicated architecture and has been in production since 2012.

CockroachDB (DoorDash)

DoorDash migrated from PostgreSQL to CockroachDB to support their rapidly growing order processing system across multiple regions. Single-node PostgreSQL could not handle the write throughput during peak ordering hours, and manual sharding would have required significant application changes. CockroachDB provided PostgreSQL-compatible SQL with automatic sharding and rebalancing. DoorDash's migration preserved all existing SQL queries and ORM code while gaining horizontal write scaling and multi-region resilience.

YugabyteDB (Kroger)

Kroger, one of the largest grocery chains in the US, uses YugabyteDB for real-time inventory management across thousands of stores. Each store's inventory data is stored in a YugabyteDB cluster that spans multiple data centers, providing strong consistency for inventory counts (preventing overselling) and automatic failover if a data center goes down. YugabyteDB's PostgreSQL compatibility allowed Kroger to reuse existing inventory management queries and tools.

Trade-Offs

Aspect	Description
Write Latency vs Strong Consistency	Every write requires consensus across multiple replicas (majority must acknowledge), adding 5-15ms within a region and 50-200ms across regions compared to sub-1ms for single-node PostgreSQL. This is the fundamental cost of distributed strong consistency. Applications must be designed to tolerate this latency, and hot-path operations should minimize cross-region writes.
Operational Simplicity vs Hardware Requirements	NewSQL databases simplify application development (standard SQL, automatic sharding) but require a minimum of 3 nodes for consensus (5 nodes for multi-region). The infrastructure cost is higher than a single PostgreSQL instance, and operational expertise for consensus-based systems differs from traditional database administration.
PostgreSQL Compatibility vs Feature Completeness	CockroachDB and YugabyteDB implement the PostgreSQL wire protocol and support most PostgreSQL SQL syntax, but they are not 100% compatible. Some PostgreSQL features (stored procedures, certain extensions, specific isolation level behaviors) may not be fully supported. Migration requires testing all queries and ORM interactions against the NewSQL database.
Scalability vs Single-Node Performance	For workloads that fit on a single well-provisioned server (128 cores, 1 TB RAM, NVMe), single-node PostgreSQL will outperform any NewSQL database due to zero network overhead for consensus. NewSQL's advantage materializes only when you need more write throughput, storage, or fault tolerance than a single node can provide.

Case Study

DoorDash Migration from PostgreSQL to CockroachDB

Scenario

DoorDash's order processing system ran on a single PostgreSQL instance that was reaching its limits during peak dinner hours. Write throughput saturated at 20,000 writes per second, and vertical scaling had reached the limits of available hardware. Manual application-level sharding would have required rewriting the order service, splitting transactions across shards, and building a custom routing layer -- a multi-quarter engineering project.

Solution

DoorDash migrated to CockroachDB, which provided PostgreSQL-compatible SQL with automatic range-based sharding. The migration preserved all existing SQL queries, Active Record ORM code, and database migration tooling. CockroachDB automatically distributed order data across nodes by primary key ranges, and the Raft consensus protocol ensured that order transactions maintained serializable isolation across shards. The order_id was used as the primary key to ensure even data distribution.

Outcome

DoorDash's order system scaled from 20,000 to over 100,000 writes per second by adding CockroachDB nodes, with linear scaling characteristics. Write latency increased from sub-1ms (single PostgreSQL) to 8-12ms (CockroachDB consensus), which was acceptable for order processing. The migration was completed in weeks rather than quarters because no application SQL was rewritten. Multi-region replication ensured that a data center failure would not interrupt order processing, a capability that would have been extremely complex to build on top of sharded PostgreSQL.

Common Mistakes

⚠Adopting NewSQL before needing it. If your workload fits comfortably on a single PostgreSQL instance (most applications do), NewSQL adds latency, operational complexity, and cost without benefit. Start with PostgreSQL and migrate to NewSQL when you genuinely cannot scale vertically any further.
⚠Expecting single-node PostgreSQL performance from a distributed NewSQL database. Consensus-based writes inherently add network round-trip latency. Design your application to batch writes where possible and avoid tight loops that issue sequential single-row writes.
⚠Ignoring data locality for multi-region deployments. Without pinning data to the region where it is most frequently accessed (using zone configurations or geo-partitioning), every read may cross regions, adding unnecessary latency. Configure locality settings to keep data near its primary users.
⚠Assuming 100% PostgreSQL compatibility. While CockroachDB and YugabyteDB support most PostgreSQL syntax, some features (specific pg_catalog views, certain extensions, advisory locks, temp tables in some modes) behave differently or are unsupported. Run comprehensive integration tests before migrating.

Related Concepts

ACID vs BASE Database Sharding Quorums Leader-Follower Replication PACELC Theorem

See NewSQL (CockroachDB, Spanner, YugabyteDB) in action

Explore system design templates that use newsql (cockroachdb, spanner, yugabytedb) and run traffic simulations to see how these concepts perform under real load.

Browse Templates

Compare NewSQL distributed transactions vs single-node relational performance

Metrics to watch

write_latency_p99transactions_per_secondreplication_lag

Run Simulation

Test Your Understanding

1What fundamental problem does NewSQL solve that NoSQL databases do not?

2Why does CockroachDB use range-based sharding instead of hash-based sharding?

3What is the primary trade-off when migrating from single-node PostgreSQL to CockroachDB?

Deeper Reading