Discover how read replicas scale database read throughput by distributing queries across multiple copies of the data, enabling read-heavy applications to serve millions of users.
Read replicas are copies of a primary (leader) database that stay up to date through replication and serve read queries. The primary database handles all write operations (INSERT, UPDATE, DELETE), and these changes are replicated, usually asynchronously, to one or more read replicas. Read queries are then directed to the replicas instead of the primary, distributing the read load across multiple database instances.
Many production workloads are read-heavy. A social media platform might have a 100:1 read-to-write ratio: each post is written once but read by hundreds or thousands of followers. An e-commerce catalog is updated a few times per day but browsed by millions of shoppers. For these workloads, the primary database's CPU, memory, and I/O are consumed mostly by read queries. Adding read replicas scales read throughput roughly linearly, provided the primary can keep up with writes and replication: two replicas can handle roughly 2x the read traffic of a single database, three replicas handle 3x, and so on.
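The scaling arithmetic above can be turned into a rough sizing estimate. The sketch below is illustrative only: `replicas_needed`, its parameters, and the example numbers are assumptions, and it presumes every replica has the same capacity and that reads are spread evenly.

```python
import math


def replicas_needed(peak_read_qps: float,
                    per_replica_qps: float,
                    headroom: float = 0.3) -> int:
    """Estimate how many read replicas are needed for a peak read load.

    headroom is the fraction of each replica's capacity held in reserve
    (for failover, lag catch-up, and traffic spikes).
    """
    if per_replica_qps <= 0:
        raise ValueError("per_replica_qps must be positive")
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in [0, 1)")
    usable_qps = per_replica_qps * (1 - headroom)
    return max(1, math.ceil(peak_read_qps / usable_qps))


# Hypothetical numbers: 50,000 reads/s at peak, 10,000 reads/s per replica.
no_reserve = replicas_needed(50_000, 10_000, headroom=0.0)
half_reserve = replicas_needed(50_000, 10_000, headroom=0.5)
```

In practice the per-replica figure would come from load testing, since real replicas also spend capacity applying the replication stream.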
Replication can be synchronous or asynchronous. Synchronous replication guarantees that a write is committed on both the primary and the replica before acknowledging the write to the client. This provides strong consistency but increases write latency because each write must wait for the replica to confirm. Asynchronous replication commits the write on the primary immediately and sends it to replicas in the background. This minimizes write latency but introduces replication lag: the replica may be a few milliseconds to a few seconds behind the primary, meaning reads from the replica may return slightly stale data.
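The difference between the two modes can be seen in a small in-memory model. This is a toy sketch, not how any real database engine implements replication: the `Primary` and `Replica` classes, the `ship_log` method, and the list-based log are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    data: dict = field(default_factory=dict)
    applied: int = 0  # number of log entries applied so far

    def apply(self, log: list, upto: int) -> None:
        """Apply log entries [applied, upto) in order."""
        for key, value in log[self.applied:upto]:
            self.data[key] = value
        self.applied = max(self.applied, upto)


@dataclass
class Primary:
    replica: Replica
    synchronous: bool = False
    data: dict = field(default_factory=dict)
    log: list = field(default_factory=list)

    def write(self, key, value) -> None:
        self.data[key] = value
        self.log.append((key, value))
        if self.synchronous:
            # Synchronous mode: the write is not acknowledged (this method
            # does not return) until the replica has applied it.
            self.replica.apply(self.log, len(self.log))

    def ship_log(self) -> None:
        """Background replication step used in asynchronous mode."""
        self.replica.apply(self.log, len(self.log))

    def lag_entries(self) -> int:
        """Replication lag measured in unapplied log entries."""
        return len(self.log) - self.replica.applied


# Asynchronous: the replica is stale until the log is shipped.
async_primary = Primary(replica=Replica())
async_primary.write("balance", 150)
stale_read = async_primary.replica.data.get("balance")  # not yet replicated
async_primary.ship_log()
fresh_read = async_primary.replica.data.get("balance")

# Synchronous: the replica is current as soon as write() returns.
sync_primary = Primary(replica=Replica(), synchronous=True)
sync_primary.write("balance", 150)
```

Real systems measure lag in time or log position rather than entry counts, but the trade-off is the same: the synchronous write does extra work before returning, and the asynchronous write leaves a window in which replica reads are stale.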
Managing replication lag is the central challenge of read replica architectures. Applications must decide which queries can tolerate stale data (catalog browsing, dashboard aggregations, search results) and which require strong consistency (account balance checks, order status immediately after placement). A common pattern is to route reads that require fresh data to the primary and all other reads to replicas. When the application ensures that a user who just performed a write sees their own change on the next read, this is called read-your-writes consistency.
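A minimal sketch of this routing decision follows. It assumes the caller knows which reads need fresh data and says so with a flag; the `ReadWriteRouter` class, the `needs_fresh` parameter, and the host names are hypothetical, and the `SELECT` prefix check is a deliberate simplification.

```python
import itertools


class ReadWriteRouter:
    """Send writes and freshness-sensitive reads to the primary,
    and spread all other reads across replicas round-robin."""

    def __init__(self, primary: str, replicas: list[str]):
        self.primary = primary
        self._replicas = itertools.cycle(replicas) if replicas else None

    def route(self, sql: str, needs_fresh: bool = False) -> str:
        # Simplification: anything that is not a plain SELECT is treated
        # as a write. Locking reads such as SELECT ... FOR UPDATE would
        # need needs_fresh=True to reach the primary.
        is_read = sql.lstrip().upper().startswith("SELECT")
        if not is_read or needs_fresh or self._replicas is None:
            return self.primary
        return next(self._replicas)


router = ReadWriteRouter("primary", ["replica-1", "replica-2"])
write_target = router.route("UPDATE accounts SET balance = 150 WHERE id = 7")
balance_target = router.route("SELECT balance FROM accounts WHERE id = 7",
                              needs_fresh=True)
catalog_target = router.route("SELECT * FROM products WHERE category = 'books'")
```

Making freshness an explicit argument keeps the policy visible at each call site, at the cost of relying on developers to set it correctly.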
The Copy Center Analogy
Think of a popular textbook that every student in a university needs. Instead of having everyone line up at the single original (primary database), the university prints several copies and distributes them to different libraries across campus (read replicas). Students can read any copy at their nearest library without waiting. When the author releases an updated edition, the original is updated first, and new copies are distributed to the libraries (replication). There is a short period where some libraries still have the old edition (replication lag), but for most purposes, the content is close enough. If a student needs to verify the absolute latest version, they go to the main library where the original is kept (reading from primary).
Amazon RDS
Amazon's Relational Database Service supports up to 15 read replicas for MySQL and PostgreSQL, and up to 5 for Oracle. Amazon.com itself uses read replicas extensively for its product catalog: the catalog is updated by sellers and internal systems (writes to primary) but browsed by millions of shoppers (reads from replicas across multiple regions). Amazon Aurora, the cloud-native engine offered through RDS, replicates through a shared storage layer, and AWS describes its replica lag as typically well under 100 ms, which makes it suitable for many read-after-write use cases.
Facebook
Facebook operates one of the largest MySQL deployments in the world, with hundreds of read replicas per primary database. Their TAO (The Associations and Objects) caching layer sits in front of the replicas, further reducing database load. The architecture handles trillions of read queries per day. Replication lag is managed through a custom consistency protocol that ensures users see their own recent actions immediately.
GitHub
GitHub uses MySQL with multiple read replicas per shard. When a developer pushes code or opens a pull request (write), the change goes to the primary. When other developers browse repositories, view diffs, or search code (reads), these queries are distributed across replicas. GitHub monitors replication lag in real time and will redirect reads to the primary if a replica falls too far behind, ensuring users do not see confusingly stale data.
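Lag-aware routing of this kind can be sketched as a filter over per-replica lag measurements. This is not GitHub's implementation; the function name `pick_read_target`, the 1-second threshold, and the replica names are assumptions, and the lag values are presumed to come from some external monitor.

```python
import random
from typing import Mapping, Optional


def pick_read_target(replica_lag_s: Mapping[str, Optional[float]],
                     max_lag_s: float = 1.0,
                     primary: str = "primary",
                     rng=random) -> str:
    """Choose a replica whose measured lag is within max_lag_s.

    A lag of None means the measurement is unavailable, so that replica
    is skipped. If no replica qualifies, fall back to the primary.
    """
    healthy = [name for name, lag in replica_lag_s.items()
               if lag is not None and lag <= max_lag_s]
    if not healthy:
        return primary
    return rng.choice(healthy)


# Hypothetical measurements: one replica has fallen 30 seconds behind.
lags = {"replica-1": 0.2, "replica-2": 0.4, "replica-3": 30.0}
target = pick_read_target(lags)  # one of replica-1 or replica-2
```

Falling back to the primary protects correctness but shifts read load onto it, so a production router would likely also alert when the pool of healthy replicas shrinks.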
| Aspect | Description |
|---|---|
| Read Scalability vs Consistency | Asynchronous read replicas provide linear read scalability but serve stale data during replication lag. Synchronous replicas provide consistent reads but increase write latency and reduce write throughput. The choice depends on whether the application can tolerate eventual consistency for read queries. |
| Infrastructure Cost vs Performance | Each read replica is a full database instance requiring compute, memory, and storage resources. The cost scales linearly with the number of replicas. For workloads with moderate read traffic, a single database with proper indexing and caching may be more cost-effective than maintaining multiple replicas. |
| Operational Overhead | Each replica must be monitored for replication lag, disk space, query performance, and health. Schema migrations must be coordinated across all replicas. Replica promotion for failover must be tested and automated. The operational burden grows with each additional replica. |
| Query Routing Complexity | The application or a database proxy must decide which queries go to the primary and which go to replicas. This routing logic must handle edge cases like read-your-writes consistency, transaction isolation, and replica failover. Incorrect routing can lead to stale reads or unnecessary primary load. |
Shopify's Read Replica Strategy for Black Friday Traffic
Scenario
Shopify hosts over 2 million online stores that collectively experience a massive traffic spike during Black Friday and Cyber Monday. The read-to-write ratio during this period exceeds 200:1: millions of shoppers browse product pages, check inventory, and view cart totals, while the much smaller number of actual purchases generates the writes. During previous peak events, the primary MySQL database for each shard was approaching its read capacity limit.
Solution
Shopify added multiple read replicas to each MySQL shard and implemented intelligent query routing in their Rails application. Product catalog reads, inventory availability checks (with a tolerance for seconds-old data), and store configuration reads were routed to replicas. Order creation, payment processing, and inventory deduction writes were routed to the primary. A custom middleware tracked recent writes per user session and temporarily routed that user's reads to the primary for 5 seconds after each write, implementing read-your-writes consistency without requiring synchronous replication.
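The session-pinning idea described above can be sketched in a few lines. This is an illustrative Python model, not Shopify's Rails middleware: the `SessionPinning` class, its method names, and the in-memory dictionary are assumptions, and only the 5-second window comes from the description.

```python
import time


class SessionPinning:
    """Route a session's reads to the primary for a short window
    after that session's most recent write."""

    def __init__(self, pin_seconds: float = 5.0, clock=time.monotonic):
        self.pin_seconds = pin_seconds
        self._clock = clock          # injectable so tests can control time
        self._last_write = {}        # session_id -> time of last write

    def record_write(self, session_id: str) -> None:
        self._last_write[session_id] = self._clock()

    def read_target(self, session_id: str) -> str:
        last = self._last_write.get(session_id)
        if last is not None and self._clock() - last < self.pin_seconds:
            return "primary"
        # Window has passed (or the session never wrote): forget the entry.
        self._last_write.pop(session_id, None)
        return "replica"


pinning = SessionPinning()
pinning.record_write("session-42")
just_wrote = pinning.read_target("session-42")   # within the window
other_user = pinning.read_target("session-99")   # never wrote
```

A fixed window only works if replication lag reliably stays below it. With multiple application servers, the last-write timestamp would need to live somewhere shared, such as the session cookie or a cache.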
Outcome
The read replica strategy allowed Shopify to handle 4x more read traffic during Black Friday 2023 compared to the previous year without adding primary database capacity. Replication lag remained under 100 milliseconds for 99.9% of the peak period. The primary databases experienced 60% less CPU utilization because the bulk of read queries were offloaded to replicas. The cost of the additional replicas was a fraction of what it would have cost to vertically scale the primary databases to handle the same read volume.
1. A user deposits money into their bank account and immediately checks their balance. The balance shows the old amount. What read consistency pattern would fix this?
2. An application has 3 read replicas behind a load balancer. During a peak write burst, replication lag on Replica 3 grows to 30 seconds while Replicas 1 and 2 remain under 1 second. What should the query routing layer do?