Vetora logo
๐Ÿ“ธConsistency & Transactions

Snapshot Isolation

Snapshot isolation (SI) is a multi-version concurrency control (MVCC) technique where each transaction reads from a consistent snapshot of the database as of its start time. Reads never block writes and writes never block reads, providing excellent read performance while preventing most concurrency anomalies -- except write skew.

Overview

Snapshot isolation, first formally described by Hal Berenson et al. in their 1995 critique of ANSI SQL isolation levels, is implemented via Multi-Version Concurrency Control (MVCC). Instead of maintaining a single current version of each row, the database keeps multiple versions -- each write creates a new version with a timestamp or transaction ID. When a transaction starts, it is assigned a snapshot timestamp. All reads within that transaction see the latest version of each row that was committed before the snapshot timestamp, providing a consistent, frozen view of the database.

The key property of SI is that readers and writers do not block each other. A long-running analytical query can read a consistent snapshot while concurrent transactions insert, update, and delete rows. The analytical query sees the database as it was at its start time, unaffected by later changes. This is a dramatic improvement over read-committed or locking-based isolation, where long reads can be blocked by writes and vice versa.

SI prevents most concurrency anomalies. Dirty reads are impossible because transactions only see committed versions. Non-repeatable reads are impossible because the snapshot is fixed at transaction start. Phantom reads are prevented because new rows inserted by concurrent transactions are not visible in the snapshot. However, SI allows one important anomaly: write skew. Write skew occurs when two transactions read overlapping data, make decisions based on their reads, and write to different rows -- the combined effect violates an invariant that would be maintained under serial execution.

Write skew is the primary reason SI is weaker than serializability. The classic example is two on-call doctors checking the count (both see 2), both going off-call (each writes to their own row), and the result being 0 on-call doctors. Under SI, both transactions see the snapshot state (2 doctors), so both proceed. Under serializability, one transaction would see the other's write and abort. PostgreSQL addresses this by offering both REPEATABLE READ (SI) and SERIALIZABLE (SSI), letting applications choose the appropriate level per transaction.

Key Points
  • 1MVCC keeps multiple versions of each row. Each transaction reads the version committed before its snapshot timestamp. Old versions are garbage-collected after no active transaction needs them (vacuum in PostgreSQL).
  • 2SI provides a consistent, point-in-time view of the database. A transaction sees ALL changes committed before it started and NONE committed after, even if concurrent transactions commit during its execution.
  • 3First-committer-wins rule: if two concurrent transactions write to the same row, the first to commit succeeds; the second is aborted with a serialization error. This prevents lost updates on the same row.
  • 4Write skew is the critical anomaly SI does NOT prevent. Two transactions can read overlapping data and write to different rows, violating a cross-row invariant. SI cannot detect this because neither transaction writes to a row the other read.
  • 5PostgreSQL's REPEATABLE READ is snapshot isolation. Oracle's default isolation (also called 'serializable' by Oracle, controversially) is snapshot isolation. MySQL InnoDB's REPEATABLE READ is approximately SI but has some differences in gap locking behavior.
  • 6SI significantly improves read performance for OLAP/analytical workloads because long-running queries do not block or get blocked by OLTP writes. This is the primary reason for MVCC's dominance in modern databases.
Simple Example

Counting Inventory While Processing Orders

An inventory report starts reading all product counts at 10:00 AM (snapshot timestamp). While the report runs (takes 5 minutes), customers place orders that decrement inventory. The report sees inventory as it was at 10:00 AM -- consistent and unaffected by the concurrent orders. The orders see the latest data and proceed normally. Without snapshot isolation, the report might see some products at pre-order counts and others at post-order counts, producing an inconsistent total.

Real-World Examples

PostgreSQL (MVCC + REPEATABLE READ)

PostgreSQL's MVCC implementation stores row versions with xmin (creating transaction ID) and xmax (deleting transaction ID). Each transaction has a snapshot of active transaction IDs at start time. When reading a row, PostgreSQL checks whether the row's xmin is committed and visible in the snapshot, and xmax is not. Vacuum removes dead row versions. REPEATABLE READ provides full SI; READ COMMITTED provides statement-level SI (each statement gets a fresh snapshot).

CockroachDB

CockroachDB uses MVCC at the storage layer (Pebble) to provide SI. Each key has multiple timestamped versions. Transactions read at a fixed timestamp and see only versions committed before that timestamp. CockroachDB goes beyond SI to provide serializable isolation by detecting read-write conflicts that could cause write skew (similar to PostgreSQL's SSI). Write intents (provisional writes) are used to detect conflicts.

Oracle Database

Oracle's default isolation level uses snapshot isolation implemented via undo segments (rollback segments). Read operations reconstruct the snapshot view by applying undo records to get the version visible at the transaction's start SCN (System Change Number). Oracle controversially calls this 'serializable' even though it is snapshot isolation and allows write skew.

Trade-Offs
AspectDescription
Read Performance vs Storage OverheadMVCC eliminates read-write contention, dramatically improving concurrent read throughput. The cost is storage overhead: multiple versions of each row consume disk space. Databases mitigate this with periodic garbage collection (PostgreSQL's VACUUM, InnoDB's purge thread), but long-running transactions prevent cleanup of versions they might still need.
Anomaly Prevention vs Write SkewSI prevents dirty reads, non-repeatable reads, phantom reads, and lost updates on the same row. But it allows write skew across different rows. Applications must either use SERIALIZABLE isolation for skew-prone transactions or add explicit locking (SELECT FOR UPDATE) on the rows that guard the invariant.
Long Transactions vs Vacuum PressureLong-running transactions hold their snapshot open, preventing the database from cleaning up old row versions. In PostgreSQL, this causes table bloat (dead tuples accumulate) and can degrade performance. Monitoring and canceling long-idle transactions is an operational necessity.
Consistency vs Freshness for ReadsSI provides a consistent snapshot, but the snapshot may be 'stale' relative to concurrent commits. A transaction started at 10:00 AM will not see a write committed at 10:01 AM, even if it reads the affected row at 10:05 AM. For most applications this is acceptable, but for time-critical decisions, the transaction should be restarted with a fresh snapshot.
Case Study

PostgreSQL MVCC: From Read Locks to Multi-Version Concurrency

Scenario

Early database systems used lock-based concurrency control: readers acquired shared locks, writers acquired exclusive locks, and long reads blocked writes (and vice versa). This was acceptable for OLTP workloads with short transactions but devastating for mixed OLTP+OLAP workloads, where a long analytical query could block all writes for minutes.

Solution

PostgreSQL (originally Postgres, by Michael Stonebraker) pioneered MVCC in relational databases. Instead of locking rows, writes create new versions. Readers see the version valid at their snapshot time. Writers abort only on write-write conflicts (first-committer-wins). This decouples reads from writes entirely -- a multi-hour analytical query runs alongside real-time OLTP without either blocking the other.

Outcome

MVCC became the dominant concurrency control mechanism in modern databases. Oracle, MySQL InnoDB, SQL Server (with RCSI), CockroachDB, and TiDB all use MVCC. PostgreSQL's implementation has been refined over 30+ years, with improvements to vacuum efficiency, snapshot management, and the addition of SSI for serializable isolation. The MVCC approach enabled the 'HTAP' (Hybrid Transactional/Analytical Processing) category, where one database serves both workloads.

Common Mistakes
  • โš Assuming snapshot isolation prevents all anomalies. Write skew is a real and often subtle bug. If your transaction checks a condition (count > 0, balance > amount) and then writes based on that condition, another concurrent transaction may invalidate the condition without SI detecting the conflict.
  • โš Confusing PostgreSQL's REPEATABLE READ with MySQL's REPEATABLE READ. PostgreSQL's REPEATABLE READ is true snapshot isolation. MySQL InnoDB's REPEATABLE READ uses MVCC for reads but next-key locking for writes, producing slightly different behavior (it prevents some phantoms that SI technically allows).
  • โš Not handling serialization errors. Under SI (and especially SSI), write-write conflicts cause transaction aborts. Applications must catch these errors and retry the transaction. Not retrying causes random, intermittent failures.
  • โš Running long transactions on MVCC databases without monitoring. A forgotten idle transaction with an open snapshot prevents vacuum from cleaning up old row versions, eventually bloating the table and degrading performance for all users.
Related Concepts

See Snapshot Isolation in action

Explore system design templates that use snapshot isolation and run traffic simulations to see how these concepts perform under real load.

Browse Templates

Compare snapshot isolation vs serializable for product inventory

Metrics to watch
write_skew_countread_latency_msthroughput_tpslock_wait_ms
Run Simulation
Test Your Understanding

1Under snapshot isolation, what happens when two concurrent transactions try to update the same row?

2What concurrency anomaly does snapshot isolation allow that serializability prevents?

Deeper Reading