Vetora logo
๐Ÿ›ก๏ธConsistency & Transactions

Database Isolation Levels

Isolation levels define how and when changes made by one transaction become visible to other concurrent transactions. The SQL standard defines four levels -- Read Uncommitted, Read Committed, Repeatable Read, and Serializable -- but real-world implementations (snapshot isolation, SSI) diverge significantly from the standard's definitions.

Overview

The SQL standard (SQL-92) defines four isolation levels based on the anomalies they prevent: Read Uncommitted (prevents nothing), Read Committed (prevents dirty reads), Repeatable Read (prevents dirty + non-repeatable reads), and Serializable (prevents all anomalies). However, Berenson et al.'s 1995 critique demonstrated that the standard's definitions are ambiguous and incomplete -- they fail to account for snapshot isolation, write skew, and other phenomena observed in real databases.

In practice, each database interprets isolation levels differently. PostgreSQL's Read Committed uses MVCC snapshots per statement; its Repeatable Read is true snapshot isolation (SI); its Serializable uses SSI. MySQL InnoDB's Repeatable Read uses MVCC for reads and next-key locking for writes, preventing some phantoms. Oracle's 'Serializable' is actually snapshot isolation (allowing write skew). SQL Server offers both lock-based and snapshot-based isolation (Read Committed Snapshot Isolation, RCSI). Understanding your specific database's behavior at each level is more important than memorizing the SQL standard.

The anomalies form a hierarchy. Dirty reads occur when a transaction reads uncommitted data from another transaction. Non-repeatable reads occur when a transaction reads the same row twice and gets different values (because another transaction committed a change in between). Phantom reads occur when a transaction re-executes a query and gets different rows (because another transaction inserted or deleted rows matching the predicate). Write skew occurs when two transactions read overlapping data and write to different rows, violating a cross-row invariant. Lost updates occur when two transactions read the same value, compute a new value based on it, and one overwrites the other's result.

Choosing an isolation level is a trade-off between correctness and concurrency. Stronger isolation prevents more anomalies but reduces throughput (more locking, more aborts). Weaker isolation allows more concurrency but shifts anomaly handling to the application. Most applications use Read Committed as the default (it prevents the most egregious anomaly -- dirty reads -- with minimal performance impact) and upgrade to Serializable only for transactions that maintain cross-row invariants (e.g., financial transfers, inventory reservations).

Key Points
  • 1Read Uncommitted: Allows dirty reads. Almost never used in practice because reading uncommitted data can lead to cascading aborts (a transaction reads a value that is later rolled back, making its own output invalid).
  • 2Read Committed (default in PostgreSQL, Oracle, SQL Server): Each SQL statement sees a fresh snapshot of committed data. Prevents dirty reads but allows non-repeatable reads (re-reading a row may return a different value) and phantoms (re-executing a query may return different rows).
  • 3Repeatable Read / Snapshot Isolation: The transaction sees a consistent snapshot from its start. Prevents dirty reads, non-repeatable reads, and phantoms (in MVCC implementations). Allows write skew. PostgreSQL's RR is SI; MySQL's RR uses gap locks instead of MVCC for phantom prevention.
  • 4Serializable: The strongest level. Equivalent to serial execution. Prevents all anomalies including write skew. Implemented via 2PL (lock-based), actual serial execution (single-threaded), or SSI (optimistic conflict detection).
  • 5Real databases diverge from the SQL standard. Oracle's 'serializable' is SI. PostgreSQL's 'repeatable read' is SI. MySQL's 'repeatable read' uses gap locks. Always consult your database's documentation, not the SQL standard, for actual behavior.
  • 6Per-transaction isolation is possible. A read-only dashboard query can use Read Committed (fast, loose) while a funds transfer uses Serializable (safe, strict). Most ORMs and connection libraries support per-transaction isolation level configuration.
Simple Example

The Non-Repeatable Read

Under Read Committed: Transaction A reads Alice's balance as $1000. Transaction B transfers $500 from Alice to Bob and commits. Transaction A reads Alice's balance again -- it is now $500. The balance changed between two reads in the same transaction. Under Repeatable Read, Transaction A's second read still returns $1000 because it sees the snapshot from when it started. Under both levels, the final committed state is correct; the difference is what A sees during its execution.

Real-World Examples

PostgreSQL

PostgreSQL offers Read Uncommitted (treated as Read Committed), Read Committed (default, per-statement snapshots), Repeatable Read (true SI, per-transaction snapshot), and Serializable (SSI with conflict detection). The distinction between RR and Serializable is significant: RR allows write skew, Serializable does not. PostgreSQL's documentation explicitly warns about write skew at the RR level.

MySQL InnoDB

MySQL InnoDB's default is Repeatable Read, which uses MVCC for reads and next-key locking for writes. Unlike PostgreSQL's RR, MySQL's RR prevents some phantoms via gap locks on index ranges. MySQL's Serializable level adds shared read locks, effectively implementing 2PL. The behavior differences between MySQL and PostgreSQL at the same named isolation level catch many developers by surprise.

SQL Server (RCSI)

SQL Server offers both lock-based and snapshot-based isolation. Read Committed Snapshot Isolation (RCSI) uses MVCC to serve read-committed queries without read locks -- similar to Oracle's behavior. Snapshot Isolation (SI) provides true snapshot isolation with first-committer-wins for write conflicts. Both are separate from lock-based Read Committed, which acquires and releases read locks per statement.

Trade-Offs
AspectDescription
Correctness vs ConcurrencyStronger isolation prevents more anomalies but reduces concurrent throughput. Under Serializable (2PL), read-write conflicts cause blocking; under SSI, they cause aborts. Under Read Committed, readers and writers rarely interfere. The optimal level depends on the application's tolerance for anomalies vs its throughput requirements.
Default Level ChoicePostgreSQL defaults to Read Committed; MySQL defaults to Repeatable Read. These defaults shape how applications are built. A PostgreSQL app must explicitly request RR/Serializable for snapshot reads; a MySQL app gets snapshot reads by default. Migrating between databases without understanding these defaults causes subtle bugs.
Standard vs ImplementationThe SQL standard defines isolation levels by anomaly prevention. Actual databases implement them via MVCC, locking, or hybrid approaches, producing behavior that does not cleanly map to the standard. Relying on the standard's definitions rather than your database's documentation leads to incorrect assumptions.
Global vs Per-TransactionSetting Serializable globally (all transactions) provides maximum safety but may reduce throughput significantly. Setting it per-transaction (only for critical operations) optimizes performance but requires developers to correctly identify which transactions need stronger isolation -- a source of bugs when invariant-critical transactions are missed.
Case Study

Berenson et al.'s Critique of ANSI SQL Isolation Levels

Scenario

The SQL-92 standard defined isolation levels based on three phenomena: dirty reads, non-repeatable reads, and phantom reads. Database vendors (Microsoft, Oracle, Sybase) found that these definitions were ambiguous and incomplete -- they did not account for phenomena observed in MVCC-based implementations like snapshot isolation, and they used imprecise English rather than formal specifications.

Solution

Berenson, Bernstein, Gray, Melton, O'Neil, and O'Neil published 'A Critique of ANSI SQL Isolation Levels' in 1995. They formally defined anomalies (including write skew and lost updates), showed that snapshot isolation prevents more anomalies than the standard's Repeatable Read but allows write skew, and proposed a clearer taxonomy. They demonstrated that the standard's definition of Serializable was correct but the intermediate levels were underspecified.

Outcome

The paper became the definitive reference for understanding transaction isolation. It revealed that Oracle's 'serializable' was actually SI (not true serializability), leading to years of debate. The paper's anomaly taxonomy (P0-P3, A1-A5B) is still used in database research. It directly motivated the development of SSI (to close the write-skew gap in SI) and influenced the design of every modern database's isolation level documentation.

Common Mistakes
  • โš Assuming isolation level names are portable across databases. PostgreSQL's Repeatable Read, MySQL's Repeatable Read, and Oracle's Serializable all provide snapshot isolation but with different nuances. Test your application against your specific database, not the SQL standard.
  • โš Using Read Committed for transactions that check-then-write. Under RC, two concurrent transactions can both read the same value (e.g., inventory=10), both pass a check (inventory > 0), and both write (inventory=9), resulting in a lost update. Use Repeatable Read or SELECT FOR UPDATE.
  • โš Not distinguishing between read-only and read-write transactions. Read-only queries rarely need Serializable isolation; they do not produce write anomalies. Overusing Serializable for read-only queries adds unnecessary overhead.
  • โš Ignoring the different locking behaviors. MySQL's Repeatable Read uses gap locks that can cause deadlocks not seen in PostgreSQL's MVCC-based RR. Understanding your database's specific locking behavior is essential for debugging production deadlocks.
Related Concepts

See Database Isolation Levels in action

Explore system design templates that use database isolation levels and run traffic simulations to see how these concepts perform under real load.

Browse Templates

Step through isolation levels from Read Uncommitted to Serializable

Metrics to watch
dirty_read_countphantom_read_countthroughput_tpsabort_rate_pct
Run Simulation
Test Your Understanding

1Which isolation level is the default in PostgreSQL?

2Oracle's 'SERIALIZABLE' isolation level actually provides which guarantee?

Deeper Reading