5 min read

#10. Isolation Levels & Anomalies - Safety vs Concurrency

The One Thing to Remember

Isolation levels are a knob between correctness and performance. Higher isolation = fewer anomalies but more conflicts. Most apps use READ COMMITTED; switch to SERIALIZABLE only when correctness is critical and you can handle retries.


Building on Article 9

In Article 9: ACID Transactions, you learned what ACID means. But here's the question: What are the different isolation levels, and what anomalies does each prevent?

Understanding isolation levels prevents "heisenbugs"—bugs that don't happen in testing but appear in production under load.

Previous: Article 9 - ACID Transactions


Why This Matters (A Subtle Bug Story)

I once debugged a production issue where a financial report showed different totals when run twice. The mystery: The report ran multiple queries in one transaction, but data changed between queries. The fix? Use REPEATABLE READ isolation. The report now sees a consistent snapshot.

This isn't academic knowledge—it's the difference between:

  • Debugging data corruption in hours vs days

    • Understanding isolation = you know what anomalies to look for
    • Not understanding = you blame the application, the network, everything
  • Choosing the right isolation level

    • Understanding trade-offs = you pick the right level
    • Not understanding = you use SERIALIZABLE everywhere, performance suffers
  • Handling serialization failures correctly

    • Understanding retries = you handle conflicts properly
    • Not understanding = you get errors, blame the database

The Three Anomalies

1. Dirty Read

Transaction A                      Transaction B
═══════════════                    ═══════════════

                                   BEGIN
                                   UPDATE balance = 0 
                                     WHERE id = 1
                                   -- Not committed yet!

SELECT balance WHERE id = 1
→ Returns 0  ← DIRTY READ!

                                   ROLLBACK
                                   -- Change was undone!

-- Transaction A made decision based on
-- data that NEVER EXISTED in the DB!

PREVENTED BY: READ COMMITTED and above

Real example: A fraud check sees a zero balance during a failed transfer, flags the account incorrectly.

2. Non-Repeatable Read

Transaction A                      Transaction B
═══════════════                    ═══════════════

BEGIN

SELECT balance WHERE id = 1
→ Returns $100

                                   BEGIN
                                   UPDATE balance = $50 
                                     WHERE id = 1
                                   COMMIT

SELECT balance WHERE id = 1
→ Returns $50  ← DIFFERENT!

-- Same query, same transaction,
-- different results!

COMMIT

PREVENTED BY: REPEATABLE READ and above

Real example: A report calculates totals twice in the same transaction, gets different numbers.

3. Phantom Read

Transaction A                      Transaction B
═══════════════                    ═══════════════

BEGIN

SELECT COUNT(*) FROM orders 
  WHERE status = 'pending'
→ Returns 5

                                   BEGIN
                                   INSERT INTO orders 
                                     (status) VALUES ('pending')
                                   COMMIT

SELECT COUNT(*) FROM orders 
  WHERE status = 'pending'
→ Returns 6  ← NEW ROW APPEARED!

-- New rows "appeared" in Transaction A's view

COMMIT

PREVENTED BY: SERIALIZABLE (or PostgreSQL's REPEATABLE READ)

Real example: Pagination returns duplicate or missing items when data changes between pages.


Isolation Levels in Detail

READ COMMITTED (PostgreSQL Default)

┌─────────────────────────────────────────────────────────────────┐
│ READ COMMITTED - "Only trust committed data"                    │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│ Sees: Only committed changes (at statement start)               │
│                                                                 │
│ Anomalies possible:                                             │
│ ✗ Dirty reads (prevented)                                       │
│ ✓ Non-repeatable reads                                          │
│ ✓ Phantom reads                                                 │
│                                                                 │
│ PostgreSQL default: YES                                         │
│ MySQL default: NO (uses REPEATABLE READ)                        │
│                                                                 │
│ Use case: Most OLTP applications                                │
│                                                                 │
│ How it works:                                                   │
│ Each SELECT sees a fresh snapshot of committed data.            │
│ Different statements may see different committed versions.      │
└─────────────────────────────────────────────────────────────────┘

REPEATABLE READ (MySQL Default)

┌─────────────────────────────────────────────────────────────────┐
│ REPEATABLE READ - "Freeze my view at start"                     │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│ Sees: Snapshot from transaction start                           │
│                                                                 │
│ Anomalies possible:                                             │
│ ✗ Dirty reads (prevented)                                       │
│ ✗ Non-repeatable reads (prevented)                              │
│ ✓ Phantom reads (in standard SQL)                              │
│ ✗ Phantom reads (PostgreSQL prevents these too!)                │
│                                                                 │
│ PostgreSQL default: NO                                          │
│ MySQL default: YES                                              │
│                                                                 │
│ Use case: Reports, consistent reads within transaction          │
│                                                                 │
│ How it works:                                                   │
│ Transaction sees snapshot from first query.                     │
│ All subsequent queries see same snapshot.                       │
│                                                                 │
│ Caveat: Write conflicts cause serialization failures.           │
└─────────────────────────────────────────────────────────────────┘

SERIALIZABLE

┌─────────────────────────────────────────────────────────────────┐
│ SERIALIZABLE - "As if transactions ran one at a time"           │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│ Guarantees: Result is equivalent to some serial execution       │
│                                                                 │
│ Anomalies possible:                                             │
│ ✗ Dirty reads (prevented)                                       │
│ ✗ Non-repeatable reads (prevented)                              │
│ ✗ Phantom reads (prevented)                                     │
│ ✗ Write skew (prevented) ← Extra protection!                   │
│                                                                 │
│ Use case: Financial transactions, inventory, critical data       │
│                                                                 │
│ How it works (PostgreSQL - SSI):                                │
│ Serializable Snapshot Isolation                                 │
│ • Tracks read/write dependencies                                │
│ • Detects dangerous patterns                                    │
│ • Aborts one transaction on conflict                            │
│                                                                 │
│ CRITICAL: Must handle serialization failures with retry!        │
│                                                                 │
│ Performance: 20-50% throughput reduction vs READ COMMITTED      │
└─────────────────────────────────────────────────────────────────┘

Write Skew: The Hidden Anomaly

The Doctor On-Call Problem:

Rule: At least one doctor must be on call at all times.

Initial state:
  Alice: on_call = true
  Bob:   on_call = true

Transaction A (Alice)              Transaction B (Bob)
═══════════════════════           ═══════════════════════

BEGIN                              BEGIN

-- Check: Is someone else on call?
SELECT COUNT(*) FROM doctors       SELECT COUNT(*) FROM doctors
  WHERE on_call = true               WHERE on_call = true
  AND name != 'Alice'                AND name != 'Bob'
→ Returns 1 (Bob)                  → Returns 1 (Alice)

-- Safe to go off call!            -- Safe to go off call!
UPDATE doctors                     UPDATE doctors
  SET on_call = false                SET on_call = false
  WHERE name = 'Alice'               WHERE name = 'Bob'

COMMIT                             COMMIT

Result: BOTH doctors are off call!
        Constraint violated!
        
This is WRITE SKEW - prevented only by SERIALIZABLE

Other write skew examples:

  • Meeting room double-booking (both check room is free, both book it)
  • Balance transfer between accounts (both see old total)
  • Unique constraint across multiple rows (both check username available, both insert)

Common Mistakes (I've Made These)

Mistake #1: "SERIALIZABLE is always better"

Why it's wrong: SERIALIZABLE has 20-50% throughput reduction and requires retry logic. For most applications, READ COMMITTED is sufficient.

Right approach: Use SERIALIZABLE only when perfect correctness is required (financial transactions, inventory). For most apps, READ COMMITTED is fine.

Mistake #2: "Not handling serialization failures"

Why it's wrong: SERIALIZABLE transactions can fail with serialization errors. If you don't retry, you get errors and blame the database.

Right approach: Always retry on serialization failure. This is the standard pattern for SERIALIZABLE transactions.

Mistake #3: "Isolation levels don't matter, my app is simple"

Why it's wrong: Even simple apps can have race conditions. I've seen inventory overselling, duplicate bookings, and data corruption from wrong isolation levels.

Right approach: Understand what your database's default isolation level allows, and use higher isolation when correctness matters.


Code Examples

Setting Isolation Levels

import psycopg2
from psycopg2.extensions import (
    ISOLATION_LEVEL_READ_COMMITTED,
    ISOLATION_LEVEL_REPEATABLE_READ,
    ISOLATION_LEVEL_SERIALIZABLE
)

# Method 1: Connection level
conn = psycopg2.connect(dsn)
conn.set_isolation_level(ISOLATION_LEVEL_SERIALIZABLE)

# Method 2: Transaction level (PostgreSQL)
cursor.execute("BEGIN ISOLATION LEVEL SERIALIZABLE")

Handling Serialization Failures (The Correct Pattern)

import psycopg2
from psycopg2 import errors
import random
import time

def run_serializable_transaction(operation):
    """
    Run operation with SERIALIZABLE isolation, retry on conflict.
    
    This is THE correct pattern for SERIALIZABLE transactions.
    """
    max_retries = 5
    
    for attempt in range(max_retries):
        conn = psycopg2.connect(dsn)
        conn.set_isolation_level(ISOLATION_LEVEL_SERIALIZABLE)
        
        try:
            result = operation(conn)
            conn.commit()
            return result
            
        except errors.SerializationFailure:
            conn.rollback()
            if attempt == max_retries - 1:
                raise
            
            # Exponential backoff with jitter
            delay = (0.1 * (2 ** attempt)) + random.uniform(0, 0.1)
            time.sleep(delay)
            
        except Exception:
            conn.rollback()
            raise
            
        finally:
            conn.close()

Database Comparison

Database Default Level Notes
PostgreSQL READ COMMITTED REPEATABLE READ prevents phantoms too
MySQL InnoDB REPEATABLE READ Phantoms possible via locking reads
SQL Server READ COMMITTED Has snapshot isolation option
Oracle READ COMMITTED Serializable via snapshot
SQLite SERIALIZABLE Single-writer simplifies things

Key insight: PostgreSQL and MySQL have different defaults! Same code, different behavior. Always explicitly set isolation level for critical transactions.


Decision Framework

□ Are dirty reads acceptable?
  → Never in production: Use READ COMMITTED minimum

□ Does the transaction re-read the same data?
  → Yes: Consider REPEATABLE READ
  → No: READ COMMITTED is fine

□ Does correctness require consistent view of related data?
  → Yes (e.g., sum of balances): REPEATABLE READ or higher
  → No: READ COMMITTED is fine

□ Could write skew cause problems?
  → Yes: Must use SERIALIZABLE
  → No: REPEATABLE READ is sufficient

□ Can you handle serialization failures?
  → Yes: SERIALIZABLE is safe choice
  → No: Use explicit locking (SELECT FOR UPDATE)

□ What's your conflict rate?
  → Low (<1%): SERIALIZABLE works well
  → High (>10%): Consider READ COMMITTED + application logic

Key Takeaways

  1. READ COMMITTED is fine for most applications (PostgreSQL default)
  2. REPEATABLE READ for reports/consistent multi-query transactions
  3. SERIALIZABLE for financial/inventory - but handle retries! (20-50% overhead)
  4. Write skew is only prevented by SERIALIZABLE
  5. Know your database's default - PostgreSQL (READ COMMITTED) ≠ MySQL (REPEATABLE READ)
  6. Retry is mandatory for SERIALIZABLE transactions
  7. Isolation bugs are subtle - they don't happen in testing, only in production under load

What's Next

Now that you understand isolation levels, the next question is: When should you choose SQL vs NoSQL for your data?

In the next article, SQL vs NoSQL Decision Guide - Flexibility vs Scale, you'll learn:

  • When SQL is the right choice (and when it's not)
  • When NoSQL makes sense (and when it doesn't)
  • The trade-offs between consistency and scale
  • How most production systems use multiple databases

This builds on what you learned here—SQL databases provide strong ACID guarantees, but NoSQL databases offer different trade-offs.

Continue to Article 11: SQL vs NoSQL


This article is part of the Backend Engineering Mastery series. Understanding isolation levels prevents subtle data corruption bugs.