#10. Isolation Levels & Anomalies - Safety vs Concurrency
The One Thing to Remember
Isolation levels are a knob between correctness and performance. Higher isolation = fewer anomalies but more conflicts. Most apps use READ COMMITTED; switch to SERIALIZABLE only when correctness is critical and you can handle retries.
Building on Article 9
In Article 9: ACID Transactions, you learned what ACID means. But here's the question: What are the different isolation levels, and what anomalies does each prevent?
Understanding isolation levels prevents "heisenbugs"—bugs that don't happen in testing but appear in production under load.
← Previous: Article 9 - ACID Transactions
Why This Matters (A Subtle Bug Story)
I once debugged a production issue where a financial report showed different totals when run twice. The mystery: The report ran multiple queries in one transaction, but data changed between queries. The fix? Use REPEATABLE READ isolation. The report now sees a consistent snapshot.
This isn't academic knowledge—it's the difference between:
- Debugging data corruption in hours vs days
  - Understanding isolation = you know which anomalies to look for
  - Not understanding = you blame the application, the network, everything
- Choosing the right isolation level
  - Understanding trade-offs = you pick the right level
  - Not understanding = you use SERIALIZABLE everywhere and performance suffers
- Handling serialization failures correctly
  - Understanding retries = you handle conflicts properly
  - Not understanding = you get errors and blame the database
The Three Anomalies
1. Dirty Read
Transaction A                         Transaction B
═══════════════                       ═══════════════
BEGIN
UPDATE balance = 0
  WHERE id = 1
-- Not committed yet!
                                      SELECT balance WHERE id = 1
                                      → Returns 0  ← DIRTY READ!
ROLLBACK
-- Change was undone!
-- Transaction B made a decision based on
-- data that NEVER EXISTED in the DB!
PREVENTED BY: READ COMMITTED and above
Real example: A fraud check sees a zero balance during a failed transfer, flags the account incorrectly.
2. Non-Repeatable Read
Transaction A                         Transaction B
═══════════════                       ═══════════════
BEGIN
SELECT balance WHERE id = 1
→ Returns $100
                                      BEGIN
                                      UPDATE balance = $50
                                        WHERE id = 1
                                      COMMIT
SELECT balance WHERE id = 1
→ Returns $50  ← DIFFERENT!
-- Same query, same transaction,
-- different results!
COMMIT
PREVENTED BY: REPEATABLE READ and above
Real example: A report calculates totals twice in the same transaction, gets different numbers.
3. Phantom Read
Transaction A                         Transaction B
═══════════════                       ═══════════════
BEGIN
SELECT COUNT(*) FROM orders
  WHERE status = 'pending'
→ Returns 5
                                      BEGIN
                                      INSERT INTO orders
                                        (status) VALUES ('pending')
                                      COMMIT
SELECT COUNT(*) FROM orders
  WHERE status = 'pending'
→ Returns 6  ← NEW ROW APPEARED!
-- New rows "appeared" in Transaction A's view
COMMIT
PREVENTED BY: SERIALIZABLE (or PostgreSQL's REPEATABLE READ)
Real example: Pagination returns duplicate or missing items when data changes between pages.
Isolation Levels in Detail
READ COMMITTED (PostgreSQL Default)
┌─────────────────────────────────────────────────────────────────┐
│ READ COMMITTED - "Only trust committed data" │
├─────────────────────────────────────────────────────────────────┤
│ │
│ Sees: Only committed changes (at statement start) │
│ │
│ Anomalies possible: │
│ ✗ Dirty reads (prevented) │
│ ✓ Non-repeatable reads │
│ ✓ Phantom reads │
│ │
│ PostgreSQL default: YES │
│ MySQL default: NO (uses REPEATABLE READ) │
│ │
│ Use case: Most OLTP applications │
│ │
│ How it works: │
│ Each SELECT sees a fresh snapshot of committed data. │
│ Different statements may see different committed versions. │
└─────────────────────────────────────────────────────────────────┘
REPEATABLE READ (MySQL Default)
┌─────────────────────────────────────────────────────────────────┐
│ REPEATABLE READ - "Freeze my view at start" │
├─────────────────────────────────────────────────────────────────┤
│ │
│ Sees: Snapshot from transaction start │
│ │
│ Anomalies possible: │
│ ✗ Dirty reads (prevented) │
│ ✗ Non-repeatable reads (prevented) │
│ ✓ Phantom reads (in standard SQL) │
│ ✗ Phantom reads (PostgreSQL prevents these too!) │
│ │
│ PostgreSQL default: NO │
│ MySQL default: YES │
│ │
│ Use case: Reports, consistent reads within transaction │
│ │
│ How it works: │
│ Transaction sees snapshot from first query. │
│ All subsequent queries see same snapshot. │
│ │
│ Caveat: Write conflicts cause serialization failures. │
└─────────────────────────────────────────────────────────────────┘
SERIALIZABLE
┌─────────────────────────────────────────────────────────────────┐
│ SERIALIZABLE - "As if transactions ran one at a time" │
├─────────────────────────────────────────────────────────────────┤
│ │
│ Guarantees: Result is equivalent to some serial execution │
│ │
│ Anomalies possible: │
│ ✗ Dirty reads (prevented) │
│ ✗ Non-repeatable reads (prevented) │
│ ✗ Phantom reads (prevented) │
│ ✗ Write skew (prevented) ← Extra protection! │
│ │
│ Use case: Financial transactions, inventory, critical data │
│ │
│ How it works (PostgreSQL - SSI): │
│ Serializable Snapshot Isolation │
│ • Tracks read/write dependencies │
│ • Detects dangerous patterns │
│ • Aborts one transaction on conflict │
│ │
│ CRITICAL: Must handle serialization failures with retry! │
│ │
│ Performance: 20-50% throughput reduction vs READ COMMITTED │
└─────────────────────────────────────────────────────────────────┘
Write Skew: The Hidden Anomaly
The Doctor On-Call Problem:
Rule: At least one doctor must be on call at all times.
Initial state:
Alice: on_call = true
Bob: on_call = true
Transaction A (Alice)                 Transaction B (Bob)
═══════════════════════               ═══════════════════════
BEGIN                                 BEGIN
-- Check: someone else on call?       -- Check: someone else on call?
SELECT COUNT(*) FROM doctors          SELECT COUNT(*) FROM doctors
  WHERE on_call = true                  WHERE on_call = true
  AND name != 'Alice'                   AND name != 'Bob'
→ Returns 1 (Bob)                     → Returns 1 (Alice)
-- Safe to go off call!               -- Safe to go off call!
UPDATE doctors                        UPDATE doctors
  SET on_call = false                   SET on_call = false
  WHERE name = 'Alice'                  WHERE name = 'Bob'
COMMIT                                COMMIT
Result: BOTH doctors are off call!
Constraint violated!
This is WRITE SKEW - prevented only by SERIALIZABLE
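To make the failure mode concrete, here is a small pure-Python simulation (not real database code) of two snapshot-isolated transactions. Each checks the same frozen snapshot, their write sets touch different rows, so neither triggers a write-write conflict and both commit:

```python
# Committed state: both doctors on call.
doctors = {"Alice": True, "Bob": True}

def go_off_call(name, snapshot):
    """Return this transaction's write set, based only on its snapshot."""
    others_on_call = sum(1 for doc, on in snapshot.items() if doc != name and on)
    if others_on_call >= 1:      # "someone else is covering"
        return {name: False}
    return {}                    # refuse: we'd leave nobody on call

# Both transactions start from the same committed snapshot...
snapshot = dict(doctors)
writes_alice = go_off_call("Alice", snapshot)
writes_bob = go_off_call("Bob", snapshot)

# ...and both commit: the write sets don't overlap, so nothing short of
# SERIALIZABLE's dependency tracking would abort either one.
doctors.update(writes_alice)
doctors.update(writes_bob)

print(doctors)              # {'Alice': False, 'Bob': False}
print(any(doctors.values()))  # False -- the invariant is violated
```

Neither transaction did anything locally wrong; the anomaly only exists in the interleaving, which is why row-level conflict detection cannot catch it.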
Other write skew examples:
- Meeting room double-booking (both check room is free, both book it)
- Balance transfer between accounts (both see old total)
- Unique constraint across multiple rows (both check username available, both insert)
Common Mistakes (I've Made These)
Mistake #1: "SERIALIZABLE is always better"
Why it's wrong: SERIALIZABLE has 20-50% throughput reduction and requires retry logic. For most applications, READ COMMITTED is sufficient.
Right approach: Use SERIALIZABLE only when perfect correctness is required (financial transactions, inventory). For most apps, READ COMMITTED is fine.
Mistake #2: "Not handling serialization failures"
Why it's wrong: SERIALIZABLE transactions can fail with serialization errors. If you don't retry, you get errors and blame the database.
Right approach: Always retry on serialization failure. This is the standard pattern for SERIALIZABLE transactions.
Mistake #3: "Isolation levels don't matter, my app is simple"
Why it's wrong: Even simple apps can have race conditions. I've seen inventory overselling, duplicate bookings, and data corruption from wrong isolation levels.
Right approach: Understand what your database's default isolation level allows, and use higher isolation when correctness matters.
Code Examples
Setting Isolation Levels
import psycopg2
from psycopg2.extensions import (
    ISOLATION_LEVEL_READ_COMMITTED,
    ISOLATION_LEVEL_REPEATABLE_READ,
    ISOLATION_LEVEL_SERIALIZABLE,
)

# Method 1: Connection level
conn = psycopg2.connect(dsn)
conn.set_isolation_level(ISOLATION_LEVEL_SERIALIZABLE)

# Method 2: Transaction level (PostgreSQL)
cursor = conn.cursor()
cursor.execute("BEGIN ISOLATION LEVEL SERIALIZABLE")
Handling Serialization Failures (The Correct Pattern)
import psycopg2
from psycopg2 import errors
from psycopg2.extensions import ISOLATION_LEVEL_SERIALIZABLE
import random
import time

def run_serializable_transaction(operation):
    """
    Run operation with SERIALIZABLE isolation, retry on conflict.
    This is THE correct pattern for SERIALIZABLE transactions.
    """
    max_retries = 5
    for attempt in range(max_retries):
        conn = psycopg2.connect(dsn)
        conn.set_isolation_level(ISOLATION_LEVEL_SERIALIZABLE)
        try:
            result = operation(conn)
            conn.commit()
            return result
        except errors.SerializationFailure:
            conn.rollback()
            if attempt == max_retries - 1:
                raise
            # Exponential backoff with jitter
            delay = (0.1 * (2 ** attempt)) + random.uniform(0, 0.1)
            time.sleep(delay)
        except Exception:
            conn.rollback()
            raise
        finally:
            conn.close()
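The retry-with-backoff logic can be exercised without a database. The sketch below swaps in a hypothetical SerializationFailure stand-in and a flaky operation that succeeds on its third attempt; it demonstrates the pattern, not the psycopg2 API:

```python
import random
import time

class SerializationFailure(Exception):
    """Stand-in for psycopg2.errors.SerializationFailure."""

def with_retries(operation, max_retries=5):
    """Retry operation on serialization conflicts, with exponential backoff."""
    for attempt in range(max_retries):
        try:
            return operation()
        except SerializationFailure:
            if attempt == max_retries - 1:
                raise
            # Tiny delays so the demo runs fast; scale these up in production.
            time.sleep((0.01 * (2 ** attempt)) + random.uniform(0, 0.01))

calls = {"count": 0}

def flaky_commit():
    calls["count"] += 1
    if calls["count"] < 3:
        raise SerializationFailure("could not serialize access")
    return "committed"

result = with_retries(flaky_commit)
print(result, calls["count"])  # committed 3 -- succeeded after two simulated conflicts
```

The key design point is that the operation must be safe to re-run from scratch: everything it did before the conflict was rolled back, so the retry starts from a clean slate.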
Database Comparison
| Database | Default Level | Notes |
|---|---|---|
| PostgreSQL | READ COMMITTED | REPEATABLE READ prevents phantoms too |
| MySQL InnoDB | REPEATABLE READ | Phantoms possible via locking reads |
| SQL Server | READ COMMITTED | Has snapshot isolation option |
| Oracle | READ COMMITTED | Serializable via snapshot |
| SQLite | SERIALIZABLE | Single-writer simplifies things |
Key insight: PostgreSQL and MySQL have different defaults! The same code behaves differently on each. Always set the isolation level explicitly for critical transactions.
Decision Framework
□ Are dirty reads acceptable?
→ Never in production: Use READ COMMITTED minimum
□ Does the transaction re-read the same data?
→ Yes: Consider REPEATABLE READ
→ No: READ COMMITTED is fine
□ Does correctness require consistent view of related data?
→ Yes (e.g., sum of balances): REPEATABLE READ or higher
→ No: READ COMMITTED is fine
□ Could write skew cause problems?
→ Yes: Must use SERIALIZABLE
→ No: REPEATABLE READ is sufficient
□ Can you handle serialization failures?
→ Yes: SERIALIZABLE is safe choice
→ No: Use explicit locking (SELECT FOR UPDATE)
□ What's your conflict rate?
→ Low (<1%): SERIALIZABLE works well
→ High (>10%): Consider READ COMMITTED + application logic
Key Takeaways
- READ COMMITTED is fine for most applications (PostgreSQL default)
- REPEATABLE READ for reports/consistent multi-query transactions
- SERIALIZABLE for financial/inventory - but handle retries! (20-50% overhead)
- Write skew is only prevented by SERIALIZABLE
- Know your database's default - PostgreSQL (READ COMMITTED) ≠ MySQL (REPEATABLE READ)
- Retry is mandatory for SERIALIZABLE transactions
- Isolation bugs are subtle - they don't happen in testing, only in production under load
What's Next
Now that you understand isolation levels, the next question is: When should you choose SQL vs NoSQL for your data?
In the next article, SQL vs NoSQL Decision Guide - Flexibility vs Scale, you'll learn:
- When SQL is the right choice (and when it's not)
- When NoSQL makes sense (and when it doesn't)
- The trade-offs between consistency and scale
- How most production systems use multiple databases
This builds on what you learned here—SQL databases provide strong ACID guarantees, but NoSQL databases offer different trade-offs.
→ Continue to Article 11: SQL vs NoSQL
This article is part of the Backend Engineering Mastery series. Understanding isolation levels prevents subtle data corruption bugs.