Database Interview Prep
Transactions and Concurrency

SERIALIZABLE and Isolation Level Comparison

SSI, Retry Loops, and the Anomaly Matrix

LinkedIn Hook

"Your REPEATABLE READ transaction quietly let two users withdraw the last $100 from a shared account. Both saw the same balance. Both passed the check. Both committed. The bank is now $100 short."

Most engineers stop learning isolation levels at READ COMMITTED. A few brave ones graduate to REPEATABLE READ and feel safe because phantom reads are blocked. Almost nobody actually understands SERIALIZABLE — what it costs, what it guards against, and why PostgreSQL implements it with something called Serializable Snapshot Isolation (SSI) instead of the textbook two-phase locking approach.

Here is the uncomfortable truth: REPEATABLE READ in Postgres prevents phantom reads on the same query, but it does not prevent write skew — the anomaly where two concurrent transactions each read a consistent snapshot, each make a decision based on that snapshot, and each commit a write that violates an invariant the snapshot promised. The bank balance scenario above is textbook write skew. Snapshot isolation cannot stop it. SERIALIZABLE can.

PostgreSQL's SERIALIZABLE is special because it does not lock readers. It uses SSI to track read-write dependencies between concurrent transactions and aborts the offender at commit time with a serialization_failure error (SQLSTATE 40001). Your job as the application developer is to wrap every SERIALIZABLE transaction in a retry loop that catches 40001 and tries again — a handful of attempts is almost always enough. The performance is excellent for read-heavy workloads, the correctness guarantee is the strongest SQL offers, and the only real cost is that you have to write the retry loop.

In Lesson 11.6, I close the chapter on transactions with a complete tour of SERIALIZABLE: the SSI algorithm in plain English, the serialization_failure retry pattern, the difference between Postgres SSI and MySQL InnoDB's lock-based SERIALIZABLE, and a full anomaly matrix showing exactly which read phenomena each isolation level prevents.

Read the full lesson -> [link]

#PostgreSQL #Database #SQL #Transactions #ACID #BackendDevelopment #InterviewPrep #ConcurrencyControl


SERIALIZABLE and Isolation Level Comparison thumbnail


What You'll Learn

  • What SERIALIZABLE actually guarantees, and why it is the only level that prevents write skew
  • How PostgreSQL implements SERIALIZABLE with Serializable Snapshot Isolation (SSI) instead of two-phase locking
  • Why SSI never blocks readers and how it detects dangerous read-write dependency cycles
  • How to recognize and handle serialization_failure (SQLSTATE 40001) with a retry loop
  • The difference between PostgreSQL SSI and MySQL InnoDB's lock-based SERIALIZABLE (gap locks, next-key locks)
  • The complete anomaly matrix: which read phenomena each of the four isolation levels prevents
  • When SERIALIZABLE is the right default and when READ COMMITTED is good enough
  • How to choose an isolation level based on the actual invariants your business logic depends on

The Hospital On-Call Roster Analogy — Why Snapshots Are Not Enough

Picture a hospital ward with a strict rule: at least one doctor must be on call at all times. Right now there are exactly two doctors on call — Alice and Bob. Both Alice and Bob feel sick and want to go home. Both open the staff portal at the same instant.

Alice's screen says: "There are 2 doctors on call. You may take yourself off." Alice clicks "Remove me."

At the same instant, on a different terminal, Bob's screen also says: "There are 2 doctors on call. You may take yourself off." Bob also clicks "Remove me."

Each of them read a perfectly consistent snapshot. Each of them made a perfectly valid decision based on that snapshot. Each of them committed a write that, in isolation, did not break the rule. But the combined effect of both writes leaves the ward with zero doctors on call, violating the invariant the system was supposed to protect.

This is write skew, and it is the anomaly that snapshot isolation (which Postgres calls REPEATABLE READ) cannot prevent. Each transaction sees a snapshot frozen at its start time. Neither sees the other's write. Both check "count >= 2" and both pass. Both commit. The invariant is broken, and no error was raised.

+----------------------------------------------------------------+
|           WRITE SKEW UNDER SNAPSHOT ISOLATION                  |
+----------------------------------------------------------------+
|                                                                |
|  Initial state: oncall = { Alice, Bob }                        |
|                                                                |
|  T1 (Alice)                       T2 (Bob)                     |
|  -----------                      ---------                    |
|  BEGIN ISOLATION LEVEL            BEGIN ISOLATION LEVEL        |
|    REPEATABLE READ;                 REPEATABLE READ;           |
|  SELECT count(*) FROM oncall;     SELECT count(*) FROM oncall; |
|         -> 2  (snapshot)                  -> 2  (snapshot)     |
|  IF count >= 2:                   IF count >= 2:               |
|    UPDATE oncall                    UPDATE oncall              |
|      SET on_call=false                SET on_call=false        |
|     WHERE name='Alice';              WHERE name='Bob';         |
|  COMMIT;  -> OK                   COMMIT;  -> OK               |
|                                                                |
|  Final state: oncall = { } -> INVARIANT VIOLATED               |
|                                                                |
+----------------------------------------------------------------+

The only isolation level that prevents this is SERIALIZABLE. Under SERIALIZABLE, Postgres detects that T1 read rows that T2 wrote, and T2 read rows that T1 wrote — a read-write cycle — and aborts whichever one tries to commit second with a serialization_failure. The application then retries the aborted transaction. On the retry, the new snapshot shows only 1 doctor on call, the IF check fails, and the second doctor is correctly told "you cannot take yourself off."

That is the entire story of SERIALIZABLE in one analogy: snapshots are not enough when two transactions read overlapping data and write into each other's reads. SERIALIZABLE is the level that catches those dangerous patterns and forces one of the offenders to start over.


What SERIALIZABLE Guarantees — The Strongest Promise SQL Makes

The SQL standard defines SERIALIZABLE as: the result of executing a set of concurrent transactions must be equivalent to some serial order of those transactions. In other words, even though the database executes them in parallel, the final state must look as if they had been run one at a time, in some order, with no overlap.

This is the strongest correctness guarantee a relational database offers. It eliminates every read anomaly:

  • Dirty reads — you cannot see uncommitted data
  • Non-repeatable reads — the same row read twice returns the same value
  • Phantom reads — the same range query returns the same set of rows
  • Write skew — concurrent transactions cannot commit decisions based on data the other modified
  • Lost updates — two concurrent updates to the same logical state cannot both win

The cost: at least one transaction in any conflicting group will fail at commit time. Your application must be prepared to retry. No isolation level gives you both stronger-than-snapshot correctness and zero retries — that trade-off is fundamental to concurrency control.

+----------------------------------------------------------------+
|           THE SERIALIZABLE GUARANTEE                           |
+----------------------------------------------------------------+
|                                                                |
|  PROMISE:                                                      |
|    The committed outcome of concurrent transactions is         |
|    equivalent to SOME serial execution order.                  |
|                                                                |
|  COST:                                                         |
|    Some transactions will be aborted with                      |
|    serialization_failure. The application must retry them.     |
|                                                                |
|  PROTECTS AGAINST:                                             |
|    - Dirty reads          (also blocked by READ COMMITTED)     |
|    - Non-repeatable reads (also blocked by REPEATABLE READ)    |
|    - Phantom reads        (also blocked by RR in Postgres)     |
|    - Write skew           (ONLY blocked by SERIALIZABLE)       |
|    - Lost updates         (ONLY fully blocked by SERIALIZABLE) |
|                                                                |
|  DOES NOT PROTECT AGAINST:                                     |
|    - Bugs in your application logic                            |
|    - Network failures between client and DB                    |
|    - Anything that lives outside the transaction               |
|                                                                |
+----------------------------------------------------------------+

How PostgreSQL Implements SERIALIZABLE — Serializable Snapshot Isolation (SSI)

Most databases implement SERIALIZABLE with strict two-phase locking (S2PL): every read takes a shared lock, every write takes an exclusive lock, and locks are held until commit. This is correct but brutal — readers block writers and writers block readers, throughput collapses under contention, and deadlocks become common.

PostgreSQL took a different path. Since version 9.1, Postgres SERIALIZABLE uses Serializable Snapshot Isolation (SSI), a clever algorithm that combines the best of snapshot isolation (no read locks, no blocking) with serializable correctness (no write skew). The key insight: instead of preventing dangerous interleavings with locks, detect them and abort one of the transactions at commit time.

Here is the algorithm in plain English:

  1. Every SERIALIZABLE transaction starts with a snapshot, just like REPEATABLE READ.
  2. As the transaction reads rows, Postgres tracks predicate locks — lightweight markers that record "this transaction read this range of data."
  3. As other transactions write rows, Postgres checks whether any concurrent SERIALIZABLE reader had a predicate lock covering that write. If yes, a read-write dependency is recorded.
  4. At commit time, Postgres checks whether the dependency graph contains a dangerous structure — specifically, a cycle of read-write dependencies between concurrent transactions where one transaction is a "pivot" between two others.
  5. If a dangerous structure exists, Postgres aborts one of the transactions with ERROR: could not serialize access due to read/write dependencies among transactions and SQLSTATE 40001.

+----------------------------------------------------------------+
|           SSI vs S2PL — TWO PATHS TO SERIALIZABLE              |
+----------------------------------------------------------------+
|                                                                |
|  STRICT TWO-PHASE LOCKING (MySQL InnoDB):                      |
|   Read    -> acquire shared lock                               |
|   Write   -> acquire exclusive lock                            |
|   Commit  -> release all locks                                 |
|   Result  -> readers block writers, deadlocks possible         |
|                                                                |
|  SSI (PostgreSQL):                                             |
|   Read    -> take snapshot, record predicate lock              |
|   Write   -> proceed unblocked, mark RW dependency             |
|   Commit  -> check for dangerous structure -> abort or commit  |
|   Result  -> never blocks readers, occasional retries          |
|                                                                |
|  TRADE-OFF:                                                    |
|   S2PL  -> predictable latency, low throughput                 |
|   SSI   -> high throughput, occasional retry storms            |
|                                                                |
+----------------------------------------------------------------+

The headline benefit of SSI: readers never block and writers never block readers. A read-only SERIALIZABLE transaction in Postgres has effectively the same performance as a REPEATABLE READ snapshot read, with the added guarantee that nothing it reads will be invalidated by a concurrent committed write. The cost is paid only at commit time, only on the small fraction of transactions whose read-write dependencies form a dangerous cycle.
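The dangerous-structure check is easier to internalize with a toy model. The sketch below is purely illustrative — every name is invented, and real SSI tracks predicate (SIREAD) locks inside the engine — but it captures the shape of the simplest two-transaction read-write cycle:

```javascript
// Toy model of SSI conflict detection (illustration only; all names
// here are invented — Postgres does this bookkeeping internally).
// A transaction is summarized by what it read (a predicate over rows)
// and which rows it wrote.

function rwDependency(reader, writer) {
  // reader -rw-> writer edge: the writer modified a row the reader's
  // predicate covered, so the reader's snapshot is now stale.
  return writer.writes.some((row) => reader.predicate(row));
}

function hasDangerousCycle(t1, t2) {
  // Simplest dangerous structure: rw edges in both directions.
  return rwDependency(t1, t2) && rwDependency(t2, t1);
}

// The on-call scenario: both transactions read "WHERE on_call = true"
// (a predicate covering both doctors), and each wrote its own row.
const t1 = {
  predicate: (row) => row.on_call,
  writes: [{ name: 'Alice', on_call: true }],
};
const t2 = {
  predicate: (row) => row.on_call,
  writes: [{ name: 'Bob', on_call: true }],
};

console.log(hasDangerousCycle(t1, t2)); // true -> one transaction must abort
```

In the real algorithm the check runs over a full dependency graph at commit time; the two-transaction case shown here is exactly the on-call doctors scenario.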


Example 1 — Reproducing Write Skew and Catching It with SERIALIZABLE

Let's reproduce the on-call doctor scenario in PostgreSQL. We'll first show that REPEATABLE READ allows the bug, then show that SERIALIZABLE catches it.

-- Setup: two doctors on call
CREATE TABLE oncall (
    name TEXT PRIMARY KEY,
    on_call BOOLEAN NOT NULL
);

INSERT INTO oncall VALUES ('Alice', true), ('Bob', true);

-- Helper: count current doctors on call
SELECT name, on_call FROM oncall;

Sample output:

 name  | on_call
-------+---------
 Alice | t
 Bob   | t
(2 rows)

Now open two psql sessions and run them in lockstep. The left column is Session A (Alice), the right column is Session B (Bob).

-- Session A (Alice)                  -- Session B (Bob)
BEGIN ISOLATION LEVEL REPEATABLE READ; BEGIN ISOLATION LEVEL REPEATABLE READ;

-- A reads the snapshot               -- B reads the same snapshot
SELECT count(*) FROM oncall            SELECT count(*) FROM oncall
 WHERE on_call = true;                  WHERE on_call = true;
-- count = 2                            -- count = 2

-- A decides: 2 >= 2, safe to leave    -- B decides: 2 >= 2, safe to leave
UPDATE oncall SET on_call = false      UPDATE oncall SET on_call = false
 WHERE name = 'Alice';                  WHERE name = 'Bob';

COMMIT;  -- OK                          COMMIT;  -- OK

-- Final state: zero doctors on call. Invariant violated.

Now run the same scenario under SERIALIZABLE. The only change is the isolation level on BEGIN:

-- Reset state
UPDATE oncall SET on_call = true;

-- Session A (Alice)                   -- Session B (Bob)
BEGIN ISOLATION LEVEL SERIALIZABLE;    BEGIN ISOLATION LEVEL SERIALIZABLE;

SELECT count(*) FROM oncall             SELECT count(*) FROM oncall
 WHERE on_call = true;                   WHERE on_call = true;
-- count = 2                             -- count = 2

UPDATE oncall SET on_call = false       UPDATE oncall SET on_call = false
 WHERE name = 'Alice';                   WHERE name = 'Bob';

COMMIT;  -- OK                           COMMIT;
-- ERROR:  could not serialize access
--   due to read/write dependencies
--   among transactions
-- DETAIL:  Reason code: Canceled on
--   identification as a pivot.
-- HINT:  The transaction might
--   succeed if retried.
-- SQLSTATE: 40001

What just happened: SSI noticed that Session B read rows that Session A modified (the count(*) predicate covered Alice's row), and Session A read rows that Session B modified (the predicate covered Bob's row). That is a read-write cycle between concurrent transactions. Postgres aborted whichever one tried to commit second with 40001. On retry, Session B will see only one doctor on call, its count >= 2 check will fail, and it will refuse to remove Bob — exactly the correct behavior.


Example 2 — The serialization_failure Retry Loop (Node.js + pg)

Because SERIALIZABLE transactions can be aborted at commit time, every code path that opens a SERIALIZABLE transaction must be wrapped in a retry loop. This is non-negotiable. Without retries, your application will randomly fail under load with 40001 errors that the user can never reproduce.

The pattern: catch SQLSTATE 40001, wait a tiny randomized backoff, and try again. Cap the retries at 3-5 attempts so a runaway conflict does not loop forever.

// retry-serializable.js
// Wraps any function that performs SERIALIZABLE work in a retry loop
// that catches SQLSTATE 40001 (serialization_failure).
const { Pool } = require('pg');
const pool = new Pool();

const SERIALIZATION_FAILURE = '40001';
const DEADLOCK_DETECTED = '40P01';

async function withSerializableRetry(work, maxAttempts = 5) {
  let attempt = 0;
  while (true) {
    attempt++;
    const client = await pool.connect();
    try {
      // Open a SERIALIZABLE transaction. Every read in here will
      // participate in SSI predicate tracking.
      await client.query('BEGIN ISOLATION LEVEL SERIALIZABLE');
      const result = await work(client);
      await client.query('COMMIT');
      return result;
    } catch (err) {
      // Always roll back on any error to release the snapshot
      await client.query('ROLLBACK').catch(() => {});

      const isRetryable =
        err.code === SERIALIZATION_FAILURE || err.code === DEADLOCK_DETECTED;

      if (!isRetryable || attempt >= maxAttempts) {
        // Either the error is permanent or we ran out of retries
        throw err;
      }

      // Exponential backoff with jitter to avoid thundering herd
      const backoffMs = Math.min(50 * 2 ** (attempt - 1), 500);
      const jitter = Math.random() * backoffMs;
      await new Promise((r) => setTimeout(r, backoffMs + jitter));
      // Loop and try again
    } finally {
      client.release();
    }
  }
}

// Usage: take a doctor off call only if at least one other remains
async function takeOffCall(name) {
  return withSerializableRetry(async (client) => {
    const { rows } = await client.query(
      'SELECT count(*)::int AS n FROM oncall WHERE on_call = true'
    );
    if (rows[0].n < 2) {
      // Application-level rejection — not a serialization issue
      throw new Error('At least one doctor must remain on call');
    }
    await client.query(
      'UPDATE oncall SET on_call = false WHERE name = $1',
      [name]
    );
    return { removed: name };
  });
}

// Concurrent stress test — allSettled, so one rejection does not
// swallow the other call's result the way Promise.all would
Promise.allSettled([takeOffCall('Alice'), takeOffCall('Bob')])
  .then((results) => results.forEach((r) =>
    console.log(r.status === 'fulfilled' ? r.value : r.reason.message)
  ));

Sample output (one run of many):

{ removed: 'Alice' }
At least one doctor must remain on call

What the output shows: one of the two concurrent calls succeeded immediately. The other was aborted by SSI with 40001, the retry loop caught it, restarted the transaction with a fresh snapshot, and on the retry the count(*) returned 1 instead of 2 — so the application correctly rejected the second removal with a business-logic error. The invariant is preserved, no rows were lost, and the user gets a meaningful error message instead of a silent corruption.

Critical rule: the work function must be idempotent on retry. It must not perform side effects outside the transaction (sending emails, calling external APIs, writing to logs that would duplicate). If you need to do those things, do them after the transaction commits, never inside.
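A common way to honor this rule is an outbox: the transaction records the intent to perform the side effect, and a separate worker performs it after commit. A minimal sketch, assuming a hypothetical outbox table:

```sql
-- Hypothetical outbox table: the transaction records intent only.
CREATE TABLE outbox (
    id      BIGSERIAL PRIMARY KEY,
    event   TEXT    NOT NULL,
    payload JSONB   NOT NULL,
    sent    BOOLEAN NOT NULL DEFAULT false
);

BEGIN ISOLATION LEVEL SERIALIZABLE;
UPDATE oncall SET on_call = false WHERE name = 'Alice';
-- No email here. Record the intent; if SSI aborts and retries this
-- transaction, the INSERT is rolled back along with everything else.
INSERT INTO outbox (event, payload)
VALUES ('doctor_off_call', '{"name": "Alice"}');
COMMIT;
-- A separate worker polls WHERE sent = false, performs the side
-- effect once, then marks sent = true.
```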


PostgreSQL SSI vs MySQL InnoDB SERIALIZABLE — Two Different Animals

Both Postgres and MySQL support SERIALIZABLE, but the implementations are fundamentally different and the operational consequences are not interchangeable.

+----------------------------------------------------------------+
|           POSTGRES SSI vs MYSQL InnoDB SERIALIZABLE            |
+----------------------------------------------------------------+
|                                                                |
|                        PostgreSQL          MySQL InnoDB        |
|                        ----------          ------------        |
|  Algorithm             SSI                 Strict 2PL          |
|  Read locks            No (predicate       Yes (shared locks   |
|                        tracking only)      on every read)      |
|  Readers block writes  No                  Yes                 |
|  Writers block reads   No                  Yes                 |
|  Phantom prevention    Predicate locks     Gap + next-key locks|
|  Failure mode          40001 at COMMIT     Lock wait timeout   |
|                                            or deadlock (40001) |
|  Retry needed          Yes                 Yes                 |
|  Read-only TX cost     Near zero           Significant         |
|  Best for              Read-heavy OLTP     Lock-friendly OLTP  |
|                                                                |
+----------------------------------------------------------------+

MySQL InnoDB turns every plain SELECT inside a SERIALIZABLE transaction into a SELECT ... LOCK IN SHARE MODE, taking shared row locks (and gap locks for range queries). This blocks any concurrent writer until the reader commits. The result: low concurrency, predictable latency, and frequent lock-wait timeouts under load. InnoDB does not need a separate "serialization failure" detection step because the locks themselves prevent the dangerous interleavings.
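The effect is easy to observe in a MySQL session (a sketch, reusing the oncall table from the Postgres examples):

```sql
-- MySQL InnoDB: at SERIALIZABLE, a plain SELECT takes shared locks
SET SESSION TRANSACTION ISOLATION LEVEL SERIALIZABLE;
START TRANSACTION;
SELECT * FROM oncall WHERE on_call = true;
-- treated as SELECT ... LOCK IN SHARE MODE: any concurrent UPDATE
-- to these rows (or INSERT into the gaps a range scan covered)
-- now blocks until this transaction commits or rolls back
COMMIT;
```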

PostgreSQL SSI never takes read locks. Read-only transactions run at full snapshot speed. The cost is paid by the small fraction of transactions whose access patterns form dangerous cycles, and that cost is always a 40001 at commit time — never a wait, never a deadlock from reading. This makes Postgres SERIALIZABLE dramatically more scalable for read-heavy workloads, but it forces the application to handle retries.

If you are coming from MySQL and turning on SERIALIZABLE in Postgres for the first time: expect to write retry loops, expect occasional 40001 under contention, and expect performance to be better, not worse, on read-heavy workloads.


The Anomaly Matrix — What Each Isolation Level Actually Prevents

This is the single most important table to memorize for an interview, and the one most engineers get wrong because they confuse the SQL standard with what real engines actually do. Here is the truth for PostgreSQL specifically, since other engines vary.

+------------------------------------------------------------------------+
|         ISOLATION LEVEL vs ANOMALY MATRIX (PostgreSQL)                 |
+------------------------------------------------------------------------+
|                                                                        |
|                    | DIRTY | NON-REP | PHANTOM | WRITE | LOST          |
| ISOLATION LEVEL    | READ  | READ    | READ    | SKEW  | UPDATE        |
| -------------------|-------|---------|---------|-------|------------- |
| READ UNCOMMITTED * | YES   | YES     | YES     | YES   | YES           |
| READ COMMITTED     | NO    | YES     | YES     | YES   | YES           |
| REPEATABLE READ ** | NO    | NO      | NO      | YES   | NO (detected) |
| SERIALIZABLE       | NO    | NO      | NO      | NO    | NO            |
|                                                                        |
| YES = anomaly is POSSIBLE at this level                                |
| NO  = anomaly is PREVENTED at this level                               |
|                                                                        |
| *  Postgres treats READ UNCOMMITTED as READ COMMITTED — there is       |
|    no actual READ UNCOMMITTED behavior in Postgres.                    |
| ** Postgres REPEATABLE READ is actually Snapshot Isolation, which      |
|    is stronger than the SQL standard requires. It blocks phantom       |
|    reads (the standard allows them at this level) but still permits    |
|    write skew, an anomaly the standard's phenomena list never names.   |
|                                                                        |
+------------------------------------------------------------------------+

Key takeaways from the matrix:

  1. READ COMMITTED (the Postgres default) blocks dirty reads only. Everything else can happen. Most apps run here happily because their queries either don't have these anomalies in practice, or they use row locks (SELECT ... FOR UPDATE) at the right moments.

  2. REPEATABLE READ in Postgres is much stronger than the standard requires — it blocks non-repeatable reads, phantom reads, and even "lost update" by detecting concurrent updates to the same row and aborting one with 40001. But it still allows write skew, because two transactions can read disjoint rows and write into each other's reads without touching the same row.

  3. SERIALIZABLE is the only level that blocks write skew, and it does so via SSI rather than locks. The cost is retries on 40001.

  4. There is no isolation level above SERIALIZABLE. If your code is correct under SERIALIZABLE with a retry loop, it is correct full stop.
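Takeaway 2's lost-update detection is easy to see directly. Run this in two psql sessions against the oncall table from earlier:

```sql
-- Session A                             -- Session B
BEGIN ISOLATION LEVEL REPEATABLE READ;   BEGIN ISOLATION LEVEL REPEATABLE READ;

UPDATE oncall SET on_call = false
 WHERE name = 'Alice';
                                         UPDATE oncall SET on_call = false
                                          WHERE name = 'Alice';
                                         -- blocks, waiting on A's row lock
COMMIT;
                                         -- ERROR:  could not serialize access
                                         --   due to concurrent update
                                         -- SQLSTATE: 40001
```

Under READ COMMITTED the same sequence would let Session B's update silently proceed after A commits; REPEATABLE READ turns the lost update into a retryable error instead.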


Choosing an Isolation Level — A Decision Framework

Most production Postgres apps run on READ COMMITTED for a reason: it has the best throughput, the simplest semantics, and most queries do not have anomaly-sensitive logic. The right way to think about this is not "what is the highest level I can afford" but "what is the lowest level my invariants survive at."

+----------------------------------------------------------------+
|           ISOLATION LEVEL DECISION FLOWCHART                   |
+----------------------------------------------------------------+
|                                                                |
|  Q1: Does this transaction READ data and then WRITE based     |
|      on what it read?                                          |
|                                                                |
|       NO  -> READ COMMITTED is fine                            |
|       YES -> continue to Q2                                    |
|                                                                |
|  Q2: Could two concurrent runs of this logic, each working     |
|      on a stale snapshot, BOTH commit successfully and         |
|      together violate an invariant?                            |
|                                                                |
|       NO  -> READ COMMITTED + SELECT FOR UPDATE on the row     |
|              you depend on                                     |
|       YES -> continue to Q3                                    |
|                                                                |
|  Q3: Are the rows being read and written easy to identify     |
|      individually so you could lock them with SELECT FOR       |
|      UPDATE?                                                   |
|                                                                |
|       YES -> READ COMMITTED + explicit row locks               |
|       NO  -> SERIALIZABLE + retry loop  (the right answer      |
|              when the invariant spans an aggregate or set)     |
|                                                                |
+----------------------------------------------------------------+

The on-call doctor scenario is the textbook case for SERIALIZABLE because the invariant ("at least one on call") is an aggregate over many rows. There is no single row to lock with FOR UPDATE. You would have to lock the whole table, which kills concurrency. SERIALIZABLE with SSI gives you the same correctness at a fraction of the concurrency cost.

Conversely, "do not let two users buy the last copy of a book" is a textbook case for SELECT ... FOR UPDATE on the book row at READ COMMITTED. There is exactly one row that matters, and locking it is cheap and obvious.
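A sketch of that pattern, assuming a hypothetical books table with an id and a copies column:

```sql
BEGIN;  -- READ COMMITTED, the default
-- Lock the one row the invariant lives in; rival buyers block here
SELECT copies FROM books WHERE id = 42 FOR UPDATE;
-- application checks copies > 0, then:
UPDATE books SET copies = copies - 1 WHERE id = 42;
COMMIT;
```

The second buyer's SELECT ... FOR UPDATE waits until the first commits, then sees copies = 0 and the application refuses the sale. No retry loop needed.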

Napkin AI Visual Prompt: "Dark navy gradient (#0a0f1f -> #111a2e). Center: a four-row vertical ladder labeled (top to bottom) 'READ UNCOMMITTED', 'READ COMMITTED', 'REPEATABLE READ', 'SERIALIZABLE'. Each rung shows a sky blue (#4fc3f7) shield with the anomalies it blocks listed in white monospace, and a rose (#ff5c8a) crack icon for anomalies it still allows. The SERIALIZABLE rung at the bottom has a glowing gold halo and the label 'SSI in Postgres'. To the right, a small white monospace code card showing 'BEGIN ISOLATION LEVEL SERIALIZABLE' followed by 'catch (40001) -> RETRY' with a sky blue circular arrow. Title at top: 'Choose Your Level' in white monospace. Subtle grid pattern overlay."


Common Mistakes

1. Using SERIALIZABLE without a retry loop. This is the most common SERIALIZABLE bug. Engineers turn on SERIALIZABLE because they read a blog post saying it is "safer," then ship code that does not catch 40001. Under any meaningful contention, transactions start failing randomly, the user sees a 500 error, and the team blames the database. The fix is to wrap every SERIALIZABLE transaction in a retry loop that catches SQLSTATE 40001 and 40P01 (deadlock), with exponential backoff and a small max attempt count. If you cannot guarantee the retry loop, do not use SERIALIZABLE.

2. Performing external side effects inside a SERIALIZABLE transaction. Sending an email, calling Stripe, or writing to a log file inside a transaction that may be retried means the side effect happens multiple times. The transaction sees no problem because rolling back the database does not roll back the email. The fix is to perform external side effects only after the transaction commits, ideally via an outbox table that another process drains. The transaction writes the intent; the post-commit worker performs the side effect exactly once.

3. Confusing Postgres REPEATABLE READ with the SQL standard's REPEATABLE READ. The SQL standard says REPEATABLE READ allows phantom reads. In Postgres, REPEATABLE READ is implemented as snapshot isolation, which does not allow phantom reads. Engineers who learned the standard from a textbook then write Postgres code assuming phantom reads are possible at REPEATABLE READ, add unnecessary locks, and complain about performance. The fix is to know your engine: in Postgres, REPEATABLE READ blocks phantoms, but it still allows write skew.

4. Assuming SERIALIZABLE locks like MySQL does. Engineers coming from MySQL InnoDB expect SERIALIZABLE to block readers and writers via shared locks. They turn it on in Postgres expecting performance to crater and are surprised when it does not — but they also do not write retry loops because "InnoDB never needed them." Then 40001 errors start appearing in production logs. The fix is to internalize that Postgres SERIALIZABLE is SSI, not 2PL: it never blocks, but it always retries.

5. Using SERIALIZABLE for everything, including read-only queries. Read-only SERIALIZABLE transactions in Postgres are cheap — but only if you actually mark them with BEGIN ISOLATION LEVEL SERIALIZABLE READ ONLY. Declaring the transaction read-only lets Postgres drop most of the SSI bookkeeping once it obtains a safe snapshot. Adding DEFERRABLE goes further: the transaction may wait briefly at BEGIN for a provably safe snapshot, but after that it runs with zero SSI overhead and can never be aborted with 40001. Forgetting these clauses means even your reporting queries can be aborted by SSI conflicts.
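The read-only form is worth showing, since it is easy to forget. The optional DEFERRABLE keyword makes Postgres wait at BEGIN until it can take a snapshot that is provably safe; the transaction then runs with no SSI bookkeeping and no risk of serialization failure:

```sql
-- Reporting query that SSI can never abort
BEGIN ISOLATION LEVEL SERIALIZABLE READ ONLY DEFERRABLE;
SELECT name FROM oncall WHERE on_call = true;
COMMIT;
```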


Interview Questions

1. "Postgres REPEATABLE READ already blocks phantom reads. Why would you ever use SERIALIZABLE?"

Because REPEATABLE READ in Postgres is snapshot isolation, and snapshot isolation does not prevent write skew. Write skew is the anomaly where two concurrent transactions each read a consistent snapshot, each make a decision based on that snapshot, and each commit a write that — combined — violates an invariant the snapshot promised. The classic example is two on-call doctors each removing themselves from the rotation after independently checking that there are still two doctors on call. Each transaction is correct in isolation, but the combined effect leaves zero doctors on call. Snapshot isolation cannot detect this because the two transactions never touch the same row. SERIALIZABLE detects it via Serializable Snapshot Isolation (SSI), which tracks read-write dependencies between concurrent transactions and aborts the offender at commit time with serialization_failure. If your invariant spans an aggregate or a set rather than a single row, SERIALIZABLE is the only correct choice — SELECT FOR UPDATE cannot save you because there is no single row to lock.

2. "Explain how Postgres implements SERIALIZABLE without taking read locks."

Postgres uses Serializable Snapshot Isolation (SSI), an algorithm introduced in PostgreSQL 9.1. Every SERIALIZABLE transaction starts with a snapshot exactly like REPEATABLE READ — readers do not block, writers do not block readers. The difference is bookkeeping: as the transaction reads, Postgres records lightweight predicate locks marking which ranges of data the transaction depends on. As other transactions write, Postgres checks whether any concurrent SERIALIZABLE reader had a predicate covering the new write, and if so it records a read-write dependency. At commit time, Postgres looks at the dependency graph for a "dangerous structure" — specifically two consecutive read-write edges between concurrent transactions, with one transaction sitting as the pivot between the other two; every true dependency cycle must contain this pattern, so checking for it is a cheap, conservative test. If a dangerous structure is found, Postgres aborts one of the transactions with SQLSTATE 40001. The application is expected to retry. The practical outcome is that read-only workloads pay almost zero cost, write-heavy workloads pay a small commit-time cost, and the application gets full serializable correctness without the throughput collapse of strict two-phase locking.
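
A rough model of the commit-time check, heavily simplified for illustration (real SSI tracks predicate locks in shared memory, not an explicit edge list, and the function below is invented for this sketch): represent each read-write dependency as an edge from reader to writer, then flag any transaction that has both an incoming and an outgoing edge, the pivot of a dangerous structure.

```python
# Simplified model of SSI's commit-time check. Real Postgres records
# rw-antidependencies via SIREAD locks; here they are just (reader,
# writer) pairs meaning "reader's snapshot missed writer's write".

def find_pivots(rw_edges):
    """Return transactions with both an incoming and an outgoing
    rw-edge: the pivot of a 'dangerous structure' SSI aborts on."""
    outgoing = {reader for reader, writer in rw_edges}
    incoming = {writer for reader, writer in rw_edges}
    return outgoing & incoming

# Write skew between T1 and T2: each read data the other then changed,
# producing rw-edges in both directions.
edges = {("T1", "T2"), ("T2", "T1")}
print(find_pivots(edges))          # both are pivots; abort one, retry it

# A single rw-edge alone is harmless — no pivot, no abort.
print(find_pivots({("T1", "T2")}))  # set()
```

Because the check is conservative (two consecutive edges may exist without a full cycle), SSI can produce false-positive aborts, which is another reason the retry loop is mandatory rather than optional.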

3. "Walk me through a serialization_failure retry loop. What can go wrong?"

The pattern is: open a transaction at SERIALIZABLE, run the work function, commit, and on any error check whether the SQLSTATE is 40001 or 40P01 — if so, roll back, sleep for a randomized backoff, and try again up to a small maximum (typically 3-5 attempts). The work function must be idempotent on retry because it will run more than once. The biggest things that go wrong: (1) performing side effects outside the database inside the work function, like sending an email or calling Stripe — these execute multiple times and cannot be undone by ROLLBACK. The fix is to write side-effect intent to an outbox table inside the transaction and let a separate worker perform the side effect after commit. (2) Forgetting the max-attempt cap, which lets a hot conflict loop forever and starve the worker pool. (3) Using a constant backoff instead of jitter, which causes thundering-herd retries that immediately re-conflict. (4) Catching 40001 but not 40P01, which means deadlocks — semantically the same problem — leak through. (5) Reusing connection state from the failed transaction; you must ROLLBACK before retrying or the session is left in a broken state.
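
A runnable version of that loop, sketched in Python. The DatabaseError class and FakeTxn here are stand-ins invented so the example is self-contained; with a real driver such as psycopg you would inspect the exception's SQLSTATE against the same two codes.

```python
import random
import time

RETRYABLE = {"40001", "40P01"}  # serialization_failure, deadlock_detected
MAX_ATTEMPTS = 5

class DatabaseError(Exception):
    """Stand-in for a driver exception carrying a SQLSTATE code."""
    def __init__(self, sqlstate):
        super().__init__(sqlstate)
        self.sqlstate = sqlstate

def run_serializable(txn, work):
    for attempt in range(1, MAX_ATTEMPTS + 1):
        txn.begin()          # BEGIN ISOLATION LEVEL SERIALIZABLE
        try:
            result = work(txn)
            txn.commit()
            return result
        except DatabaseError as e:
            txn.rollback()   # always reset the session before retrying
            if e.sqlstate not in RETRYABLE or attempt == MAX_ATTEMPTS:
                raise        # non-retryable, or retries exhausted
            # Jittered exponential backoff avoids thundering-herd retries.
            time.sleep(random.uniform(0, 0.01 * 2 ** attempt))

# Fake transaction + a work function that conflicts twice, then succeeds.
class FakeTxn:
    def begin(self): pass
    def commit(self): pass
    def rollback(self): pass

attempts = []
def work(txn):
    attempts.append(1)
    if len(attempts) < 3:
        raise DatabaseError("40001")   # simulated SSI abort
    return "committed"

print(run_serializable(FakeTxn(), work))  # prints "committed" on attempt 3
```

Note that `work` must stay idempotent: anything it did before the simulated abort runs again on the next attempt, which is why side effects belong in an outbox row, not in the work function.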

4. "What is the difference between PostgreSQL SERIALIZABLE and MySQL InnoDB SERIALIZABLE?"

PostgreSQL uses Serializable Snapshot Isolation (SSI), which never takes read locks. Reads run at snapshot speed and the only cost is occasional serialization_failure (SQLSTATE 40001) at commit time when SSI detects a dangerous read-write dependency structure. The application must wrap SERIALIZABLE work in a retry loop. Read-heavy workloads scale almost as well as snapshot isolation. MySQL InnoDB uses strict two-phase locking (S2PL) — every plain SELECT inside a SERIALIZABLE transaction is implicitly upgraded to SELECT ... LOCK IN SHARE MODE (SELECT ... FOR SHARE in MySQL 8.0), taking shared row locks plus gap locks for range queries. Readers block writers and writers block readers, throughput collapses under contention, and you see lock-wait timeouts and deadlocks instead of clean SSI aborts. The operational consequence: code that runs cleanly under MySQL SERIALIZABLE may start throwing 40001 errors when you naively port it to Postgres SERIALIZABLE without adding retry logic, and the reverse holds too — code that scales beautifully under Postgres SERIALIZABLE may grind to a halt under MySQL because every read now takes a lock.

5. "When is READ COMMITTED enough, when do you need SELECT FOR UPDATE, and when do you actually need SERIALIZABLE?"

READ COMMITTED is enough when the transaction does not make a decision based on the data it read — pure inserts, blind updates by primary key, simple lookups for display. It is the Postgres default for good reason: it has the best throughput and the simplest semantics. READ COMMITTED plus SELECT ... FOR UPDATE is the right choice when the invariant depends on a single identifiable row: "do not sell more copies than we have in stock" locks the inventory row, "do not double-charge a user" locks the user row, "do not double-book a meeting room" locks the room row. This is cheap, obvious, and correct. SERIALIZABLE is the right choice when the invariant depends on an aggregate or a set of rows that you cannot easily lock individually — "at least one doctor on call," "balanced double-entry ledger across many accounts," "no two overlapping reservations across a date range." In those cases there is no single row to lock and trying to lock everything kills concurrency. SERIALIZABLE plus a retry loop gives you the guarantee with minimal overhead under SSI. The decision rule: ask whether two concurrent runs of the same logic, each on a stale snapshot, could each commit and together break the invariant. If yes and the affected rows are identifiable, lock them. If yes and they are not, use SERIALIZABLE.
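
The closing decision rule can be written down as a small function. This is purely a mnemonic for the answer above, not an official API; both parameter names are invented.

```python
def pick_isolation(decision_from_read, invariant_rows_lockable):
    """Mnemonic for the isolation decision rule.

    decision_from_read: does the transaction make a decision based on
        data it read (check-then-write)?
    invariant_rows_lockable: can the rows the invariant depends on be
        identified and locked individually (single row or small fixed set)?
    """
    if not decision_from_read:
        return "READ COMMITTED"                      # blind writes, lookups
    if invariant_rows_lockable:
        return "READ COMMITTED + SELECT FOR UPDATE"  # lock the rows
    return "SERIALIZABLE + retry loop"               # aggregate/set invariant

print(pick_isolation(False, False))  # READ COMMITTED
print(pick_isolation(True, True))    # READ COMMITTED + SELECT FOR UPDATE
print(pick_isolation(True, False))   # SERIALIZABLE + retry loop
```

The middle branch covers the inventory-row and meeting-room cases; the last branch covers the on-call rotation and ledger cases, where no single lockable row embodies the invariant.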


Quick Reference — Cheat Sheet

+------------------------------------------------------------------------+
|         POSTGRESQL ISOLATION LEVEL ANOMALY MATRIX                      |
+------------------------------------------------------------------------+
|                                                                        |
|                    | DIRTY | NON-REP | PHANTOM | WRITE | LOST          |
| LEVEL              | READ  | READ    | READ    | SKEW  | UPDATE        |
|--------------------|-------|---------|---------|-------|---------------|
| READ UNCOMMITTED   | NO *  | YES     | YES     | YES   | YES           |
| READ COMMITTED     | NO    | YES     | YES     | YES   | YES           |
| REPEATABLE READ    | NO    | NO      | NO **   | YES   | NO (40001)    |
| SERIALIZABLE       | NO    | NO      | NO      | NO    | NO            |
|                                                                        |
| *  Postgres treats READ UNCOMMITTED as READ COMMITTED                  |
| ** Postgres RR is snapshot isolation -> stronger than SQL standard     |
|                                                                        |
+------------------------------------------------------------------------+

+------------------------------------------------------------------------+
|         SERIALIZABLE QUICK FACTS (PostgreSQL)                          |
+------------------------------------------------------------------------+
|                                                                        |
|  ALGORITHM:    Serializable Snapshot Isolation (SSI), since 9.1        |
|  READ LOCKS:   None (predicate tracking only)                          |
|  WRITER WAIT:  Never blocks readers                                    |
|  FAILURE:      ERROR 40001 at COMMIT (not at SELECT)                   |
|  RETRY:        REQUIRED in application code                            |
|  READ-ONLY:    Use BEGIN ISOLATION LEVEL SERIALIZABLE READ ONLY        |
|                (add DEFERRABLE to guarantee no SSI aborts)             |
|                                                                        |
+------------------------------------------------------------------------+

+------------------------------------------------------------------------+
|         RETRY LOOP SKELETON                                            |
+------------------------------------------------------------------------+
|                                                                        |
|  for attempt in 1..MAX:                                                |
|      BEGIN ISOLATION LEVEL SERIALIZABLE                                |
|      try:                                                              |
|          run work()                                                    |
|          COMMIT                                                        |
|          return                                                        |
|      catch SQLSTATE 40001 or 40P01:                                    |
|          ROLLBACK                                                      |
|          sleep(jittered backoff)                                       |
|          continue                                                      |
|      catch other:                                                      |
|          ROLLBACK                                                      |
|          raise                                                         |
|  raise "exceeded retries"                                              |
|                                                                        |
+------------------------------------------------------------------------+

+------------------------------------------------------------------------+
|         POSTGRES vs MYSQL SERIALIZABLE                                 |
+------------------------------------------------------------------------+
|                                                                        |
|                       PostgreSQL          MySQL InnoDB                 |
|                       -----------         -------------                |
|  Mechanism            SSI (no locks)      Strict 2PL (shared locks)    |
|  Reader blocks?       No                  Yes                          |
|  Writer blocks read?  No                  Yes                          |
|  Failure mode         40001 at COMMIT     Lock timeout / deadlock      |
|  Retry needed?        Yes                 Yes                          |
|  Read-heavy cost      Near zero           Significant                  |
|                                                                        |
+------------------------------------------------------------------------+
+----------------------+-------------------------+----------------------------------------+
| Concern              | Wrong Way               | Right Way                              |
+----------------------+-------------------------+----------------------------------------+
| Aggregate invariant  | REPEATABLE READ + check | SERIALIZABLE + retry loop              |
| Single-row invariant | SERIALIZABLE everywhere | READ COMMITTED + SELECT FOR UPDATE     |
| Side effects in TX   | Send email inside BEGIN | Outbox row inside, worker after commit |
| Catching 40001       | Bubble up as 500        | Retry with jittered backoff            |
| Read-only reports    | Plain SERIALIZABLE      | SERIALIZABLE READ ONLY                 |
| MySQL mental model   | "Locks block readers"   | Postgres SSI never blocks reads        |
| Retry forever        | No max attempts         | Cap at 3-5, then fail loudly           |
| Snapshot = safe      | Trust the snapshot      | Snapshot is not serializable           |
+----------------------+-------------------------+----------------------------------------+

Prev: Lesson 11.5 -- Phantom Read and REPEATABLE READ Next: Lesson 12.1 -- Views and Materialized Views


This is Lesson 11.6 of the Database Interview Prep Course -- 12 chapters, 58 lessons.
