ACID Properties
Atomicity, Consistency, Isolation, and Durability
LinkedIn Hook
"Your bank transfer debited the sender, the server crashed, and the receiver never got the money. Where did the dollars go?"
Every database textbook throws four letters at you on day one -- A, C, I, D -- and most engineers can recite what they stand for. Far fewer can explain how Postgres actually keeps a transfer from vanishing when the power cord gets kicked out of the wall, or why MySQL InnoDB writes to two different logs before it touches the real table file.
ACID is not an abstract guarantee. It is a concrete contract built out of write-ahead logs, fsync calls, undo records, MVCC snapshots, and checkpoints. When you understand the machinery, you stop being scared of transactions and start using them as a precision tool: you know exactly which property protects you from which failure, and exactly what it costs in latency and disk I/O.
In Lesson 11.2 of the Database Interview Prep Course, I break down each ACID property in plain English, show how PostgreSQL implements it (WAL, MVCC, checkpoints, fsync), contrast it with MySQL InnoDB, and walk through runnable SQL examples that demonstrate atomicity, consistency, isolation, and durability in action.
Read the full lesson -> [link]
#PostgreSQL #MySQL #Databases #ACID #Transactions #BackendDevelopment #InterviewPrep #SQL
What You'll Learn
- What each letter of ACID actually guarantees (and what it does NOT guarantee)
- How PostgreSQL implements atomicity using the Write-Ahead Log (WAL) and undo via MVCC
- How consistency is enforced through constraints, triggers, and the application contract
- How isolation works under MVCC snapshots, and how it differs from locking-based databases
- How durability is achieved with WAL flushes, fsync, and checkpoints
- How MySQL InnoDB implements the same properties using redo log, undo log, and the doublewrite buffer
- The performance trade-offs of each guarantee and the knobs you can turn
- Runnable PostgreSQL examples that demonstrate each property crashing, recovering, and isolating
The Bank Vault Analogy -- Why ACID Is a Contract, Not a Feature
Imagine a bank teller processing a wire transfer. The customer hands over a slip that says "move $500 from account A to account B." The teller has to do four things at once, and any one of them failing has to look like none of them happened.
First, the teller must treat the transfer as a single bundle. Either both accounts change or neither does. There is no acceptable universe where account A loses $500 but account B never receives it -- not because of a bug, not because the teller's pen ran out of ink, not because the lights went out. That is atomicity.
Second, the bank has rules: account balances cannot go negative, the total money in the bank must equal the sum of all account balances, every transaction must reference a real customer. The teller cannot complete a transfer that violates any of these rules, even by accident. That is consistency.
Third, while this teller is in the middle of the transfer, another teller might be processing a withdrawal from the same account. The two tellers must not see each other's half-finished work -- account A should never appear to be debited but not yet credited from anyone else's perspective. Each transfer must look as if it happened in a soundproof booth. That is isolation.
Fourth, once the teller stamps the slip "COMPLETE" and the customer walks away, that transfer must survive everything: a power outage, a server fire, a clumsy DBA, a kernel panic. Even if the bank is destroyed five seconds later, when it reopens the next morning the transfer must still be there. That is durability.
ACID is not a feature your database happens to have. It is the contract between your application and the database: you wrap work in BEGIN ... COMMIT, and the database promises all four guarantees in exchange. Understanding the mechanism behind each letter is what separates engineers who fear transactions from engineers who use them with confidence.
+---------------------------------------------------------------+
| THE ACID CONTRACT |
+---------------------------------------------------------------+
| |
| A - ATOMICITY |
| All statements in a transaction succeed, or none do. |
| Mechanism: WAL + undo (MVCC old row versions) |
| |
| C - CONSISTENCY |
| The database moves from one valid state to another. |
| Mechanism: constraints + triggers + your app logic |
| |
| I - ISOLATION |
| Concurrent transactions do not see each other's mess. |
| Mechanism: MVCC snapshots, predicate locks, SSI |
| |
| D - DURABILITY |
| Once COMMIT returns, the change survives any crash. |
| Mechanism: WAL fsync + checkpoints + replication |
| |
+---------------------------------------------------------------+
A -- Atomicity: All or Nothing
Atomicity says that a transaction is indivisible. If your transaction has ten statements and the ninth one throws a constraint violation, the database must roll back the first eight as if they never happened. The application never sees a partial state -- not on disk, not in another session's query, not after a crash.
How PostgreSQL Implements Atomicity
PostgreSQL achieves atomicity using a combination of the Write-Ahead Log (WAL) and MVCC (Multi-Version Concurrency Control).
When you run an UPDATE, PostgreSQL does NOT overwrite the existing row. Instead, it writes a new version of the row with a transaction id (xmin) marking it as created by your transaction, and marks the old version with xmax = your_txid meaning "deleted by your transaction." Both versions sit in the table at the same time. Other transactions, looking at their own snapshot, see whichever version was visible at their start.
Before any of this hits the table file, PostgreSQL writes a record describing the change to the WAL. The WAL record is the source of truth. If you COMMIT, PostgreSQL writes a single tiny "commit" record to WAL and flushes it. If you ROLLBACK -- or if the server crashes mid-transaction -- the new row versions still exist in the table, but no commit record was ever written. On the next read, MVCC sees that xmin belongs to an aborted (or never-committed) transaction and skips those rows entirely. They are invisible. Eventually, VACUUM reclaims the space.
+---------------------------------------------------------------+
| ATOMICITY VIA WAL + MVCC |
+---------------------------------------------------------------+
| |
| 1. BEGIN -> assign txid 4271 |
| 2. UPDATE accounts -> write WAL record |
| insert new row version (xmin=4271) |
| mark old row (xmax=4271) |
| 3. UPDATE accounts -> write WAL record |
| (second row, same pattern) |
| 4a. COMMIT -> write commit record, fsync WAL |
| Result: txid 4271 marked committed in CLOG |
| Both new versions become visible to everyone. |
| |
| 4b. ROLLBACK / CRASH -> no commit record exists |
| Result: txid 4271 marked aborted in CLOG |
| New row versions exist but are dead -> VACUUM cleans. |
| |
+---------------------------------------------------------------+
MySQL InnoDB does it differently. Instead of leaving old row versions in the table, it overwrites the row in place and writes the previous value to the undo log. If you roll back, InnoDB walks the undo log and restores the original values. Same guarantee, different machinery.
Example 1 -- Atomicity in Action
-- Setup: a tiny bank with two accounts
CREATE TABLE accounts (
id INT PRIMARY KEY,
owner TEXT NOT NULL,
balance NUMERIC(12, 2) NOT NULL CHECK (balance >= 0)
);
INSERT INTO accounts VALUES
(1, 'Alice', 1000.00),
(2, 'Bob', 500.00);
-- A transfer that violates the CHECK constraint halfway through.
-- Atomicity guarantees the first UPDATE is rolled back.
BEGIN;
UPDATE accounts SET balance = balance - 1500 WHERE id = 1;
-- This statement fails: balance would become -500, violating CHECK
-- Postgres aborts the statement and marks the transaction as failed
UPDATE accounts SET balance = balance + 1500 WHERE id = 2;
COMMIT;
-- Verify nothing changed
SELECT * FROM accounts ORDER BY id;
Sample output:
ERROR: new row for relation "accounts" violates check constraint "accounts_balance_check"
DETAIL: Failing row contains (1, Alice, -500.00).
id | owner | balance
----+-------+---------
1 | Alice | 1000.00
2 | Bob | 500.00
(2 rows)
Notice that even though the first UPDATE "succeeded" inside the transaction, the failure of the second statement left the entire transaction in an aborted state. The COMMIT became a no-op. Alice's balance is untouched. That is atomicity: zero or all, never half.
C -- Consistency: The Rules Are Always True
Consistency says that a transaction takes the database from one valid state to another. "Valid" means every constraint, every trigger, every foreign key, and every application-level invariant is satisfied at commit time.
This is the most misunderstood letter of ACID. The database does not know what your application considers "valid" -- it only knows what you told it via constraints. Consistency is therefore a shared responsibility:
- The database enforces what you declare:
NOT NULL,CHECK,UNIQUE,FOREIGN KEY,EXCLUDE, triggers, and assertions. - Your application enforces invariants the database cannot express: "the sum of all account balances equals the bank's reserve" or "an order's total matches the sum of its line items."
If you do not declare a constraint, the database will happily accept a state your application considers invalid -- and that is not a violation of ACID. Consistency means "no committed transaction leaves the database in a state that violates any declared rule."
How PostgreSQL Enforces Consistency
PostgreSQL checks declared constraints at one of two times:
- Immediate constraints are checked at the end of each statement (the default).
- Deferred constraints (
DEFERRABLE INITIALLY DEFERRED) are checked only atCOMMIT. This is essential for things like circular foreign keys or chicken-and-egg invariants where the intermediate state is temporarily invalid.
+---------------------------------------------------------------+
| CONSISTENCY -- WHO ENFORCES WHAT |
+---------------------------------------------------------------+
| |
| DATABASE'S JOB (declarative): |
| - NOT NULL, UNIQUE, PRIMARY KEY |
| - CHECK (column-level invariants) |
| - FOREIGN KEY (referential integrity) |
| - EXCLUDE (Postgres: range / spatial overlap rules) |
| - Triggers (cross-row, cross-table rules) |
| |
| APPLICATION'S JOB (procedural): |
| - Multi-table invariants the schema cannot express |
| - Workflow rules ("only managers can approve") |
| - Invariants checked inside the same transaction |
| |
| ACID GUARANTEE: |
| If you declare it, the DB enforces it atomically with |
| the rest of the transaction. If you don't declare it, |
| it isn't ACID -- it's just hope. |
| |
+---------------------------------------------------------------+
Example 2 -- Consistency via Constraints and Deferred Checks
-- A schema where every order line must reference a real order,
-- and every order must have at least one line item.
CREATE TABLE orders (
id INT PRIMARY KEY,
customer TEXT NOT NULL,
total NUMERIC(10, 2) NOT NULL DEFAULT 0
);
CREATE TABLE order_lines (
id INT PRIMARY KEY,
order_id INT NOT NULL REFERENCES orders(id),
product TEXT NOT NULL,
amount NUMERIC(10, 2) NOT NULL CHECK (amount > 0)
);
-- Try to insert a line for an order that does not exist.
-- Consistency (foreign key) blocks it.
BEGIN;
INSERT INTO order_lines VALUES (1, 999, 'Widget', 19.99);
COMMIT;
Sample output:
ERROR: insert or update on table "order_lines" violates foreign key constraint "order_lines_order_id_fkey"
DETAIL: Key (order_id)=(999) is not present in table "orders".
Now the deferred case -- a self-referencing parent/child where neither row can exist without the other:
CREATE TABLE employees (
id INT PRIMARY KEY,
name TEXT NOT NULL,
manager_id INT REFERENCES employees(id) DEFERRABLE INITIALLY DEFERRED
);
-- Without DEFERRABLE this would fail: row 1 references row 2
-- which does not exist yet at the moment of the first INSERT.
BEGIN;
INSERT INTO employees VALUES (1, 'Alice', 2);
INSERT INTO employees VALUES (2, 'Bob', 1);
COMMIT;
SELECT * FROM employees ORDER BY id;
Sample output:
id | name | manager_id
----+-------+------------
1 | Alice | 2
2 | Bob | 1
(2 rows)
The intermediate state (row 1 pointing at a non-existent row 2) is temporarily inconsistent, but at COMMIT time both rows exist and the foreign key holds. The database checked at the right moment.
I -- Isolation: Concurrent Transactions Do Not Interfere
Isolation says that even though many transactions run at the same time, each one should appear as if it ran by itself. The SQL standard defines four isolation levels, in order of strictness:
READ UNCOMMITTED-- a transaction can see another transaction's uncommitted changes (dirty reads). Lessons 11.3 and beyond cover this in depth.READ COMMITTED-- a transaction only sees committed data, but two reads of the same row in the same transaction can return different values.REPEATABLE READ-- a transaction sees a consistent snapshot for its entire duration; repeated reads return the same data.SERIALIZABLE-- the strongest level; the result of any concurrent execution is equivalent to some serial execution.
PostgreSQL's default is READ COMMITTED. Crucially, PostgreSQL never offers true READ UNCOMMITTED -- if you ask for it, you get READ COMMITTED anyway, because MVCC simply has no mechanism to expose uncommitted data.
How PostgreSQL Implements Isolation -- MVCC Snapshots
When a transaction starts (or its first statement runs, depending on the level), PostgreSQL takes a snapshot: a list of which transaction ids were committed at that instant. From then on, every row the transaction reads is filtered through that snapshot. A row is visible only if xmin was committed before the snapshot and xmax was not.
This is why Postgres does not need read locks: readers never block writers, and writers never block readers. Instead of locking, the database lets each session live in its own consistent timeline.
+---------------------------------------------------------------+
| MVCC SNAPSHOTS -- HOW POSTGRES ISOLATES |
+---------------------------------------------------------------+
| |
| Time Session A Session B |
| ---- ---------------------- --------------------- |
| t0 BEGIN; (snapshot S_A) |
| t1 BEGIN; UPDATE row 1 |
| t2 SELECT row 1 COMMIT; |
| -> sees OLD value |
| (xmax of old row was set |
| by a txid not in S_A, |
| so old version still visible) |
| t3 SELECT row 1 |
| -> still sees OLD value |
| (REPEATABLE READ) |
| |
| At REPEATABLE READ, A's snapshot never changes for the |
| life of the transaction. At READ COMMITTED, A would take |
| a NEW snapshot per statement and see B's committed value. |
| |
+---------------------------------------------------------------+
SERIALIZABLE in PostgreSQL is implemented via Serializable Snapshot Isolation (SSI): the database tracks read/write dependencies between concurrent transactions and aborts one with a serialization_failure error if it detects a cycle that could produce a non-serializable result. The application is expected to retry.
MySQL InnoDB also uses MVCC, but its default isolation level is REPEATABLE READ (not READ COMMITTED), and its implementation relies more heavily on next-key locks (a combination of row lock and gap lock) to prevent phantom reads. InnoDB's SERIALIZABLE implicitly converts every SELECT into SELECT ... FOR SHARE, taking real locks rather than tracking dependencies.
Example 3 -- Isolation Across Two Sessions
This example needs two psql sessions running side by side. Open both before you begin.
-- Setup (run once in either session)
CREATE TABLE products (
id INT PRIMARY KEY,
name TEXT NOT NULL,
stock INT NOT NULL
);
INSERT INTO products VALUES (1, 'Coffee', 100);
Session A (REPEATABLE READ -- snapshot is fixed at BEGIN):
BEGIN ISOLATION LEVEL REPEATABLE READ;
SELECT stock FROM products WHERE id = 1;
-- Output: 100
Session B (modifies the row and commits):
BEGIN;
UPDATE products SET stock = stock - 30 WHERE id = 1;
COMMIT;
Session A (still inside the same transaction):
SELECT stock FROM products WHERE id = 1;
-- Output: 100 -- A still sees its snapshot, NOT B's committed change
COMMIT;
SELECT stock FROM products WHERE id = 1;
-- Output: 70 -- A new statement, new snapshot, B's change visible
Sample output (combined):
-- Session A, first SELECT
stock
-------
100
(1 row)
-- Session A, second SELECT (still inside transaction)
stock
-------
100
(1 row)
-- Session A, after COMMIT
stock
-------
70
(1 row)
Session A lived in its own timeline for the duration of the transaction. That is isolation, implemented entirely without read locks.
D -- Durability: Once Committed, Always Committed
Durability says that once COMMIT returns successfully to the client, the transaction's effects must survive any subsequent failure: a process crash, an OS panic, a power loss, even a clean shutdown followed by a hardware swap. The committed data must come back when the database starts up again.
How PostgreSQL Implements Durability -- WAL, fsync, and Checkpoints
The mechanism behind durability is the same Write-Ahead Log we already met for atomicity. The rule is brutally simple: the WAL record describing a change must be on disk before the change itself is allowed to reach the table file on disk. This is the "write-ahead" in Write-Ahead Log.
When you COMMIT:
- PostgreSQL writes a commit record to the WAL buffer in memory.
- It calls
fsync()(or its equivalent, controlled bywal_sync_method) to force the WAL file from the OS page cache all the way down to physical storage. - Only after fsync returns does the server send "COMMIT" back to the client.
If the server crashes the moment after step 3, the modified table pages may still be sitting in shared buffers, never flushed to disk. That is fine. On restart, PostgreSQL reads the WAL from the last checkpoint forward, finds the commit record, and replays every change since then -- recovery rebuilds the table state from the log.
A checkpoint is the periodic operation that flushes all dirty shared buffer pages to the table files and writes a checkpoint record to the WAL. After a checkpoint, the WAL up to that point is no longer needed for recovery (though it may still be retained for archive or replication). Checkpoints exist purely to bound recovery time -- the more recent the last checkpoint, the less WAL has to be replayed after a crash.
+---------------------------------------------------------------+
| DURABILITY -- WAL FLUSH + CHECKPOINT |
+---------------------------------------------------------------+
| |
| COMMIT path: |
| 1. Build WAL record in memory |
| 2. fsync WAL file -> physical disk |
| 3. Return "COMMIT" to client |
| |
| Background: |
| - Checkpointer flushes dirty table pages periodically |
| - Bgwriter trickles dirty pages to OS cache |
| - WAL files older than last checkpoint may be recycled |
| |
| Crash recovery: |
| 1. Find last checkpoint record in WAL |
| 2. Replay every WAL record after that point |
| 3. Roll back any in-progress (uncommitted) transactions |
| 4. Database is open for connections |
| |
| The fsync at COMMIT is the entire reason durability works. |
| Turning it off (synchronous_commit = off) gains throughput |
| but means a crash can lose committed transactions. |
| |
+---------------------------------------------------------------+
MySQL InnoDB uses the same pattern with different names: the redo log (equivalent to WAL) is flushed at commit, and the doublewrite buffer protects against torn page writes (where a 16KB page is half-written when the OS crashes). The setting innodb_flush_log_at_trx_commit controls whether redo log is fsynced on every commit (1 = ACID-safe), once per second (2 = fast but loses up to 1 second on crash), or never (0 = fastest but loses on any crash).
Example 4 -- Demonstrating Durability via WAL
You cannot easily fake a power failure inside a SQL session, but you can inspect the WAL machinery and the settings that govern durability.
-- Show the current WAL position. This advances every time
-- a COMMIT writes a record.
SELECT pg_current_wal_lsn();
-- Run a transaction
BEGIN;
CREATE TABLE durability_demo (id INT, payload TEXT);
INSERT INTO durability_demo VALUES (1, 'Survives a crash');
COMMIT;
-- Show that the WAL position has advanced
SELECT pg_current_wal_lsn();
-- The setting that makes durability real
SHOW synchronous_commit;
SHOW fsync;
SHOW wal_level;
Sample output:
pg_current_wal_lsn
--------------------
0/1A2C3D8
(1 row)
CREATE TABLE
INSERT 0 1
COMMIT
pg_current_wal_lsn
--------------------
0/1A2C580
(1 row)
synchronous_commit
--------------------
on
(1 row)
fsync
-------
on
(1 row)
wal_level
-----------
replica
(1 row)
The WAL LSN (Log Sequence Number) advanced between the two queries -- that is the physical proof that PostgreSQL recorded your transaction in a place that will survive a crash. With synchronous_commit = on and fsync = on, you have a strong durability guarantee. If you crashed the server the instant after the COMMIT returned and brought it back up, the durability_demo table would still be there with the row intact, because recovery would replay the WAL records from the last checkpoint.
If you turn synchronous_commit off, COMMIT returns immediately after writing to the in-memory WAL buffer, without waiting for fsync. Throughput skyrockets, but a crash within the next few hundred milliseconds can lose committed transactions. That is a deliberate trade-off, sometimes acceptable for non-critical workloads -- but it is no longer ACID in the strict sense.
Napkin AI Visual Prompt: "Dark gradient (#0a0f1f -> #111a2e). Four-panel diagram, one per ACID letter. Panel A (Atomicity): a chain link with one broken link, cyan glow, label 'BEGIN ... COMMIT or nothing'. Panel C (Consistency): a balance scale weighing CHECK, FOREIGN KEY, UNIQUE on one side and 'valid state' on the other, pink accent. Panel I (Isolation): two soundproof booths labeled Session A and Session B, each with its own snapshot bubble showing different row values, cyan arrows. Panel D (Durability): a stone tablet etched with WAL records, with a fsync arrow pointing to a disk icon, pink glow. Center label: 'ACID = WAL + MVCC + Constraints + fsync'. White monospace text throughout."
Common Mistakes
1. Confusing consistency (the C in ACID) with eventual consistency in distributed systems. The ACID "C" means "constraint-preserving" -- transactions move the database from one valid state to another according to declared rules. The "C" in CAP theorem (the distributed-systems one) means "all nodes see the same data at the same time." They are unrelated concepts that share a letter. Mixing them up in an interview is one of the fastest ways to look unprepared.
2. Believing the database enforces all your invariants automatically.
ACID's consistency only covers what you declared -- CHECK, FOREIGN KEY, NOT NULL, UNIQUE, triggers. If your application invariant is "the sum of all line items must equal the order total" and you never wrote a trigger or a check, the database will happily save inconsistent data and not violate ACID at all. Consistency is a contract; you have to write down the terms.
3. Assuming SERIALIZABLE means transactions run one at a time.
SERIALIZABLE means the result is equivalent to some serial order, not that execution is serial. PostgreSQL's SSI runs transactions concurrently and only aborts the loser when a non-serializable cycle is detected. You must catch serialization_failure errors and retry. If you do not retry, you have not implemented serializable isolation -- you have implemented "occasionally aborts and dies."
4. Turning off fsync or synchronous_commit for "performance" without understanding the cost.
fsync = off and synchronous_commit = off give large throughput gains by skipping or deferring the disk flush at commit. They also throw away durability. A crash will lose committed transactions and can leave the database in an inconsistent state requiring a full restore from backup. Acceptable for ephemeral analytics; catastrophic for anything you care about. Know which knob you are turning.
5. Thinking MVCC means PostgreSQL never locks anything.
MVCC eliminates read-write locks: readers do not block writers and vice versa. But two transactions writing the same row still take a row-level write lock, and SELECT ... FOR UPDATE takes one explicitly. MVCC is not a replacement for locking -- it is a way of avoiding locks when they are not needed.
Interview Questions
1. "Walk me through what happens inside PostgreSQL when I run BEGIN; UPDATE accounts SET balance = balance - 100 WHERE id = 1; COMMIT;. What guarantees ACID at each step?"
BEGIN allocates a new transaction id (xid) and, depending on the isolation level, may take a snapshot of currently-committed transactions. The UPDATE finds row id 1 in the heap, writes a WAL record describing the change, then in the table file inserts a brand-new row version with xmin set to your xid and the updated balance, while the old row version's xmax is set to your xid (marking it as deleted-by-you). No other session sees the new version yet because your xid is not in their snapshots. At COMMIT, PostgreSQL writes a commit record to the WAL buffer, calls fsync() to push the WAL to physical disk, and updates the CLOG (commit log) entry for your xid to "committed." Only then does it return success to the client. Atomicity is guaranteed by the fact that nothing is visible until the commit record is written; if the server crashed between the UPDATE and COMMIT, recovery would mark your xid as aborted and the new row version would be invisible. Durability is guaranteed by the fsync. Isolation is guaranteed by MVCC snapshots filtering which row versions other sessions see. Consistency is enforced by the CHECK (balance >= 0) constraint, which the database evaluates inside the UPDATE before writing the new row version.
2. "What is the difference between PostgreSQL's MVCC implementation and MySQL InnoDB's, and why does it matter?"
PostgreSQL stores every row version in the heap itself: an UPDATE creates a new row version with a new xmin and marks the old one with xmax, and both versions sit in the same table until VACUUM reclaims the dead one. InnoDB, by contrast, overwrites the row in place and pushes the previous values to a separate undo log; rolling back means walking the undo log backward. The practical consequences are large. PostgreSQL's approach makes UPDATE roughly equivalent to DELETE + INSERT, which causes table bloat and forces VACUUM to be a critical part of operations. InnoDB's approach keeps the heap compact but means undo log space can become a bottleneck for long-running transactions. PostgreSQL handles index-only scans differently because indexes do not always know which row versions are visible. InnoDB acquires next-key locks (a row lock plus a gap lock) at REPEATABLE READ to prevent phantoms, while PostgreSQL relies on its snapshot. Both achieve ACID; the engineering trade-offs are simply different.
3. "Explain the role of the WAL in durability. What would go wrong if PostgreSQL wrote table pages to disk before the WAL?"
The WAL is the source of truth for every change. The "write-ahead" rule says the WAL record describing a change must reach durable storage before the modified table page is allowed to. The reason is recovery: if the server crashes, PostgreSQL replays the WAL from the last checkpoint to reconstruct the state of the buffer cache, then rolls back any uncommitted transactions. If a table page were flushed before its WAL record, the server might find a half-modified table on disk after a crash with no log entry explaining how it got there, and recovery could not undo it. By guaranteeing WAL-first, PostgreSQL ensures that the on-disk state is always either consistent with the log or a state the log knows how to fix. At COMMIT, fsyncing the WAL is what makes durability real -- once that fsync returns, the change is on physical media and cannot be lost, even though the actual table page may not be flushed for many seconds afterward.
4. "What is a checkpoint, and how does it interact with WAL and recovery time?"
A checkpoint is a periodic operation in which PostgreSQL writes every dirty shared buffer page out to its table file, fsyncs the data files, and writes a checkpoint record to the WAL. Its purpose is to bound recovery time. After a crash, recovery starts at the last checkpoint and replays every WAL record forward; if checkpoints are infrequent, there can be many gigabytes of WAL to replay and recovery takes minutes or hours. Frequent checkpoints make recovery fast but cost write I/O during normal operation, because dirty pages get flushed sooner and may be re-dirtied repeatedly. The trade-off is tuned by checkpoint_timeout, max_wal_size, and checkpoint_completion_target. A checkpoint does not affect durability -- committed transactions are durable as soon as their WAL record is fsynced, regardless of whether a checkpoint has run -- but it does affect how long it takes to come back up after a crash.
5. "Your service has a multi-step business operation: create an order, decrement stock, charge a payment provider. How do ACID properties apply, and where do they break down?"
Inside a single PostgreSQL transaction, you can wrap the order insert and the stock decrement and get all four ACID properties: either both happen or neither does (atomicity), the foreign key from order to product is enforced (consistency), other sessions only see the new order after commit (isolation), and the change survives a crash (durability). The payment charge is the problem -- it is a network call to an external service that has its own state and is not part of your database's transaction. You cannot atomically commit "row written in DB AND charge succeeded at Stripe." The standard solutions are the outbox pattern (insert a row into an outbox table inside the same DB transaction, then a background worker reads the outbox and calls Stripe with retry and idempotency keys), two-phase commit (rarely worth the complexity), or saga pattern (break the operation into steps with compensating undo actions). The right answer in an interview is to explain that ACID applies cleanly to the database portion of the work, recognize that distributed coordination is a fundamentally different problem, and name a specific pattern (outbox is usually best) for crossing the boundary safely.
Quick Reference -- Cheat Sheet
+---------------------------------------------------------------+
| ACID PROPERTIES MNEMONIC TABLE |
+---------------------------------------------------------------+
| |
| A ATOMICITY "All or nothing" |
| Postgres: WAL + MVCC dead row versions |
| MySQL: redo log + undo log |
| Knob: none -- always on for ACID |
| |
| C CONSISTENCY "Valid state -> valid state" |
| Postgres: CHECK, FK, UNIQUE, triggers |
| MySQL: same |
| Knob: DEFERRABLE constraints |
| |
| I ISOLATION "Concurrent != intertwined" |
| Postgres: MVCC snapshots, SSI for SERIAL. |
| MySQL: MVCC + next-key locks |
| Knob: ISOLATION LEVEL |
| |
| D DURABILITY "COMMIT means forever" |
| Postgres: WAL + fsync + checkpoints |
| MySQL: redo log + doublewrite buffer |
| Knob: synchronous_commit, fsync |
| |
+---------------------------------------------------------------+
+---------------------------------------------------------------+
| KEY MECHANISMS AT A GLANCE |
+---------------------------------------------------------------+
| |
| WAL (Write-Ahead Log) |
| -> Every change written to log BEFORE table file |
| -> fsynced at COMMIT for durability |
| -> Replayed from last checkpoint during recovery |
| |
| MVCC (Multi-Version Concurrency Control) |
| -> UPDATE creates new row version, marks old one dead |
| -> Each session sees rows visible in its snapshot |
| -> Readers never block writers, writers never block reads |
| -> VACUUM reclaims dead row versions |
| |
| CHECKPOINT |
| -> Periodic flush of all dirty buffers to table files |
| -> Bounds the amount of WAL replay needed after a crash |
| -> Triggered by checkpoint_timeout or max_wal_size |
| |
| fsync |
| -> OS call that forces data from page cache to disk |
| -> The single line of code that makes durability work |
| -> Turning it off trades durability for throughput |
| |
+---------------------------------------------------------------+
| Property | Guarantees | Postgres Mechanism | MySQL InnoDB Mechanism | Performance Knob |
|---|---|---|---|---|
| Atomicity | All or nothing | WAL + MVCC | Redo log + undo log | None (always on) |
| Consistency | Valid state -> valid state | Constraints, triggers | Constraints, triggers | DEFERRABLE |
| Isolation | No interference | MVCC snapshots, SSI | MVCC + next-key locks | ISOLATION LEVEL |
| Durability | Survives crash | WAL fsync + checkpoints | Redo log + doublewrite | synchronous_commit |
| Failure Type | Property That Saves You | How |
|---|---|---|
| Constraint violation mid-transaction | Atomicity | Whole transaction rolled back |
| Two clients editing same row | Isolation | MVCC snapshot or row lock |
| Bad foreign key reference | Consistency | INSERT rejected at statement time |
| Power loss after COMMIT | Durability | WAL replay rebuilds state |
| Power loss before COMMIT | Atomicity | Transaction marked aborted on recovery |
| Long read while another session writes | Isolation | Reader sees old MVCC version |
| Application bug writing invalid data | Consistency | Only if you declared a constraint |
Prev: Lesson 11.1 -- What Is a Transaction? Next: Lesson 11.3 -- Dirty Read (READ UNCOMMITTED)
This is Lesson 11.2 of the Database Interview Prep Course -- 12 chapters, 58 lessons.