Database Interview Prep
Sequences and Auto-Increment

AUTO_INCREMENT, SERIAL, IDENTITY

Surrogate Keys Across Engines

LinkedIn Hook

"You typed SERIAL in 2014 and never thought about it again. In 2026 your INSERT ... OVERRIDING SYSTEM VALUE is silently breaking, your dump-and-restore is producing duplicate IDs, and your team lead is asking why the column owner is nobody."

Auto-incrementing primary keys feel like the most boring topic in databases. You declare a column, you forget about it, and the rows get unique IDs. Done. Except every major engine implements them differently, the legacy syntax has subtle ownership and permission bugs, and the "modern" replacement — SQL standard IDENTITY columns — has been quietly recommended by the Postgres core team for almost a decade while half the internet still copy-pastes SERIAL from 2010 tutorials.

MySQL has AUTO_INCREMENT, a column attribute glued onto an integer column with a single hidden counter per table. Postgres has two flavors: the legacy SERIAL macro that secretly creates a sequence and wires it up with a default expression, and the standard GENERATED ... AS IDENTITY syntax introduced in Postgres 10 that finally fixes ownership, permissions, and the "what happens when someone inserts an explicit value" problem.

The differences matter. SERIAL columns are technically integers with a default — meaning you can override them, your sequence can drift out of sync after a bulk load, and pg_dump has had bugs around them for years. IDENTITY columns are first-class citizens of the catalog, support OVERRIDING SYSTEM VALUE and OVERRIDING USER VALUE explicitly, and behave predictably across dumps, restores, and replica promotions. MySQL AUTO_INCREMENT has its own quirks: gaps after rollback, the infamous innodb_autoinc_lock_mode, and replication footguns when statement-based replication meets bulk inserts.

In Lesson 7.2, I break down all three approaches side by side: how each one is implemented under the hood, why IDENTITY is now the recommended default in modern Postgres, where MySQL AUTO_INCREMENT differs from both, and the migration path from SERIAL to IDENTITY without breaking your application.

Read the full lesson -> [link]

#SQL #PostgreSQL #MySQL #Database #BackendDevelopment #InterviewPrep #DataModeling


AUTO_INCREMENT, SERIAL, IDENTITY thumbnail


What You'll Learn

  • How MySQL AUTO_INCREMENT is implemented as a per-table counter inside InnoDB
  • How Postgres SERIAL is just a macro that creates a sequence and a default expression
  • How Postgres GENERATED ... AS IDENTITY (SQL standard) replaces SERIAL cleanly
  • The difference between GENERATED ALWAYS AS IDENTITY and GENERATED BY DEFAULT AS IDENTITY
  • Why the Postgres core team has recommended IDENTITY over SERIAL since Postgres 10
  • The ownership, permissions, and pg_dump bugs that make SERIAL a footgun
  • How OVERRIDING SYSTEM VALUE and OVERRIDING USER VALUE work at insert time
  • MySQL innodb_autoinc_lock_mode, gaps after rollback, and replication implications
  • A safe migration path from SERIAL to IDENTITY on a live table

The Deli Counter Ticket Dispenser Analogy

Walk into a busy deli. By the door there is a red plastic dispenser with a roll of paper tickets. You pull a tab, the next customer pulls a tab, and the numbers go up forever — 47, 48, 49, 50. The dispenser does not care who you are, what you order, or whether you change your mind and walk out. It hands out the next number, period. If you take ticket 50 and then leave, ticket 50 is gone — the next person gets 51, not 50. That gap is permanent. That is exactly what an auto-incrementing primary key does: it is a monotonic counter that hands out the next integer, regardless of whether the row that got it survives, gets rolled back, or gets deleted later.

Now picture three different deli chains, each with their own dispenser model. The first chain (call it MySQL Deli) bolts the dispenser directly onto the counter — the counter and the dispenser are physically welded together, and if you move the counter, the dispenser comes with it. The second chain (call it Postgres SERIAL Deli) keeps the dispenser as a separate device on a shelf nearby, with a label saying "this dispenser belongs to that counter" — but the label is held on with tape, and sometimes when you redecorate the store, the tape falls off and nobody knows which dispenser belongs to which counter anymore. The third chain (call it Postgres IDENTITY Deli) registers the dispenser in the store's official inventory system as belonging to that specific counter, with a permanent record that survives any redecoration.

All three chains hand out monotonically increasing tickets. All three have the same gaps after rollbacks. The difference is in how the dispenser is attached to the counter — and that attachment is what determines what happens during backup, restore, schema migration, and permission changes. MySQL AUTO_INCREMENT is the welded dispenser. Postgres SERIAL is the taped-on label. Postgres IDENTITY is the registered inventory item. They all do the same job day-to-day. They behave very differently when something goes wrong.

+---------------------------------------------------------------+
|           THREE WAYS TO ATTACH A COUNTER                       |
+---------------------------------------------------------------+
|                                                                |
|  MySQL AUTO_INCREMENT (welded):                                |
|   counter + dispenser are one physical unit                    |
|   -> stored as a table-level attribute in InnoDB               |
|   -> moves with the table, no separate object to manage        |
|   -> one counter per table, period                             |
|                                                                |
|  Postgres SERIAL (taped label):                                |
|   counter is a separate sequence object                        |
|   -> CREATE SEQUENCE foo_id_seq                                |
|   -> column default = nextval('foo_id_seq')                    |
|   -> ownership link is fragile, permissions are separate       |
|                                                                |
|  Postgres IDENTITY (registered inventory):                     |
|   counter is a sequence, but bound to the column in catalog    |
|   -> first-class column property, not a default expression     |
|   -> survives dump/restore, permissions inherit from column    |
|   -> SQL standard syntax, recommended since PG 10              |
|                                                                |
+---------------------------------------------------------------+

Napkin AI Visual Prompt: "Dark navy gradient (#0a0f1f -> #111a2e). Three deli ticket dispensers in a row, each labeled 'MySQL AUTO_INCREMENT', 'SERIAL', 'IDENTITY'. The first is welded to a counter (sky blue weld marks). The second is taped to a counter with rose-colored tape peeling off. The third is shown in an official-looking inventory ledger with a sky blue checkmark. Each dispenser shows a paper ticket with the number '50' visible. White monospace labels. Subtle grid pattern."


MySQL AUTO_INCREMENT — The Welded Counter

In MySQL, AUTO_INCREMENT is a column attribute, not a separate object. You declare a column as INT AUTO_INCREMENT PRIMARY KEY and InnoDB maintains a hidden counter for that table inside its data dictionary. There is exactly one auto-increment column per table (this is a hard limit), and the counter is stored alongside the table metadata.

-- MySQL: AUTO_INCREMENT column
-- One counter, welded to the table, indexed as the primary key.
CREATE TABLE orders (
  id          INT AUTO_INCREMENT PRIMARY KEY,
  customer_id INT NOT NULL,
  total_cents INT NOT NULL,
  created_at  TIMESTAMP DEFAULT CURRENT_TIMESTAMP
) ENGINE=InnoDB;

-- Insert without specifying id -- the counter assigns one
INSERT INTO orders (customer_id, total_cents) VALUES (101, 4500);
INSERT INTO orders (customer_id, total_cents) VALUES (102, 1299);
INSERT INTO orders (customer_id, total_cents) VALUES (103, 8800);

-- Read the rows
SELECT id, customer_id, total_cents FROM orders;
+----+-------------+-------------+
| id | customer_id | total_cents |
+----+-------------+-------------+
|  1 |         101 |        4500 |
|  2 |         102 |        1299 |
|  3 |         103 |        8800 |
+----+-------------+-------------+
3 rows in set

You can also override the counter by inserting an explicit value. MySQL is permissive — it accepts any positive integer and bumps the counter forward if the value is higher than the current next value:

-- Override with an explicit id larger than the current counter
INSERT INTO orders (id, customer_id, total_cents) VALUES (100, 104, 999);

-- The next auto-assigned id will be 101, not 4
INSERT INTO orders (customer_id, total_cents) VALUES (105, 250);

SELECT id, customer_id FROM orders ORDER BY id;
+-----+-------------+
| id  | customer_id |
+-----+-------------+
|   1 |         101 |
|   2 |         102 |
|   3 |         103 |
| 100 |         104 |
| 101 |         105 |
+-----+-------------+
5 rows in set

Notice how the counter jumped from 3 to 101 the moment you inserted the explicit 100. The dispenser is now handing out 101, 102, 103. The gap from 4 to 99 is permanent — those numbers will never be assigned. This is true in every engine and is one of the most common surprises for engineers who expect IDs to be "dense."

Gaps After Rollback

InnoDB allocates the next auto-increment value before the row is fully inserted, and it does not return the value if the transaction rolls back. This means:

-- Transaction A
START TRANSACTION;
INSERT INTO orders (customer_id, total_cents) VALUES (200, 100);
-- This row gets id = 102 (next after 101)
ROLLBACK;
-- The row is gone, but id 102 is consumed forever

-- Transaction B
INSERT INTO orders (customer_id, total_cents) VALUES (201, 200);
-- This row gets id = 103
+-----+-------------+
| id  | customer_id |
+-----+-------------+
|   1 |         101 |
|   2 |         102 |
|   3 |         103 |
| 100 |         104 |
| 101 |         105 |
| 103 |         201 |
+-----+-------------+
6 rows in set

Treat auto-increment IDs as opaque unique identifiers, never as a count of rows, never as a contiguous range, and never as a reliable ordering of insert time across concurrent transactions.

innodb_autoinc_lock_mode

InnoDB has a setting called innodb_autoinc_lock_mode that controls how aggressively auto-increment values are reserved during inserts. The default in MySQL 8 is 2 ("interleaved"), which gives the best concurrency but means bulk inserts may produce non-contiguous ID ranges. Older modes (0 traditional, 1 consecutive) take a table-level lock for the duration of the insert, which kills throughput on hot tables but produces dense ID ranges per statement. If you ever care about "all rows from one batch insert have consecutive IDs," you need to know about this setting — and you almost certainly should not care.


Postgres SERIAL — The Taped-On Sequence

Now the same table in Postgres using the legacy SERIAL type. SERIAL is not a real type. It is a macro that the parser expands into three separate things: a sequence object, a column with integer type, and a default expression that calls nextval() on that sequence.

-- Postgres: SERIAL is a legacy macro
CREATE TABLE orders (
  id          SERIAL PRIMARY KEY,
  customer_id INT NOT NULL,
  total_cents INT NOT NULL,
  created_at  TIMESTAMPTZ DEFAULT now()
);

What Postgres actually executes under the hood is roughly:

-- What SERIAL really expands to (do not run -- this is the expansion)
CREATE SEQUENCE orders_id_seq;

CREATE TABLE orders (
  id          INTEGER NOT NULL DEFAULT nextval('orders_id_seq'),
  customer_id INT NOT NULL,
  total_cents INT NOT NULL,
  created_at  TIMESTAMPTZ DEFAULT now(),
  PRIMARY KEY (id)
);

-- Mark the sequence as "owned by" the column so DROP TABLE drops it too
ALTER SEQUENCE orders_id_seq OWNED BY orders.id;

For everyday inserts this works exactly like MySQL AUTO_INCREMENT:

INSERT INTO orders (customer_id, total_cents) VALUES (101, 4500);
INSERT INTO orders (customer_id, total_cents) VALUES (102, 1299);
INSERT INTO orders (customer_id, total_cents) VALUES (103, 8800);

SELECT id, customer_id, total_cents FROM orders;
 id | customer_id | total_cents
----+-------------+-------------
  1 |         101 |        4500
  2 |         102 |        1299
  3 |         103 |        8800
(3 rows)

The Problems With SERIAL

Because SERIAL is "just" a default expression on an integer column, several things go wrong that users do not expect:

  1. Permissions are separate. Granting INSERT on the table does not grant USAGE on the underlying sequence. A user with insert privileges on orders will get a permission error from nextval('orders_id_seq') unless you also grant sequence usage. This bites every team that uses fine-grained role-based access.

  2. Sequence drift after bulk inserts. If you load data with explicit IDs (COPY, pg_dump restore, ETL job), the sequence is not advanced. Your next INSERT without an explicit ID then collides with an existing row. Every Postgres team has hit this; the fix is SELECT setval('orders_id_seq', (SELECT max(id) FROM orders)) after the load.

  3. Ownership is fragile. The OWNED BY link can be broken by certain ALTER operations, by careless dump/restore scripts, or by manually creating the column without the macro. When the link is broken, dropping the table leaves the sequence orphaned in the database.

  4. The column type is integer, not SERIAL. When you query information_schema.columns, the type comes back as integer with a default. There is no schema-level signal that this column is supposed to auto-increment, which makes ORMs guess and migration tools generate confusing diffs.

These are not theoretical complaints. They are the reasons the Postgres core team added a better option in version 10.


Postgres IDENTITY — The Registered, Standard-Compliant Way

Postgres 10 (released October 2017) introduced the SQL standard GENERATED ... AS IDENTITY syntax. An identity column is a first-class column property: the catalog records that the column is an identity, the sequence is created and owned correctly, and permissions follow the column instead of being a separate sequence grant.

-- Postgres: GENERATED ... AS IDENTITY (SQL standard, recommended)
CREATE TABLE orders (
  id          INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  customer_id INT NOT NULL,
  total_cents INT NOT NULL,
  created_at  TIMESTAMPTZ DEFAULT now()
);

-- Inserts work the same way as SERIAL
INSERT INTO orders (customer_id, total_cents) VALUES (101, 4500);
INSERT INTO orders (customer_id, total_cents) VALUES (102, 1299);
INSERT INTO orders (customer_id, total_cents) VALUES (103, 8800);

SELECT id, customer_id, total_cents FROM orders;
 id | customer_id | total_cents
----+-------------+-------------
  1 |         101 |        4500
  2 |         102 |        1299
  3 |         103 |        8800
(3 rows)

ALWAYS vs BY DEFAULT

Identity columns come in two flavors, and the difference is important:

-- ALWAYS: the database refuses explicit values unless you opt in
CREATE TABLE strict_orders (
  id   INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  name TEXT
);

-- This FAILS:
INSERT INTO strict_orders (id, name) VALUES (99, 'manual');
-- ERROR: cannot insert a non-DEFAULT value into column "id"
-- DETAIL: Column "id" is an identity column defined as GENERATED ALWAYS.
-- HINT: Use OVERRIDING SYSTEM VALUE to override.

-- This succeeds (explicit opt-in):
INSERT INTO strict_orders (id, name)
OVERRIDING SYSTEM VALUE
VALUES (99, 'manual');
-- BY DEFAULT: the database accepts explicit values silently (like SERIAL)
CREATE TABLE flexible_orders (
  id   INT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
  name TEXT
);

-- Both of these work:
INSERT INTO flexible_orders (name)             VALUES ('auto');     -- id = 1
INSERT INTO flexible_orders (id, name)         VALUES (99, 'set');  -- id = 99

SELECT id, name FROM flexible_orders ORDER BY id;
 id | name
----+------
  1 | auto
 99 | set
(2 rows)
+---------------------------------------------------------------+
|           ALWAYS vs BY DEFAULT — WHEN TO USE WHICH            |
+---------------------------------------------------------------+
|                                                                |
|  GENERATED ALWAYS AS IDENTITY:                                 |
|   -> Database rejects explicit inserts unless OVERRIDING       |
|   -> Use this for true surrogate keys you NEVER want set       |
|   -> Catches application bugs that try to set the PK manually  |
|   -> Recommended default for new tables                        |
|                                                                |
|  GENERATED BY DEFAULT AS IDENTITY:                             |
|   -> Database accepts explicit values (like SERIAL did)        |
|   -> Use this when you need to migrate data with set IDs       |
|   -> Use this for ORMs that insist on round-tripping IDs       |
|   -> Behaves identically to SERIAL for application code        |
|                                                                |
|  RULE OF THUMB: prefer ALWAYS, fall back to BY DEFAULT only    |
|  when you have a concrete reason (data migration, ORM quirk).  |
|                                                                |
+---------------------------------------------------------------+

Why IDENTITY Is Better Than SERIAL

Every problem listed in the SERIAL section is fixed:

  • Permissions: the underlying sequence is not user-visible the same way; INSERT on the table is enough. No separate sequence grant is needed.
  • Sequence drift: still possible if you bulk-load with explicit IDs, but you can reset with ALTER TABLE orders ALTER COLUMN id RESTART WITH ... — a column-level command, not a separate sequence command.
  • Ownership: the sequence is bound to the column at the catalog level. pg_dump and pg_restore handle it correctly. Dropping the column or table cleans up the sequence atomically.
  • Schema introspection: information_schema.columns.is_identity is 'YES'. Tools and ORMs can detect identity columns reliably.
  • SQL standard: the same syntax works in Postgres, Oracle, DB2, and SQL Server. It is portable in a way that SERIAL and AUTO_INCREMENT will never be.

The Postgres docs themselves now say (paraphrasing): "The data types smallserial, serial, and bigserial are not true types. If you wish for new applications to be portable, you should use the standard SQL-conforming GENERATED ... AS IDENTITY syntax."


Side-by-Side Example — Same Table, Three Engines

-- ============================================================
-- MySQL
-- ============================================================
CREATE TABLE products (
  id    INT AUTO_INCREMENT PRIMARY KEY,
  name  VARCHAR(100) NOT NULL,
  price DECIMAL(10,2) NOT NULL
) ENGINE=InnoDB;

INSERT INTO products (name, price) VALUES
  ('Mouse',    19.99),
  ('Keyboard', 49.99),
  ('Monitor', 299.00);

SELECT * FROM products;
+----+----------+--------+
| id | name     | price  |
+----+----------+--------+
|  1 | Mouse    |  19.99 |
|  2 | Keyboard |  49.99 |
|  3 | Monitor  | 299.00 |
+----+----------+--------+
-- ============================================================
-- Postgres (legacy SERIAL -- avoid for new code)
-- ============================================================
CREATE TABLE products (
  id    SERIAL PRIMARY KEY,
  name  TEXT NOT NULL,
  price NUMERIC(10,2) NOT NULL
);

INSERT INTO products (name, price) VALUES
  ('Mouse',    19.99),
  ('Keyboard', 49.99),
  ('Monitor', 299.00);

SELECT * FROM products;
 id |   name   | price
----+----------+--------
  1 | Mouse    |  19.99
  2 | Keyboard |  49.99
  3 | Monitor  | 299.00
(3 rows)
-- ============================================================
-- Postgres (modern IDENTITY -- recommended)
-- ============================================================
CREATE TABLE products (
  id    INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  name  TEXT NOT NULL,
  price NUMERIC(10,2) NOT NULL
);

INSERT INTO products (name, price) VALUES
  ('Mouse',    19.99),
  ('Keyboard', 49.99),
  ('Monitor', 299.00);

SELECT * FROM products;
 id |   name   | price
----+----------+--------
  1 | Mouse    |  19.99
  2 | Keyboard |  49.99
  3 | Monitor  | 299.00
(3 rows)

The day-to-day behavior is identical. The differences only matter when something unusual happens: a permission grant, a dump/restore, a bulk load, an ORM-driven schema migration, or a port to a different engine.

Napkin AI Visual Prompt: "Dark navy gradient (#0a0f1f -> #111a2e). Three SQL code blocks side by side in white monospace, each headed with a label: 'MySQL AUTO_INCREMENT', 'Postgres SERIAL (legacy)', 'Postgres IDENTITY (modern)'. All three produce the same output table on the right with three rows. The IDENTITY block has a sky blue (#4fc3f7) glow and a small 'recommended' badge. The SERIAL block has a faded rose (#ff5c8a) 'legacy' tag. Subtle grid pattern."


Migrating SERIAL to IDENTITY on a Live Table

If you have an existing Postgres database full of SERIAL columns, you can convert them in place without downtime. The key insight is that SERIAL and IDENTITY both use a sequence under the hood — you just need to detach the sequence from being a default expression and re-attach it as an identity property.

-- Step 1: drop the default that calls nextval()
ALTER TABLE orders ALTER COLUMN id DROP DEFAULT;

-- Step 2: drop the OWNED BY link so the sequence is not auto-dropped
ALTER SEQUENCE orders_id_seq OWNED BY NONE;

-- Step 3: convert the column to IDENTITY using the existing sequence's value
-- First, capture the current sequence value
SELECT last_value FROM orders_id_seq;
-- Suppose this returned 4521

-- Step 4: drop the old sequence
DROP SEQUENCE orders_id_seq;

-- Step 5: add the identity property, starting at the captured value + 1
ALTER TABLE orders
  ALTER COLUMN id ADD GENERATED BY DEFAULT AS IDENTITY (RESTART WITH 4522);

-- Verify
SELECT column_name, is_identity, identity_generation
FROM information_schema.columns
WHERE table_name = 'orders' AND column_name = 'id';
 column_name | is_identity | identity_generation
-------------+-------------+---------------------
 id          | YES         | BY DEFAULT
(1 row)

Use BY DEFAULT (not ALWAYS) for the migration so that any application code currently inserting explicit IDs keeps working. Once you have audited the application and confirmed nothing inserts explicit IDs, you can switch to ALWAYS:

ALTER TABLE orders ALTER COLUMN id SET GENERATED ALWAYS;

Common Mistakes

1. Treating auto-incrementing IDs as a row count or contiguous range. The IDs are guaranteed unique and roughly monotonic, but they are not dense. Rollbacks, deletes, bulk inserts with explicit IDs, and concurrent transactions all create gaps. If you need to know "how many rows," use COUNT(*). If you need a dense sequence, generate it at query time with ROW_NUMBER(). Never write code that assumes ID N exists or that IDs are contiguous.

2. Using SERIAL in new Postgres code. Every new Postgres table written in 2020 or later should use GENERATED ... AS IDENTITY. The Postgres docs explicitly recommend it. The only reason SERIAL still exists is backwards compatibility with code written before Postgres 10. If you are starting fresh, there is no upside to picking the legacy macro and several downsides (permission bugs, dump issues, ORM confusion).

3. Granting INSERT on a SERIAL table without granting sequence usage. A classic Postgres permissions trap. GRANT INSERT ON orders TO app_user is not enough — the user also needs GRANT USAGE ON SEQUENCE orders_id_seq TO app_user to actually call nextval(). With IDENTITY, this is no longer required because the sequence is bound to the column. This is one of the most common reasons teams switch to IDENTITY.

4. Bulk-loading data without resetting the sequence afterward. If you COPY rows into a table with explicit IDs (or restore from a dump that includes IDs), the sequence is not advanced. The next INSERT without an explicit ID then tries to use a value that already exists, and you get a unique-constraint violation. Always run SELECT setval('seq', (SELECT max(id) FROM tbl)) (for SERIAL) or ALTER TABLE ... ALTER COLUMN id RESTART WITH ... (for IDENTITY) after bulk loads.

5. Using AUTO_INCREMENT columns as sortable timestamps. Tempting, because IDs roughly correlate with insert order. But under concurrent inserts and high innodb_autoinc_lock_mode values, IDs can be reordered relative to commit time. Two transactions can grab IDs 100 and 101 in one order and commit in the opposite order. If you need a reliable insert ordering, use a created_at timestamp column (with a tiebreaker) — never the surrogate key alone.


Interview Questions

1. "What is the difference between SERIAL and GENERATED ALWAYS AS IDENTITY in PostgreSQL, and which should you use?"

SERIAL is a legacy macro that expands into three things: a sequence object, an integer column, and a default expression nextval('seq_name'). The column type is actually integer, and the relationship to the sequence is just an OWNED BY link plus the default. GENERATED ALWAYS AS IDENTITY is the SQL standard syntax introduced in Postgres 10. It records the column as an identity in the catalog, the sequence is bound to the column as a first-class property, and permissions on the table flow through to the sequence automatically. IDENTITY should be the default for all new code: it fixes the permissions trap where INSERT on the table does not grant USAGE on the sequence, it survives pg_dump/pg_restore more cleanly, it lets you reject explicit ID inserts via GENERATED ALWAYS, and it is portable across SQL standard-compliant databases like Oracle and SQL Server.

2. "Why are there gaps in auto-incrementing IDs after a transaction rollback?"

Both InnoDB and Postgres allocate the next sequence value before the row is fully committed, and they do not return the value if the transaction rolls back. The reason is concurrency: if the engine had to wait until commit to assign the ID, it would have to hold a lock on the counter for the duration of every transaction, which would serialize every insert in the database. By allocating the value up front and never returning it, the engine lets concurrent transactions grab IDs independently, even if some of them roll back. The price is gaps in the ID range. This is why you should treat IDs as opaque unique identifiers, never as a count of rows or a dense range. If you genuinely need contiguous numbering, you have to compute it at query time with ROW_NUMBER(), accepting that it can change as rows are deleted.

3. "How does GENERATED ALWAYS AS IDENTITY differ from GENERATED BY DEFAULT AS IDENTITY, and when would you use each?"

GENERATED ALWAYS makes Postgres reject any INSERT that supplies an explicit value for the column — you get an error unless you opt in with the OVERRIDING SYSTEM VALUE clause. GENERATED BY DEFAULT accepts explicit values silently, behaving the same as SERIAL and MySQL AUTO_INCREMENT. Use ALWAYS for true surrogate keys that the application should never set manually: it catches bugs where ORM mappings or test fixtures accidentally try to insert an ID, and it forces an explicit, reviewable opt-in for legitimate cases like data migration. Use BY DEFAULT when you need flexibility — for example, when migrating from SERIAL (where existing application code already inserts explicit IDs in some paths), when round-tripping data through tools that preserve IDs, or when an ORM insists on echoing the ID back during inserts. The recommended default for new tables is ALWAYS, with BY DEFAULT as the fallback when you have a specific reason.

4. "What is innodb_autoinc_lock_mode in MySQL and why does it matter?"

innodb_autoinc_lock_mode controls how InnoDB allocates auto-increment values during inserts. Mode 0 (traditional) takes a table-level auto-inc lock for the entire statement, producing dense consecutive IDs per statement but serializing all inserts on the table — terrible for concurrency. Mode 1 (consecutive, the pre-MySQL-8 default) takes the lock only for "simple" inserts where the row count is known in advance, and uses a lighter mechanism for bulk inserts where it cannot. Mode 2 (interleaved, the MySQL 8 default) never takes the table-level lock and lets concurrent inserts interleave their auto-increment values — best concurrency but bulk inserts may produce non-contiguous ID ranges, and statement-based replication can produce different IDs on the replica. The setting matters because it is a tradeoff between insert throughput and ID density, and because the change of default from 1 to 2 in MySQL 8 broke applications that assumed bulk-inserted rows always had consecutive IDs.

5. "Describe how you would migrate a SERIAL column to an IDENTITY column on a live PostgreSQL table without downtime."

The migration is purely a metadata change — no rows are touched, so it can be done with a brief lock on the table. First, drop the column default that calls nextval(): ALTER TABLE orders ALTER COLUMN id DROP DEFAULT. Second, detach the existing sequence from being owned by the column with ALTER SEQUENCE orders_id_seq OWNED BY NONE. Third, capture the sequence's current value with SELECT last_value FROM orders_id_seq. Fourth, drop the old sequence. Fifth, attach the identity property to the column with ALTER TABLE orders ALTER COLUMN id ADD GENERATED BY DEFAULT AS IDENTITY (RESTART WITH <captured_value + 1>). Use BY DEFAULT (not ALWAYS) for the migration so existing application code that inserts explicit IDs keeps working. Once you have verified that nothing inserts explicit IDs, you can tighten it with ALTER TABLE orders ALTER COLUMN id SET GENERATED ALWAYS. The whole sequence is fast because it only updates catalog entries, and you can wrap it in a transaction for atomicity.


Quick Reference — Cheat Sheet

+---------------------------------------------------------------+
|           AUTO-INCREMENT CHEAT SHEET                          |
+---------------------------------------------------------------+
|                                                                |
|  MYSQL:                                                        |
|   id INT AUTO_INCREMENT PRIMARY KEY                            |
|   -- one per table, stored in InnoDB data dictionary           |
|                                                                |
|  POSTGRES (legacy):                                            |
|   id SERIAL PRIMARY KEY            -- 32-bit                  |
|   id BIGSERIAL PRIMARY KEY         -- 64-bit                  |
|   id SMALLSERIAL PRIMARY KEY       -- 16-bit                  |
|                                                                |
|  POSTGRES (modern, recommended):                               |
|   id INT  GENERATED ALWAYS AS IDENTITY PRIMARY KEY             |
|   id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY           |
|   id INT  GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY         |
|                                                                |
|  OVERRIDE INSERTS (IDENTITY):                                  |
|   INSERT INTO t (id, ...) OVERRIDING SYSTEM VALUE VALUES (...) |
|   INSERT INTO t (id, ...) OVERRIDING USER VALUE   VALUES (...) |
|                                                                |
|  RESET COUNTER:                                                |
|   MySQL:    ALTER TABLE t AUTO_INCREMENT = 1000;               |
|   SERIAL:   SELECT setval('t_id_seq', 999);                    |
|   IDENTITY: ALTER TABLE t ALTER COLUMN id RESTART WITH 1000;   |
|                                                                |
|  AFTER BULK LOAD (sync sequence to max id):                    |
|   MySQL:    automatic                                          |
|   SERIAL:   SELECT setval('t_id_seq',(SELECT max(id) FROM t)); |
|   IDENTITY: ALTER TABLE t ALTER COLUMN id RESTART WITH <max+1>;|
|                                                                |
+---------------------------------------------------------------+
FeatureMySQL AUTO_INCREMENTPostgres SERIALPostgres IDENTITY
SQL standardNoNoYes
Underlying mechanismInnoDB table counterSeparate sequence + defaultSequence bound to column in catalog
Type storedINT (or BIGINT)integer (with default)integer (with identity property)
Reject explicit insertsNoNoYes (GENERATED ALWAYS)
Allow explicit insertsYes (always)YesYes (GENERATED BY DEFAULT or OVERRIDING SYSTEM VALUE)
Permissions on counterInherited from tableSeparate USAGE on sequenceInherited from table
pg_dump / pg_restoreN/AHas had bugsClean and reliable
Schema introspectionEXTRA = auto_incrementType is integer (must inspect default)is_identity = YES
Recommended for new codeYes (only option in MySQL)NoYes
Gaps after rollbackYesYesYes
Available sinceAlwaysAlwaysPostgres 10 (2017)

Prev: Lesson 7.1 -- Sequences Next: Lesson 7.3 -- UUID Primary Keys


This is Lesson 7.2 of the Database Interview Prep Course -- 12 chapters, 58 lessons.

On this page