Database Interview Prep
Database Fundamentals

Schema Design Basics

DDL Commands Every Dev Should Know

LinkedIn Hook

"A junior dev ran DROP TABLE users on production instead of staging. No IF EXISTS, no CASCADE warning, no backup. Just an empty users table and a very long Sunday."

Schema design is the most underestimated skill in backend engineering. Developers spend weeks optimizing queries but treat CREATE TABLE like a throwaway chore — copy-paste from the last project, rename a few columns, ship it. Then six months later the orders table has a column called user (a reserved-ish word), no foreign key to users, a typo in statsus, and nobody knows whether it is safe to drop.

DDL — Data Definition Language — is the foundation every query stands on. CREATE DATABASE, CREATE TABLE, ALTER TABLE, DROP TABLE. Four commands, countless ways to hurt yourself. The difference between a senior and a junior is not knowing the syntax — it is knowing what CASCADE actually does, why IF EXISTS belongs in every idempotent script, and why naming a column order is a landmine that will explode three years from now.

The good news is that the rules are simple once you see them. Use snake_case. Never skip migrations. Read every ALTER TABLE twice before running it in production. Wrap destructive changes in transactions. And never, ever, trust a DROP without a backup.

In Lesson 1.4, I break down the DDL commands every backend developer must know — CREATE, ALTER, DROP, the CASCADE rules, and the naming conventions that keep your schema readable five years from now.

Read the full lesson -> [link]

#Database #PostgreSQL #SQL #BackendDevelopment #SchemaDesign #InterviewPrep


Schema Design Basics thumbnail


What You'll Learn

  • The difference between DDL and DML and why the distinction matters
  • CREATE DATABASE and how PostgreSQL schemas differ from MySQL databases
  • The search_path in PostgreSQL and why public is not sacred
  • CREATE TABLE with all the common parts — columns, types, constraints, defaults
  • ALTER TABLE for adding, dropping, renaming, and changing column types safely
  • DROP TABLE IF EXISTS and why idempotent scripts are a professional baseline
  • CASCADE behavior for drops and the subtle ways it bites you
  • Naming conventions — snake_case, plural vs singular, and reserved words to avoid

The Blueprint Analogy — Why DDL Is Architecture, Not Typing

Imagine an architect designing a skyscraper. The blueprint defines where every wall stands, where the plumbing runs, where the load-bearing beams sit. Once the concrete is poured, moving a wall costs a hundred times what it cost to draw one on paper. The blueprint is not the building, but it determines what the building can ever become. A mistake in the blueprint is cheap to fix on Monday morning and catastrophic to fix after the tenants move in.

Your database schema is the blueprint. CREATE TABLE is pouring concrete. ALTER TABLE is knocking down a wall while people are still living in the apartment. DROP TABLE is dynamite. The queries your application runs every second are the tenants — and they have strong opinions about whether their walls are still standing.

DDL commands are the architect's pen. DML commands (SELECT, INSERT, UPDATE, DELETE) are the tenants going about their day. The two languages operate at different levels of the stack, and confusing them is how production incidents are born.

+---------------------------------------------------------------+
|           DDL vs DML — TWO DIFFERENT LANGUAGES                |
+---------------------------------------------------------------+
|                                                                |
|  DDL — Data Definition Language                               |
|   - CREATE, ALTER, DROP, TRUNCATE, RENAME                     |
|   - Changes the SHAPE of the database                         |
|   - Usually auto-commits (cannot roll back in MySQL)          |
|   - Rare, reviewed, and dangerous                             |
|   - Example: CREATE TABLE users (...)                         |
|                                                                |
|  DML — Data Manipulation Language                             |
|   - SELECT, INSERT, UPDATE, DELETE, MERGE                     |
|   - Changes the CONTENTS of existing tables                   |
|   - Always transactional, can roll back                       |
|   - Frequent, automated, everyday                             |
|   - Example: INSERT INTO users (email) VALUES ('a@b.c')       |
|                                                                |
|  THE MENTAL MODEL:                                             |
|   DDL is the architect with a hammer.                         |
|   DML is the tenant moving furniture.                         |
|                                                                |
+---------------------------------------------------------------+

In PostgreSQL, most DDL statements are transactional — you can wrap a CREATE TABLE and an ALTER TABLE inside BEGIN / COMMIT and roll them back together. In MySQL, most DDL auto-commits, which means a failed migration leaves you stranded halfway. This is one of the quiet reasons many teams prefer Postgres for anything schema-heavy.


CREATE DATABASE — The Top-Level Container

A database is the top-level namespace in a PostgreSQL or MySQL server. One PostgreSQL cluster can host many databases, and a connection opens against exactly one of them. You cannot issue a cross-database query in a single statement in PostgreSQL (unless you reach for foreign data wrappers or dblink); each database is its own island.

-- Create a new database with a specific owner and encoding.
-- UTF8 is the only sane default in 2026 — anything else will haunt you.
CREATE DATABASE shop_api
  WITH OWNER = shop_admin
       ENCODING = 'UTF8'
       LC_COLLATE = 'en_US.UTF-8'
       LC_CTYPE = 'en_US.UTF-8'
       TEMPLATE = template0;

-- Drop a database — only works if no connections are active.
-- Terminate existing sessions first if needed.
DROP DATABASE IF EXISTS shop_api;

MySQL note: in MySQL the keyword DATABASE and SCHEMA are synonyms. CREATE DATABASE shop_api and CREATE SCHEMA shop_api do the same thing. In PostgreSQL they are different concepts — databases are top-level, schemas are namespaces inside a database. This is the single most common source of confusion when moving between the two.


Schemas in PostgreSQL — The search_path and the public Schema

Inside a PostgreSQL database, a schema is a namespace that holds tables, views, functions, and types. When you write SELECT * FROM users, Postgres looks up users against a list of schemas called the search_path. By default the path is "$user", public, which means Postgres looks for a schema named after the current user first, and then falls back to the public schema.

-- Create a schema to group related tables
CREATE SCHEMA IF NOT EXISTS billing;

-- Create a table inside that schema explicitly
CREATE TABLE billing.invoices (
  id BIGSERIAL PRIMARY KEY,
  amount_cents BIGINT NOT NULL,
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

-- Query it with the fully qualified name
SELECT * FROM billing.invoices;

-- Or change the search_path for the session to avoid the prefix
SET search_path TO billing, public;

SELECT * FROM invoices; -- now resolves to billing.invoices

Most small apps dump everything into the public schema and never think about it again. That is fine until you need multi-tenant isolation, logical separation between modules, or a clean place to install an extension like postgis. A common pattern in larger apps is to give each bounded context its own schema — billing, identity, inventory — and keep public for shared utilities.

+---------------------------------------------------------------+
|           POSTGRES NAMESPACE HIERARCHY                        |
+---------------------------------------------------------------+
|                                                                |
|  CLUSTER (one running postgres process)                        |
|    |                                                           |
|    +-- DATABASE: shop_api                                      |
|    |     |                                                     |
|    |     +-- SCHEMA: public                                    |
|    |     |     +-- TABLE: users                                |
|    |     |     +-- TABLE: orders                               |
|    |     |                                                     |
|    |     +-- SCHEMA: billing                                   |
|    |     |     +-- TABLE: invoices                             |
|    |     |     +-- TABLE: payments                             |
|    |     |                                                     |
|    |     +-- SCHEMA: audit                                     |
|    |           +-- TABLE: change_log                           |
|    |                                                           |
|    +-- DATABASE: analytics                                     |
|          +-- SCHEMA: public                                    |
|                +-- TABLE: events                               |
|                                                                |
+---------------------------------------------------------------+

Security tip: in modern Postgres (15+), the public schema is no longer writable by every user by default. If you are on an older version, run REVOKE CREATE ON SCHEMA public FROM PUBLIC; to avoid the classic "any user can create tables anywhere" footgun.


CREATE TABLE — The Workhorse of DDL

CREATE TABLE is where the real schema design happens. A table definition has columns (with types), constraints (that enforce rules), defaults, and optional storage hints. This is the command you will write thousands of times in your career — get comfortable with every part of it.

-- A realistic users table showing every common feature.
-- Comments explain what each piece does and why.
CREATE TABLE IF NOT EXISTS users (
  -- Surrogate primary key. BIGSERIAL = BIGINT + auto-increment sequence.
  -- Use BIGINT from day one — INT runs out at 2.1 billion rows.
  id            BIGSERIAL PRIMARY KEY,

  -- Natural unique identifier. CITEXT would make it case-insensitive,
  -- but plain TEXT with a lower() index is more portable.
  email         TEXT NOT NULL UNIQUE,

  -- Fixed-length is almost never what you want in Postgres — use TEXT.
  -- VARCHAR(n) exists for SQL-standard compatibility but offers no perf win.
  full_name     TEXT NOT NULL,

  -- Boolean with explicit default so inserts do not need to set it.
  is_active     BOOLEAN NOT NULL DEFAULT TRUE,

  -- Enum-style column using a CHECK constraint. A real ENUM type is
  -- another option but harder to alter later.
  role          TEXT NOT NULL DEFAULT 'member'
                CHECK (role IN ('member', 'admin', 'owner')),

  -- Timestamps — ALWAYS use TIMESTAMPTZ, never TIMESTAMP.
  -- TIMESTAMPTZ stores UTC and converts on read; TIMESTAMP loses tz info.
  created_at    TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  updated_at    TIMESTAMPTZ NOT NULL DEFAULT NOW(),

  -- Soft-delete marker. NULL = active, non-null = deleted at this time.
  deleted_at    TIMESTAMPTZ,

  -- Table-level constraint: full_name cannot be blank after trimming.
  CONSTRAINT full_name_not_blank CHECK (length(btrim(full_name)) > 0)
);

-- An index on the lower-cased email for case-insensitive lookups.
-- The UNIQUE constraint above already indexes the exact value.
CREATE INDEX idx_users_email_lower ON users (lower(email));

-- An index on created_at for time-range queries.
CREATE INDEX idx_users_created_at ON users (created_at DESC);

A few rules worth tattooing on your forearm:

  • Always use BIGINT / BIGSERIAL / BIGINT GENERATED ALWAYS AS IDENTITY for primary keys. Four bytes saved per row is not worth the migration pain when you hit 2 billion.
  • Always use TIMESTAMPTZ, never TIMESTAMP. TIMESTAMP WITHOUT TIME ZONE is a bug waiting for a user in a different country.
  • Always use TEXT instead of VARCHAR(n) in Postgres. Unlike SQL Server or older Oracle, Postgres has no performance difference, and TEXT does not lock you into a length.
  • Always name your constraints. CONSTRAINT users_email_key is findable; an auto-generated name is not.

MySQL note: MySQL's equivalent is BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY. Use utf8mb4 as the character set — the older utf8 in MySQL is a three-byte alias that cannot store emojis or many CJK characters. This has bitten more teams than almost any other gotcha.


ALTER TABLE — Evolving the Schema Safely

Schemas are never done. Business requirements shift, columns get added, old columns get renamed. ALTER TABLE is the tool for every such change, and it is also the most dangerous command you will run regularly in production. On a large table, adding a column with a default can rewrite every row — minutes or hours of table-level locks.

-- Add a column. In Postgres 11+, a non-volatile default is instant
-- (stored as metadata). A volatile default still rewrites the table.
ALTER TABLE users
  ADD COLUMN phone TEXT;

-- Add a column with a default — fast in modern Postgres.
ALTER TABLE users
  ADD COLUMN login_count INT NOT NULL DEFAULT 0;

-- Drop a column. The data is gone — irrecoverable without a backup.
-- In Postgres, the column is marked dropped and physically removed
-- on the next table rewrite (VACUUM FULL or pg_repack).
ALTER TABLE users
  DROP COLUMN IF EXISTS phone;

-- Rename a column. The catalog entry changes instantly.
-- Beware: any view, function, or application code referencing the
-- old name will break immediately.
ALTER TABLE users
  RENAME COLUMN full_name TO display_name;

-- Change a column's type. This can rewrite every row if the cast is
-- not binary-compatible. Always test on a copy of the real data first.
ALTER TABLE users
  ALTER COLUMN login_count TYPE BIGINT USING login_count::BIGINT;

-- Add a NOT NULL constraint. On a large table, back-fill the column
-- in batches first, then add NOT NULL as a fast metadata-only change.
ALTER TABLE users
  ALTER COLUMN display_name SET NOT NULL;

-- Add a foreign key. In Postgres, you can add it as NOT VALID first,
-- then VALIDATE CONSTRAINT, to avoid a long AccessExclusiveLock.
ALTER TABLE orders
  ADD CONSTRAINT orders_user_id_fkey
  FOREIGN KEY (user_id) REFERENCES users (id)
  NOT VALID;

ALTER TABLE orders
  VALIDATE CONSTRAINT orders_user_id_fkey;

-- Rename a table. Fast, metadata-only, but breaks every query that
-- referenced the old name.
ALTER TABLE users RENAME TO accounts;

The golden rule: on a table bigger than a few thousand rows, assume every ALTER needs a plan. Read the Postgres docs for the specific operation. Check whether it takes an AccessExclusiveLock. Run it against a copy first. Consider tools like pg_repack or gh-ost (for MySQL) when the naive approach would block writes for too long.

+---------------------------------------------------------------+
|           ALTER TABLE — SAFE vs DANGEROUS OPERATIONS          |
+---------------------------------------------------------------+
|                                                                |
|  FAST / METADATA-ONLY (Postgres 11+):                          |
|   - ADD COLUMN with constant default                           |
|   - ADD COLUMN without default                                 |
|   - RENAME COLUMN / RENAME TABLE                               |
|   - DROP COLUMN (logical only)                                 |
|   - ADD CONSTRAINT ... NOT VALID                               |
|                                                                |
|  SLOW / REWRITES THE TABLE:                                    |
|   - ALTER COLUMN TYPE (non-binary-compatible cast)             |
|   - ADD COLUMN with volatile default                           |
|   - SET DEFAULT + UPDATE existing rows                         |
|                                                                |
|  LOCK-HEAVY / BLOCKS READS AND WRITES:                         |
|   - Adding FK without NOT VALID                                |
|   - Adding NOT NULL without prior backfill                     |
|   - Any op that takes AccessExclusiveLock on a big table       |
|                                                                |
+---------------------------------------------------------------+

DROP TABLE, IF EXISTS, and the CASCADE Footgun

DROP TABLE deletes a table and all of its data. There is no undo. The command is short, the consequences are total, and the guardrails are two keywords: IF EXISTS and CASCADE.

-- The idempotent form. Running this twice is safe — the second call
-- is a no-op instead of an error. Every migration script should use
-- IF EXISTS on drops so reruns and partial rollbacks do not fail.
DROP TABLE IF EXISTS sessions;

-- The CASCADE form. This drops the table AND every object that
-- depends on it — foreign keys, views, materialized views, etc.
-- READ THAT AGAIN. It will drop dependent objects too.
DROP TABLE IF EXISTS users CASCADE;

-- The safe form. RESTRICT is the default and refuses to drop if
-- anything depends on the table. Always prefer this in production
-- unless you have audited every dependency.
DROP TABLE IF EXISTS users RESTRICT;

CASCADE is seductive because it makes the error go away. You run DROP TABLE users, Postgres tells you "cannot drop because orders depends on it," and the temptation is to slap CASCADE on the end. Do not. That single keyword can silently drop the foreign key on orders, a view called active_users, a function that joined the table, and three materialized views the analytics team built last quarter. Always check dependencies first.

-- Find everything that depends on a table before dropping it.
-- Run this and read the output carefully.
SELECT dependent_ns.nspname AS dependent_schema,
       dependent_view.relname AS dependent_object,
       dependent_view.relkind AS object_kind
FROM pg_depend
JOIN pg_rewrite ON pg_depend.objid = pg_rewrite.oid
JOIN pg_class AS dependent_view ON pg_rewrite.ev_class = dependent_view.oid
JOIN pg_namespace AS dependent_ns ON dependent_view.relnamespace = dependent_ns.oid
JOIN pg_class AS source_table ON pg_depend.refobjid = source_table.oid
WHERE source_table.relname = 'users';

MySQL note: MySQL also supports DROP TABLE IF EXISTS, but MySQL's CASCADE only works in foreign key definitions (ON DELETE CASCADE), not on DROP TABLE itself. In MySQL you have to manually drop dependents first.


Naming Conventions — Boring Rules That Save Your Sanity

Naming is the unsexy part of schema design that pays compound interest. A schema with consistent names is grep-able, predictable, and self-documenting. A schema with ad-hoc names is a career-long tax on every query you write.

Rule 1: snake_case everything. Tables, columns, constraints, indexes. SQL identifiers are case-insensitive unless quoted, so UserEmail and user_email refer to the same thing in an unquoted query — but the moment you quote one of them, you are locked into that exact casing forever. Stick with snake_case and you will never hit the quoting trap.

Rule 2: plural tables, singular columns. The debate between users and user is the SQL equivalent of tabs vs spaces. The pragmatic majority view is that tables contain multiple rows and should be plural (users, orders, invoices), while columns describe a single attribute of one row and should be singular (user_id, order_date). Pick one side, write it down, enforce it in code review. What kills schemas is inconsistency — a database with both user and orders is worse than either choice would have been alone.

Rule 3: never use reserved words. SQL has dozens of reserved keywords — user, order, group, table, select, desc, asc, type, key, value. Naming a column or table after one of them forces you to quote it forever ("user"), and quoting is contagious: once you quote one identifier, you quote them all. Pick account instead of user, purchase_order instead of order, role_group instead of group.

Rule 4: use consistent suffixes. Foreign keys end in _id (user_id, order_id). Timestamps end in _at (created_at, deleted_at) or _on for dates (birth_on). Booleans start with is_ or has_ (is_active, has_verified_email). Once readers learn the pattern, they can guess column names without opening the table definition.

Rule 5: name your constraints and indexes explicitly. CONSTRAINT users_email_key beats the auto-generated users_email_key1234. Index names like idx_users_email_lower tell you what they cover at a glance.

+---------------------------------------------------------------+
|           NAMING CONVENTION CHEAT SHEET                       |
+---------------------------------------------------------------+
|                                                                |
|  TABLES:       plural, snake_case       -> users, order_items |
|  COLUMNS:      singular, snake_case     -> email, created_at  |
|  PRIMARY KEY:  id                        -> (BIGINT identity) |
|  FOREIGN KEY:  <singular>_id             -> user_id, order_id |
|  TIMESTAMPS:   <verb>_at                 -> created_at        |
|  DATES:        <verb>_on                 -> birth_on          |
|  BOOLEANS:     is_/has_<noun>            -> is_active         |
|  INDEXES:      idx_<table>_<cols>        -> idx_users_email   |
|  CONSTRAINTS:  <table>_<col>_<type>      -> users_email_key   |
|                                                                |
|  AVOID RESERVED WORDS: user, order, group, type, key,         |
|                        value, select, table, desc, asc        |
|                                                                |
+---------------------------------------------------------------+

Napkin AI Visual Prompt: "Dark gradient (#0a0f1f -> #111a2e). Split diagram: TOP section labeled 'DDL — The Blueprint' shows four sky-blue (#4fc3f7) command cards in a row — CREATE DATABASE, CREATE TABLE, ALTER TABLE, DROP TABLE — each with a small white monospace code snippet. BOTTOM section labeled 'The CASCADE Footgun' shows a users table on the left with rose (#ff5c8a) arrows pointing to three dependent objects (orders, active_users view, monthly_report materialized view), all of which turn red when CASCADE fires. White monospace labels throughout. Title at top: 'Schema DDL 101'. Subtle grid pattern."


Common Mistakes

1. Running ALTER TABLE in production without reviewing locks. On a small table, ALTER TABLE ADD COLUMN is instant. On a 500-million-row table, the same command can hold an AccessExclusiveLock for minutes or hours, during which every query against the table blocks. The fix is to read the Postgres docs for the exact operation, check whether it rewrites the table, and consider split strategies — add column without default, backfill in batches, then add NOT NULL. Tools like pg_repack and frameworks like pgroll exist because DIY migrations on hot tables are how outages happen.

2. Using CASCADE without understanding what depends on the table. DROP TABLE users CASCADE feels like the fast fix when Postgres refuses to drop the table. It is actually a grenade. CASCADE drops every dependent object — foreign keys, views, materialized views, functions that reference the table — silently. Run the dependency query shown earlier, audit every result, and drop dependents explicitly in the correct order. Treat CASCADE as a last resort that requires a senior engineer's eyes.

3. Skipping IF EXISTS in idempotent scripts. Migration scripts should be safe to rerun. If a deploy fails halfway through, the recovery path is to rerun the migration from the top — and that requires CREATE TABLE IF NOT EXISTS, DROP TABLE IF EXISTS, and ALTER TABLE ... ADD COLUMN IF NOT EXISTS everywhere. Without those guards, the second run fails with "table already exists" and you are debugging a half-applied migration at 2 AM instead of shipping.

4. Naming tables or columns after reserved words. user, order, group, type, key, value, desc — every one of these will eventually collide with a SQL keyword or a tool's assumption. You will be forced to quote the identifier ("user") forever, and one day a junior will write SELECT user FROM user and lose an afternoon. Pick account, purchase_order, role_group, data_type, api_key, attribute_value, description — non-reserved synonyms exist for every reserved word.

5. Running ad-hoc DDL without a migration tool. Every DDL change in production should come from a versioned migration file committed to source control. Running ALTER TABLE manually in a psql session on prod means no review, no rollback plan, no audit trail, and no way to reproduce the state of the schema in staging. Use Flyway, Liquibase, Alembic, Prisma Migrate, Sqitch, or whatever your stack offers — but use something. "Migrations are overkill for a small project" is how small projects grow into unmaintainable ones.


Interview Questions

1. "What is the difference between DDL and DML, and why does it matter that Postgres treats DDL as transactional while MySQL does not?"

DDL (Data Definition Language) changes the structure of the database — CREATE, ALTER, DROP, TRUNCATE. DML (Data Manipulation Language) changes the contents of existing tables — SELECT, INSERT, UPDATE, DELETE. The distinction matters because they run at different levels of the stack and have very different risk profiles. DDL is rare, reviewed, and dangerous; DML is frequent, automated, and usually safe. The transactional difference is critical for migration scripts: in PostgreSQL you can wrap multiple DDL statements in BEGIN / COMMIT and roll them all back atomically if any step fails, which means a failed migration leaves the schema in exactly the state it was before. In MySQL, most DDL auto-commits each statement, so a migration that fails halfway leaves the database in an inconsistent partial state that you have to unwind manually. This is one of the main reasons teams running schema-heavy applications tend to choose Postgres.

2. "Explain what happens when you run DROP TABLE users CASCADE. How is it different from DROP TABLE users RESTRICT?"

DROP TABLE users defaults to RESTRICT, which tells Postgres to refuse the drop if any other object in the database depends on the table — a foreign key, a view, a materialized view, a function, a trigger. You get an error listing the blockers, and the table stays. DROP TABLE users CASCADE tells Postgres to drop the table and every dependent object along with it. That sounds convenient, but it is silently destructive: it will drop foreign key constraints on other tables, it will drop views that reference the table, and it will drop materialized views that joined it. The dependents are gone with no confirmation and no list. In production, you should almost never use CASCADE without first running a dependency query against pg_depend to see exactly what will be affected, and then dropping dependents explicitly in the right order. CASCADE is a grenade that looks like a shortcut.

3. "Why should every production migration script use IF EXISTS and IF NOT EXISTS clauses?"

Because production deploys fail halfway through, and the recovery path needs to be to rerun the migration from the top. If the migration says CREATE TABLE users (...) and the first attempt already created the table before crashing, the rerun fails with "relation users already exists" and you are stranded. With CREATE TABLE IF NOT EXISTS users (...), the rerun is a no-op for that statement and the migration continues to the next step. The same logic applies to DROP TABLE IF EXISTS and ALTER TABLE ... ADD COLUMN IF NOT EXISTS. The pattern is called idempotence — the script produces the same end state whether you run it once or ten times. Idempotent migrations are a professional baseline; without them, every failed deploy requires manual schema surgery, which is exactly the wrong thing to be doing in an outage.

4. "Why should you use BIGINT / BIGSERIAL for primary keys instead of INT / SERIAL?"

Because INT is a 32-bit signed integer with a maximum value of about 2.1 billion. That sounds like a lot until you realize that auto-increment sequences burn IDs on every rolled-back insert, every failed transaction, and every test run against the same sequence. Busy OLTP tables can chew through a billion IDs in a few years without ever hitting a billion rows. When the sequence overflows, every new insert fails and the fix is a painful ALTER COLUMN TYPE BIGINT that rewrites the table and every index on it — hours of downtime on a large table. The four bytes per row you save with INT are worthless compared to the migration pain. Start with BIGINT (or BIGINT GENERATED ALWAYS AS IDENTITY in modern Postgres) on day one and never think about it again. The same rule applies to MySQL: BIGINT UNSIGNED AUTO_INCREMENT.

5. "What are PostgreSQL schemas and how are they different from MySQL databases?"

A PostgreSQL schema is a namespace inside a database that holds tables, views, functions, and types. A single Postgres database can contain many schemas, and you reference objects with a qualified name like billing.invoices or let the search_path resolve unqualified names against a list of schemas. The default search path is "$user", public, which is why most tutorials just dump everything into the public schema. In MySQL, the terms DATABASE and SCHEMA are literal synonyms — CREATE DATABASE foo and CREATE SCHEMA foo do the exact same thing, and there is no second level of namespace inside a MySQL database. This means Postgres lets you organize a single database into logical groups (billing, identity, audit) with a single connection, while in MySQL you would need either a flat naming scheme (billing_invoices) or multiple connections to different databases. The practical upshot is that Postgres schemas are useful for multi-tenant isolation, module boundaries, and extension installation, whereas MySQL users typically reach for table-name prefixes instead.


Quick Reference — DDL Cheat Sheet

+---------------------------------------------------------------+
|           DDL COMMANDS CHEAT SHEET                            |
+---------------------------------------------------------------+
|                                                                |
|  CREATE DATABASE shop_api                                      |
|    WITH OWNER = admin ENCODING = 'UTF8';                       |
|                                                                |
|  CREATE SCHEMA IF NOT EXISTS billing;                          |
|                                                                |
|  CREATE TABLE IF NOT EXISTS users (                            |
|    id         BIGSERIAL PRIMARY KEY,                           |
|    email      TEXT NOT NULL UNIQUE,                            |
|    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()               |
|  );                                                            |
|                                                                |
|  ALTER TABLE users ADD COLUMN phone TEXT;                      |
|  ALTER TABLE users DROP COLUMN IF EXISTS phone;                |
|  ALTER TABLE users RENAME COLUMN full_name TO display_name;    |
|  ALTER TABLE users ALTER COLUMN id TYPE BIGINT;                |
|  ALTER TABLE users RENAME TO accounts;                         |
|                                                                |
|  DROP TABLE IF EXISTS sessions;          -- safe default       |
|  DROP TABLE IF EXISTS users CASCADE;     -- dangerous          |
|  DROP SCHEMA IF EXISTS billing CASCADE;                        |
|  DROP DATABASE IF EXISTS shop_api;                             |
|                                                                |
+---------------------------------------------------------------+

+---------------------------------------------------------------+
|           KEY RULES                                            |
+---------------------------------------------------------------+
|                                                                |
|  1. Use BIGINT / BIGSERIAL for primary keys, always            |
|  2. Use TIMESTAMPTZ for timestamps, never TIMESTAMP            |
|  3. Use TEXT in Postgres, not VARCHAR(n)                       |
|  4. snake_case tables and columns, plural tables               |
|  5. Avoid reserved words (user, order, group, type)            |
|  6. IF EXISTS / IF NOT EXISTS on every migration DDL           |
|  7. Audit dependencies before CASCADE                          |
|  8. Name your constraints and indexes explicitly               |
|  9. Every DDL change goes through a versioned migration tool   |
|  10. Test ALTER TABLE on a copy of prod data first             |
|                                                                |
+---------------------------------------------------------------+
CommandPurposeSafety KeywordGotcha
CREATE DATABASETop-level containerIF NOT EXISTSCannot run inside a transaction
CREATE SCHEMANamespace inside a databaseIF NOT EXISTSpublic schema permissions vary by version
CREATE TABLEDefine a tableIF NOT EXISTSChoose types carefully — hard to change later
ALTER TABLE ADD COLUMNAdd a columnIF NOT EXISTSVolatile defaults rewrite the table
ALTER TABLE DROP COLUMNRemove a columnIF EXISTSData is gone without a backup
ALTER TABLE ALTER TYPEChange column typen/aCan rewrite every row and index
ALTER TABLE RENAMERename column or tablen/aBreaks every dependent query instantly
DROP TABLEDelete a tableIF EXISTS + RESTRICTCASCADE silently drops dependents
DROP SCHEMADelete a namespaceIF EXISTS + CASCADETakes every table inside with it
DROP DATABASEDelete a databaseIF EXISTSMust disconnect all sessions first

Prev: Lesson 1.3 -- Constraints Next: Lesson 2.1 -- Primary Key


This is Lesson 1.4 of the Database Interview Prep Course -- 12 chapters, 58 lessons.

On this page