Database Interview Prep
Normalization

First Normal Form (1NF)

Atomic Values, No Repeating Groups

LinkedIn Hook

"Your users table has a phone_numbers column with comma-separated values. Finding everyone with a +44 number now requires LIKE '%+44%' and a prayer."

First Normal Form is the entry rule of relational design. Break it, and every query downstream gets uglier: filters become string searches, joins become regex parties, indexes stop helping, and counting "how many tags does this post have" turns into a LENGTH(tags) - LENGTH(REPLACE(tags, ',', '')) + 1 horror show.

The rule itself is brutally simple: one value per cell, no repeating groups, no hidden lists encoded inside strings. A phone column holds one phone number. A tags column holds one tag — or it does not exist, and tags live in their own table. There is no third option in textbook 1NF.

Then PostgreSQL showed up with native arrays and JSONB and made everyone uncomfortable. Is text[] a 1NF violation? Technically yes, by Codd's original definition. Practically, it depends — and in this lesson I will show you exactly when arrays and JSONB earn their keep, and when they are just CSV-in-disguise wearing a fancier hat.

In Lesson 10.2, I break down 1NF: what atomicity really means, how to spot violations, how to migrate a comma-separated mess into a child table, and the modern PostgreSQL exception debate.

Read the full lesson -> [link]

#PostgreSQL #DatabaseDesign #Normalization #1NF #SQL #BackendDevelopment #InterviewPrep


First Normal Form (1NF) thumbnail


What You'll Learn

  • What "atomic value" actually means in the relational model and why it matters
  • How to spot 1NF violations: comma-separated lists, repeating columns, embedded structures
  • Two ways to fix a 1NF violation: split into rows or split into a child table
  • How a 1NF violation breaks indexing, filtering, joining, and constraint enforcement
  • The PostgreSQL array and JSONB exception debate — when modern types are acceptable
  • MySQL differences: no native arrays, JSON column behavior, generated columns
  • Migration patterns to convert a comma-separated column into a proper child table

The Spice Jar Analogy — Why One Value Per Cell

Imagine your kitchen spice rack. Every jar has one label and holds one spice. The "cumin" jar holds cumin. The "paprika" jar holds paprika. When a recipe calls for two teaspoons of cumin, you grab the cumin jar — done. Counting how many spices you own is just counting jars. Throwing out the expired cinnamon means removing one jar. Replacing the salt brand means swapping one jar.

Now imagine a lazy cook who decides to save shelf space by putting all the red spices into one jar labeled "paprika, cayenne, chili powder, smoked paprika." When the recipe calls for cayenne, you have to dig through the mixture, eyeball it, and hope you got the right amount. Replacing the chili powder brand means dumping the whole jar and starting over. Counting how many red spices you own requires reading the label and parsing it. Adding "ancho" to the jar requires editing the label string.

That second jar is a relational table that violates First Normal Form. The cell holds a list, the column name lies about what is inside, and every operation on the data requires parsing a string the database cannot help you with. 1NF says: one ingredient per jar, one value per cell, period. It is the most basic rule in relational design, and it is the foundation every other normal form stands on.

+---------------------------------------------------------------+
|           VIOLATING 1NF (The Mess)                            |
+---------------------------------------------------------------+
|                                                                |
|  users                                                         |
|  +----+--------+-----------------------------+                 |
|  | id | name   | phone_numbers               |                 |
|  +----+--------+-----------------------------+                 |
|  | 1  | Alice  | +1-555-1111, +1-555-1112    |                 |
|  | 2  | Bob    | +44-20-7946-0958            |                 |
|  | 3  | Carol  | +1-555-3333, +1-555-3334,   |                 |
|  |    |        | +1-555-3335                 |                 |
|  +----+--------+-----------------------------+                 |
|                                                                |
|  Problems:                                                     |
|   - Cannot index individual phone numbers                      |
|   - "Find users with +44 number" needs LIKE '%+44%'            |
|   - Cannot enforce phone format with CHECK constraint          |
|   - Counting numbers per user requires string parsing          |
|   - Adding/removing one number rewrites the whole cell         |
|                                                                |
+---------------------------------------------------------------+

+---------------------------------------------------------------+
|           1NF COMPLIANT (Atomic Values)                       |
+---------------------------------------------------------------+
|                                                                |
|  users                  user_phones                            |
|  +----+--------+        +----+---------+----------------+      |
|  | id | name   |        | id | user_id | phone          |      |
|  +----+--------+        +----+---------+----------------+      |
|  | 1  | Alice  |        | 1  | 1       | +1-555-1111    |      |
|  | 2  | Bob    |        | 2  | 1       | +1-555-1112    |      |
|  | 3  | Carol  |        | 3  | 2       | +44-20-7946... |      |
|  +----+--------+        | 4  | 3       | +1-555-3333    |      |
|                         | 5  | 3       | +1-555-3334    |      |
|                         | 6  | 3       | +1-555-3335    |      |
|                         +----+---------+----------------+      |
|                                                                |
|  Wins:                                                         |
|   - Index on user_phones(phone) makes lookups O(log n)         |
|   - CHECK (phone ~ '^\+[0-9-]+$') enforces format              |
|   - COUNT(*) per user_id gives accurate count                  |
|   - Insert/delete one row to add/remove one number             |
|                                                                |
+---------------------------------------------------------------+

Napkin AI Visual Prompt: "Dark gradient (#0a0f1f -> #111a2e). Split comparison: LEFT side labeled 'Violates 1NF' shows a database table with one cell containing comma-separated values highlighted in red with a hot pink (#ff5c8a) X overlay. RIGHT side labeled '1NF Compliant' shows two clean tables linked by a cyan (#4fc3f7) foreign key arrow, each row holding a single atomic value. White monospace labels. Cyan arrow showing the migration: parse -> split -> insert."


What "Atomic" Actually Means

Edgar Codd, who invented the relational model in 1970, defined a relation as 1NF if every attribute value is atomic — meaning the database treats it as an indivisible unit. The DBMS does not look inside the value, does not parse it, and does not understand its internal structure. A VARCHAR(50) holding 'alice@example.com' is atomic from the database's point of view, even though humans know it has a local part and a domain part. The point is not whether you can decompose the value — it is whether the DBMS does.

A value violates 1NF when the database is forced to peek inside the cell to answer a query. The classic offenders:

  • Comma-separated lists in a string column. 'red,green,blue' is one VARCHAR to PostgreSQL but three values to your application. Every query that asks "which rows contain green" must do string matching the database cannot index well.
  • Repeating groups as numbered columns. phone1, phone2, phone3. This is the same sin in a different costume — you have hardcoded a maximum count, denied users a fourth phone, and made queries like "find any user with a +44 number" require checking every column with OR.
  • Hidden structure encoded in one column. A tags column holding 'JSON-like' text such as '{red,green,blue}' that the application parses. The database has no idea this is a list; it just sees a string.

The litmus test is straightforward: if your query needs to call SUBSTRING, SPLIT_PART, LIKE '%x%', or a regex to find rows that "contain" something, you are almost certainly violating 1NF. The relational engine is built around set operations on rows, not string slicing inside cells.


Example 1 — The Comma-Separated Tags Disaster (Violation)

Here is a posts table that stores tags as a comma-separated list. It looks innocent. Every query against it is painful.

-- VIOLATES 1NF: tags is a list crammed into one VARCHAR
CREATE TABLE posts_bad (
  id     SERIAL PRIMARY KEY,
  title  TEXT NOT NULL,
  tags   TEXT  -- "postgres,sql,performance" — a list pretending to be a value
);

INSERT INTO posts_bad (title, tags) VALUES
  ('Indexing Basics',     'postgres,sql,performance'),
  ('Vacuum Internals',    'postgres,maintenance'),
  ('SQL Joins Explained', 'sql,joins,beginner'),
  ('Query Plans',         'postgres,sql,performance,explain');

Now try to answer the natural question: "How many posts are tagged performance?"

-- The only way: string matching, which is fragile and slow
SELECT COUNT(*) FROM posts_bad
WHERE ',' || tags || ',' LIKE '%,performance,%';

--  count
-- -------
--      2
-- (1 row)

The leading and trailing commas are a hack to avoid matching 'high-performance' when you wanted 'performance'. A B-tree index on tags is useless for this query because the prefix is a wildcard. Want all distinct tags across the system? You need string_to_array or regexp_split_to_table to break the cell apart at query time, paying the cost on every read.

-- Counting distinct tags requires exploding every row
SELECT DISTINCT unnest(string_to_array(tags, ',')) AS tag
FROM posts_bad
ORDER BY tag;

--      tag
-- -------------
--  beginner
--  explain
--  joins
--  maintenance
--  performance
--  postgres
--  sql
-- (7 rows)

The killer detail: there is no constraint preventing 'postgres, sql' (with a stray space) from being inserted alongside 'postgres,sql'. The database cannot enforce uniqueness, format, or referential integrity on the tags inside the string. Every consumer must agree on the exact serialization, and every query must parse it.


Example 2 — Fix #1: Split Into Rows With a Child Table (1NF Compliant)

The textbook fix is to extract the list into its own table where each list element is a row. This is 1NF in its purest form.

-- 1NF COMPLIANT: one row per (post, tag) pair
CREATE TABLE posts (
  id    SERIAL PRIMARY KEY,
  title TEXT NOT NULL
);

CREATE TABLE tags (
  id   SERIAL PRIMARY KEY,
  name TEXT UNIQUE NOT NULL  -- enforced uniqueness, no duplicates possible
);

CREATE TABLE post_tags (
  post_id INT NOT NULL REFERENCES posts(id) ON DELETE CASCADE,
  tag_id  INT NOT NULL REFERENCES tags(id)  ON DELETE CASCADE,
  PRIMARY KEY (post_id, tag_id)  -- prevents duplicate tagging
);

CREATE INDEX idx_post_tags_tag ON post_tags(tag_id);

INSERT INTO posts (title) VALUES
  ('Indexing Basics'),
  ('Vacuum Internals'),
  ('SQL Joins Explained'),
  ('Query Plans');

INSERT INTO tags (name) VALUES
  ('postgres'), ('sql'), ('performance'),
  ('maintenance'), ('joins'), ('beginner'), ('explain');

INSERT INTO post_tags VALUES
  (1, 1), (1, 2), (1, 3),
  (2, 1), (2, 4),
  (3, 2), (3, 5), (3, 6),
  (4, 1), (4, 2), (4, 3), (4, 7);

The same "how many posts are tagged performance" question becomes a clean indexed join:

SELECT COUNT(*) AS post_count
FROM post_tags pt
JOIN tags t ON t.id = pt.tag_id
WHERE t.name = 'performance';

--  post_count
-- ------------
--           2
-- (1 row)

This query uses the unique index on tags.name to find the tag id, then the index on post_tags(tag_id) to count matching rows. Both lookups are logarithmic. There is no string parsing, no LIKE, and no risk of accidentally matching 'high-performance'. Want to rename a tag? UPDATE tags SET name = 'perf' WHERE name = 'performance' — one row changed, every post automatically reflects it. Try doing that with the comma-separated version without a regex.

The schema also expresses real constraints. The UNIQUE on tags.name makes duplicate tag names impossible. The composite primary key on post_tags(post_id, tag_id) makes tagging the same post twice with the same tag impossible. The FOREIGN KEY references make orphan tags impossible. Each constraint is one line of DDL, enforced by the database, and free of charge at query time. None of this is achievable when the tags live inside a string.


Example 3 — Fix #2: The PostgreSQL Array Shortcut (Pragmatic, Debatable)

PostgreSQL supports native array types — text[], int[], etc. — and they are first-class citizens with their own operators, GIN index support, and dedicated query syntax. Storing tags as a text[] column looks like this:

-- PRAGMATIC: native array column, GIN-indexed for containment queries
CREATE TABLE posts_array (
  id    SERIAL PRIMARY KEY,
  title TEXT NOT NULL,
  tags  TEXT[] NOT NULL DEFAULT '{}'
);

CREATE INDEX idx_posts_array_tags ON posts_array USING GIN (tags);

INSERT INTO posts_array (title, tags) VALUES
  ('Indexing Basics',     ARRAY['postgres', 'sql', 'performance']),
  ('Vacuum Internals',    ARRAY['postgres', 'maintenance']),
  ('SQL Joins Explained', ARRAY['sql', 'joins', 'beginner']),
  ('Query Plans',         ARRAY['postgres', 'sql', 'performance', 'explain']);

-- Containment query, indexed by GIN, fast and clean
SELECT COUNT(*) FROM posts_array WHERE tags @> ARRAY['performance'];

--  count
-- -------
--      2
-- (1 row)

The query is fast, the syntax is concise, and the GIN index makes containment lookups efficient even on millions of rows. So is this a 1NF violation or not?

By Codd's original 1970 definition: yes. A cell holding a list of values is not atomic. The database can look inside, the values are individually addressable, and the column type explicitly admits a collection.

By practical modern interpretation: it depends. The relational model is more nuanced now. If the array elements are:

  • Truly attributes of the row itself (a list of tags that only describe this post and have no independent existence)
  • Never need their own constraints, foreign keys, or relationships
  • Always accessed together, never joined to from elsewhere
  • Bounded in size (a handful of tags, not thousands)

...then a text[] column is a defensible, idiomatic PostgreSQL choice. You lose the ability to put a foreign key on each tag, you cannot easily enforce that each tag exists in a master tags table, and you cannot attach metadata to a tag-post relationship (like "who applied this tag and when"). Those are real costs.

If you ever need any of those things, you must migrate to a child table. The rule of thumb most experienced PostgreSQL teams use: start with a child table; reach for text[] only when you have a concrete reason and have weighed the lost constraints. JSONB has the same trade-off in even stronger form — it is great for genuinely schemaless or semi-structured data (audit logs, third-party API payloads), but it is a 1NF escape hatch when used as a dumping ground for what should be relational columns.


Example 4 — Migrating a Legacy CSV Column to a Child Table

You have inherited the bad schema from Example 1 and need to migrate without losing data. Here is the standard PostgreSQL pattern using regexp_split_to_table:

-- Starting state: posts_bad with comma-separated tags column

-- 1. Create the new normalized tables
CREATE TABLE tags (
  id   SERIAL PRIMARY KEY,
  name TEXT UNIQUE NOT NULL
);

CREATE TABLE post_tags (
  post_id INT NOT NULL REFERENCES posts_bad(id) ON DELETE CASCADE,
  tag_id  INT NOT NULL REFERENCES tags(id)      ON DELETE CASCADE,
  PRIMARY KEY (post_id, tag_id)
);

-- 2. Populate tags by exploding every CSV cell and deduplicating
INSERT INTO tags (name)
SELECT DISTINCT trim(tag) AS name
FROM posts_bad,
     regexp_split_to_table(tags, ',') AS tag
WHERE trim(tag) <> ''
ON CONFLICT (name) DO NOTHING;

-- 3. Populate the join table from the same explosion, mapped to tag ids
INSERT INTO post_tags (post_id, tag_id)
SELECT p.id, t.id
FROM posts_bad p,
     regexp_split_to_table(p.tags, ',') AS tag_name
JOIN tags t ON t.name = trim(tag_name)
ON CONFLICT DO NOTHING;

-- 4. Verify the migration counted right
SELECT
  (SELECT COUNT(*) FROM tags)      AS distinct_tags,
  (SELECT COUNT(*) FROM post_tags) AS total_assignments;

--  distinct_tags | total_assignments
-- ---------------+-------------------
--              7 |                12
-- (1 row)

-- 5. Drop the old column once the application is reading from the new tables
ALTER TABLE posts_bad DROP COLUMN tags;
ALTER TABLE posts_bad RENAME TO posts;

regexp_split_to_table is the PostgreSQL primitive for converting a delimited string into rows — exactly the operation 1NF compliance requires. The ON CONFLICT clauses make the script idempotent so you can run it twice without errors. In production, wrap it in a transaction, run it on a copy of the data first, and double-check the row counts before dropping the original column.


PostgreSQL vs MySQL Notes

PostgreSQL gives you native arrays (text[], int[]), JSONB with rich operators and GIN indexing, and regexp_split_to_table for CSV explosion. The temptation to violate 1NF "just a little" with a text[] column is real and sometimes justified.

MySQL has no native array type. The closest equivalent is the JSON column type (since MySQL 5.7), which is functional but more cumbersome than PostgreSQL's JSONB. MySQL teams more often use the textbook child-table approach because the language does not offer a tempting shortcut. To index inside a JSON column, MySQL uses generated columns with secondary indexes — ALTER TABLE posts ADD tag_count INT GENERATED ALWAYS AS (JSON_LENGTH(tags)) STORED, ADD INDEX (tag_count). It works, but it is more ceremony than PostgreSQL's GIN-on-array.

The bigger picture: 1NF is dialect-neutral. The textbook definition predates both PostgreSQL arrays and JSON columns by decades. Whether your specific engine offers an escape hatch does not change the underlying trade-offs. Constraints, foreign keys, joins, and indexes all work better when values are atomic. Pick the escape hatch only when you have measured the alternative and decided the trade is worth it.


Common Mistakes

1. Treating the comma-separated column as "good enough" because the app handles parsing. The application is not the only consumer. Reporting tools, analytics queries, future developers, and the database optimizer itself all see a string. The moment someone needs to filter, join, or aggregate on the list elements, you are paying tax forever. Fix it the first time you write a LIKE '%,x,%' query — that is the universe telling you the schema is wrong.

2. Numbered columns like phone1, phone2, phone3. This is 1NF violation in disguise. You have hardcoded a maximum count, made queries use ugly OR chains, and forced every count of "how many phones does Alice have" into a CASE expression. The fix is identical: a child user_phones table with one row per phone number.

3. Storing JSON inside a TEXT column instead of a JSONB column (in PostgreSQL). If you have decided to denormalize into a document, at least use the right type. JSONB is binary, indexable, validated at insert time, and supports operators like @> and ?. TEXT containing JSON is opaque to the database and the worst of both worlds — you lost relational benefits and you did not gain document-store benefits.

4. Confusing 1NF with "do not use any composite types ever." 1NF is about atomic values, not about whether you happen to store a date or an IP address or a UUID. A DATE column is atomic to the database even though humans see year, month, and day inside it. The test is whether the database treats the value as one indivisible unit at query time, not whether you can subdivide it conceptually.

5. Migrating to a child table but leaving the CSV column "for compatibility." Now you have two sources of truth that can drift. The new table is the authoritative store the moment the migration runs — drop the old column once the application is updated, or you will spend the next year fielding "why does the report not match the API" tickets.


Interview Questions

1. "What does 'atomic' mean in the context of First Normal Form, and why is it important?"

Atomic means the database treats the value as a single indivisible unit. The DBMS does not look inside the value, does not parse it, and does not understand its internal structure. A VARCHAR holding 'alice@example.com' is atomic even though humans know it has a local part and a domain — what matters is that the database engine treats the entire string as one value at query time. A column violates atomicity when queries are forced to peek inside the cell to find what they want, typically using SUBSTRING, LIKE '%x%', SPLIT_PART, or regex. Atomicity matters because the entire relational query engine — joins, indexes, constraints, set operations — is built around set operations on rows of atomic values. The moment a cell holds a list, indexes stop helping, constraints cannot enforce element-level rules, foreign keys are impossible, and every query that touches the list pays a parsing cost. 1NF is the foundation that makes the rest of the relational model work.

2. "How would you fix a users.phone_numbers column that stores comma-separated phone numbers?"

I would extract the phone numbers into a child table, typically named user_phones, with columns id, user_id, phone, and possibly metadata like phone_type ('mobile', 'work') and is_primary. The user_id would be a foreign key referencing users(id) with ON DELETE CASCADE so deleting a user removes their phone numbers automatically. I would add a CHECK constraint on the phone column to enforce a format like E.164, and a UNIQUE constraint on (user_id, phone) to prevent duplicates per user. Then I would migrate the existing data using regexp_split_to_table(phone_numbers, ',') to explode each row's CSV into multiple rows in the new table, trim whitespace, and insert them with the original user_id. Once the application is reading from the new table, I would drop the original phone_numbers column. The result: indexable, constraint-enforced, joinable phone numbers with proper referential integrity, and queries like "find all users with a +44 number" become a simple equality filter on an indexed column.

3. "Are PostgreSQL array columns a violation of 1NF? Justify your answer."

By Codd's original 1970 definition, yes — an array column holds a collection of values, which is not atomic. The database can address individual elements, the type system explicitly admits a list, and operators like @> work on the contents. By that strict reading, text[] is not 1NF. But the modern relational community is more nuanced. If the array elements have no independent existence outside the row, never need their own constraints or foreign keys, are always accessed together, and are bounded in size, then a text[] column is a defensible idiomatic PostgreSQL choice — and the GIN index on the array makes containment queries fast. The trade-off is real: you lose the ability to put a foreign key on each element, you cannot enforce that each element exists in a master table, and you cannot attach metadata to the element-row relationship. My rule of thumb is to start with a child table by default and reach for text[] only when there is a concrete reason and the lost constraints are acceptable. JSONB has the same trade-off in even stronger form. The question is not "is it technically 1NF" — it is "have I weighed the cost of giving up relational guarantees on this data."

4. "What is wrong with having columns named phone1, phone2, phone3 in a users table?"

It is a 1NF violation in disguise — a repeating group spread across columns instead of crammed into one cell. The first problem is hardcoded capacity: a user with four phone numbers cannot be represented without a schema change. The second problem is query ugliness: "find any user with a +44 number" becomes WHERE phone1 LIKE '+44%' OR phone2 LIKE '+44%' OR phone3 LIKE '+44%', which is unmaintainable and impossible to index well. The third problem is that NULL columns waste space and confuse aggregations — counting phone numbers per user requires a CASE expression for every column. The fourth problem is that adding a phone in the application means finding the first NULL slot, which is racy under concurrent updates. The fix is the same as for comma-separated lists: a user_phones child table with one row per phone number, a foreign key to users, and an index on the phone column. Now any number of phones is supported, every phone is independently queryable, and the schema does not need to change to support a user with five phones.

5. "When is JSONB an acceptable design choice in PostgreSQL, and when is it a 1NF anti-pattern?"

JSONB is acceptable when the data is genuinely schemaless or semi-structured and you do not need relational guarantees on the inner fields. Good fits: third-party API payloads you store verbatim for audit purposes, user-defined custom fields where the schema varies per tenant, event logs with heterogeneous attributes, and configuration blobs that are read and written as a whole document. In those cases JSONB gives you indexability via GIN, query operators like @> and ?, and validation that the data is at least syntactically valid JSON. JSONB becomes an anti-pattern when it is used as a dumping ground for fields that should have been proper columns — for example, storing a user's email, phone, and address inside a JSONB contact column. You lose typed constraints, foreign keys, indexes on individual fields without ceremony, and the ability for tools to discover the schema. The litmus test: if you find yourself querying "find all users where data->>'email' = 'x'" frequently, those fields wanted to be columns. JSONB is for genuinely document-shaped data, not for hiding a relational model behind a single column.


Quick Reference — First Normal Form Cheat Sheet

+---------------------------------------------------------------+
|           1NF CHEAT SHEET                                     |
+---------------------------------------------------------------+
|                                                                |
|  THE RULE:                                                     |
|  - One value per cell                                          |
|  - No repeating groups (phone1, phone2, phone3)                |
|  - No comma-separated lists in a string column                 |
|  - No hidden structure encoded inside one column               |
|                                                                |
|  HOW TO SPOT A VIOLATION:                                      |
|  - Query uses LIKE '%x%' to find list elements                 |
|  - Query uses SPLIT_PART, SUBSTRING, or regex on a column      |
|  - Numbered columns: thing1, thing2, thing3                    |
|  - Counting elements requires string parsing                   |
|  - You cannot put a CHECK or FK on the inner values            |
|                                                                |
|  TWO FIXES:                                                    |
|  1. Split into rows: child table with one row per element      |
|  2. Split into columns: only when count is fixed and small     |
|                                                                |
|  POSTGRES ESCAPE HATCHES (use with care):                      |
|  - text[] + GIN index for bounded, dependent lists             |
|  - JSONB + GIN for genuinely schemaless documents              |
|                                                                |
+---------------------------------------------------------------+

+---------------------------------------------------------------+
|           VIOLATION -> FIX QUICK MAP                          |
+---------------------------------------------------------------+
|                                                                |
|  'red,green,blue' in TEXT  -> child table OR text[]            |
|  phone1, phone2, phone3    -> user_phones child table          |
|  JSON inside a TEXT column -> JSONB column (or child table)    |
|  'first last' in name      -> first_name + last_name columns   |
|  'k1=v1;k2=v2' settings    -> settings child table OR JSONB    |
|                                                                |
+---------------------------------------------------------------+
ConcernViolates 1NF1NF Compliant
Multiple phones per userphone_numbers TEXT (CSV)user_phones child table
Tags on a posttags TEXT (CSV)post_tags join table or text[]
User addressesaddress TEXTstreet, city, zip columns
Custom user fieldsextras TEXT (key=val)JSONB column (justified)
Survey answersanswers TEXT (CSV)survey_answers child table
Indexable lookupsLIKE '%x%' scanB-tree or GIN index
Element constraintsNone possibleCHECK, UNIQUE, FK
Counting elementsString parsingCOUNT(*) on join

Prev: Lesson 10.1 -- What Is Normalization? Next: Lesson 10.3 -- Second Normal Form


This is Lesson 10.2 of the Database Interview Prep Course -- 12 chapters, 58 lessons.

On this page