Database Interview Prep
Filtering, Sorting and Aggregation

WHERE Clause

Filtering Rows with Predicates and Three-Valued Logic

LinkedIn Hook

"Your query returned zero rows. The data is definitely there. You checked twice. And yet WHERE status = NULL refuses to match a single row."

Nine out of ten SQL bugs in a junior engineer's first year come from the WHERE clause. Not from missing indexes. Not from exotic joins. From = vs IS, from AND binding tighter than OR, from NOT IN (subquery) silently eating every row because one value in the subquery happened to be NULL.

The WHERE clause looks like a boolean filter, but it is not boolean -- it is three-valued. Every predicate evaluates to TRUE, FALSE, or UNKNOWN, and only rows that evaluate to TRUE make it into your result set. UNKNOWN is not FALSE. NULL = NULL is not TRUE. NOT (UNKNOWN) is still UNKNOWN. Once you internalize this, half of your "why is this query broken" moments disappear.

In Lesson 4.1, I walk through comparison operators, AND/OR/NOT and their precedence trap, IN vs chained OR, the NOT IN + NULL ambush, BETWEEN's inclusive ends, and the IS NULL rule that trips up everyone who learned = first.

Read the full lesson -> [link]

#SQL #Databases #PostgreSQL #BackendDevelopment #InterviewPrep #ThreeValuedLogic


WHERE Clause thumbnail


What You'll Learn

  • The six comparison operators (=, <>, <, >, <=, >=) and what they return when operands are NULL
  • How AND, OR, and NOT combine predicates -- and why AND binds tighter than OR causes silent bugs
  • Parentheses discipline: when to group explicitly even if the language does not require it
  • IN (list) as sugar for chained OR, and how it differs in readability and performance
  • The NOT IN + NULL ambush: why one NULL in a subquery returns zero rows
  • BETWEEN a AND b is inclusive on both ends -- and the off-by-one trap when people assume otherwise
  • IS NULL / IS NOT NULL and why col = NULL never matches anything
  • Three-valued logic (TRUE/FALSE/UNKNOWN) and the truth tables for AND, OR, NOT
  • The difference between WHERE and ON in outer joins -- a common interview follow-up

The Search Form Analogy -- Why WHERE is a Filter, Not a Loop

Imagine a real estate search form. You tick "3 bedrooms," set the price range to "$400k-$600k," pick "San Francisco" from a city dropdown, and leave the "has garage" checkbox untouched. You click Search. The site does not loop through listings one at a time in your browser -- it sends a single filter specification to the server, and the server returns only the listings that match every filter you set. The empty "has garage" checkbox is not "no" and not "yes" -- it is unknown, and the server treats it as "don't filter on this."

That is exactly what a WHERE clause does. The table is the full listing catalog. Each predicate (bedrooms = 3, price BETWEEN 400000 AND 600000, city = 'San Francisco') is one checkbox on the form. AND means "every box must pass"; OR means "at least one box must pass"; NOT flips the expectation. And NULL is the empty checkbox -- not a value, but an absence of a value, which is why comparing anything to NULL always gives back "unknown" instead of "yes" or "no."

+---------------------------------------------------------------+
|           HOW WHERE ACTUALLY RUNS                             |
+---------------------------------------------------------------+
|                                                                |
|  For each row in the table:                                    |
|    evaluate the predicate -> returns TRUE, FALSE, or UNKNOWN   |
|    if TRUE    -> include in result                             |
|    if FALSE   -> skip                                          |
|    if UNKNOWN -> skip (treated like FALSE for filtering)       |
|                                                                |
|  Key insight: UNKNOWN is NOT the same as FALSE in logic,       |
|  but the WHERE clause throws it away just like FALSE.          |
|  This asymmetry is the source of most NULL-related bugs.       |
|                                                                |
+---------------------------------------------------------------+

Napkin AI Visual Prompt: "Dark gradient (#0a0f1f -> #111a2e). A large search form on the left with four filter checkboxes in white monospace: bedrooms, price, city, garage. An arrow labeled 'WHERE' in sky blue (#4fc3f7) pointing to a database table on the right. Rows flowing through the arrow are colored: green checkmark rows pass, rose (#ff5c8a) X rows are filtered out, grey question-mark rows labeled NULL also filtered out. Amber note at bottom: 'Only TRUE passes. FALSE and UNKNOWN are both dropped.' White monospace labels throughout."


Setting Up the Sample Table

Every example in this lesson runs against the same employees table. Copy and paste this block into any PostgreSQL instance to follow along.

-- Create the sample table
CREATE TABLE employees (
    id           SERIAL PRIMARY KEY,
    name         TEXT NOT NULL,
    department   TEXT,              -- nullable on purpose
    salary       NUMERIC(10, 2),
    hired_on     DATE,
    manager_id   INTEGER            -- NULL for the CEO
);

INSERT INTO employees (name, department, salary, hired_on, manager_id) VALUES
    ('Ada Lovelace',     'Engineering', 145000, '2019-03-12', NULL),
    ('Linus Torvalds',   'Engineering', 160000, '2018-11-04', 1),
    ('Grace Hopper',     'Engineering',  95000, '2021-06-01', 1),
    ('Margaret Hamilton','Research',    130000, '2020-01-15', 1),
    ('Katherine Johnson','Research',     88000, '2022-09-20', 4),
    ('Alan Turing',      NULL,          110000, '2020-07-07', 1),  -- dept NULL
    ('Dennis Ritchie',   'Engineering',  72000,  NULL,        2);  -- hired NULL

The two deliberate NULLs (Alan Turing's department, Dennis Ritchie's hire date) are the stars of every three-valued-logic example below.


Comparison Operators -- The Six Core Predicates

SQL has exactly six comparison operators that work on ordered types (numbers, dates, strings). They are the building blocks of every WHERE clause.

-- Equals
SELECT name, salary FROM employees WHERE salary = 95000;
-- Output:
--  name          | salary
-- ---------------+--------
--  Grace Hopper  | 95000

-- Not equals (two spellings: <> is SQL standard, != is widely supported)
SELECT name, department FROM employees WHERE department <> 'Engineering';
-- Output:
--  name              | department
-- -------------------+-------------
--  Margaret Hamilton | Research
--  Katherine Johnson | Research
-- NOTE: Alan Turing (department NULL) is NOT returned.
-- 'NULL <> Engineering' evaluates to UNKNOWN, not TRUE.

-- Less than / greater than
SELECT name, salary FROM employees WHERE salary < 100000;
-- Output:
--  name              | salary
-- -------------------+--------
--  Grace Hopper      | 95000
--  Katherine Johnson | 88000
--  Dennis Ritchie    | 72000

-- Less-or-equal / greater-or-equal
SELECT name, salary FROM employees WHERE salary >= 130000;
-- Output:
--  name              | salary
-- -------------------+--------
--  Ada Lovelace      | 145000
--  Linus Torvalds    | 160000
--  Margaret Hamilton | 130000
OperatorMeaningResult with NULL operand
=EqualUNKNOWN
<>Not equal (standard)UNKNOWN
!=Not equal (synonym)UNKNOWN
<Less thanUNKNOWN
>Greater thanUNKNOWN
<=Less than or equalUNKNOWN
>=Greater than or equalUNKNOWN

The rule that matters: any comparison operator with a NULL operand returns UNKNOWN, and the WHERE clause drops UNKNOWN rows the same way it drops FALSE rows. That is why department <> 'Engineering' did not return Alan Turing even though his department is clearly not 'Engineering'.


AND / OR / NOT -- Combining Predicates

Boolean connectives combine simple comparisons into compound filters. AND requires both sides to pass, OR requires at least one side to pass, NOT flips the truth value.

-- AND: both conditions must be TRUE
SELECT name, department, salary
FROM employees
WHERE department = 'Engineering' AND salary >= 100000;
-- Output:
--  name            | department  | salary
-- -----------------+-------------+--------
--  Ada Lovelace    | Engineering | 145000
--  Linus Torvalds  | Engineering | 160000

-- OR: at least one condition must be TRUE
SELECT name, department
FROM employees
WHERE department = 'Research' OR salary > 150000;
-- Output:
--  name              | department
-- -------------------+-------------
--  Linus Torvalds    | Engineering
--  Margaret Hamilton | Research
--  Katherine Johnson | Research

-- NOT: flip the truth value (with NULL trap)
SELECT name, department
FROM employees
WHERE NOT (department = 'Engineering');
-- Output:
--  name              | department
-- -------------------+-------------
--  Margaret Hamilton | Research
--  Katherine Johnson | Research
-- Again, Alan Turing is missing: NOT (NULL = 'Engineering')
-- = NOT UNKNOWN = UNKNOWN, which the WHERE clause drops.

Operator Precedence -- The AND/OR Bug That Everyone Writes Once

NOT binds tightest, then AND, then OR. This is the same precedence as English arithmetic's * before +. It means that A OR B AND C is silently parsed as A OR (B AND C), not (A OR B) AND C. If you meant the second, and you did not type the parentheses, your query is wrong and the database will not warn you.

-- Intended: "Engineering or Research employees who earn at least 100k"
-- Buggy (missing parentheses):
SELECT name, department, salary
FROM employees
WHERE department = 'Engineering' OR department = 'Research' AND salary >= 100000;
-- Parsed as:
--   department = 'Engineering' OR (department = 'Research' AND salary >= 100000)
-- Output (WRONG -- includes low-paid engineers):
--  name              | department  | salary
-- -------------------+-------------+--------
--  Ada Lovelace      | Engineering | 145000
--  Linus Torvalds    | Engineering | 160000
--  Grace Hopper      | Engineering |  95000   <- should have been filtered
--  Margaret Hamilton | Research    | 130000
--  Dennis Ritchie    | Engineering |  72000   <- should have been filtered

-- Correct (explicit parentheses):
SELECT name, department, salary
FROM employees
WHERE (department = 'Engineering' OR department = 'Research')
  AND salary >= 100000;
-- Output:
--  name              | department  | salary
-- -------------------+-------------+--------
--  Ada Lovelace      | Engineering | 145000
--  Linus Torvalds    | Engineering | 160000
--  Margaret Hamilton | Research    | 130000

Rule of thumb: whenever a WHERE clause mixes AND and OR, wrap the OR group in parentheses even if precedence would technically give you the right answer. It costs nothing, it documents intent, and it immunizes the query against someone editing it later.


IN vs Chained OR

IN (v1, v2, v3) is shorthand for col = v1 OR col = v2 OR col = v3. Readability and maintenance are the main reasons to prefer it; query plans are usually identical.

-- Chained OR (verbose)
SELECT name, department FROM employees
WHERE department = 'Engineering'
   OR department = 'Research'
   OR department = 'Finance';

-- Equivalent IN clause (preferred)
SELECT name, department FROM employees
WHERE department IN ('Engineering', 'Research', 'Finance');
-- Output:
--  name              | department
-- -------------------+-------------
--  Ada Lovelace      | Engineering
--  Linus Torvalds    | Engineering
--  Grace Hopper      | Engineering
--  Margaret Hamilton | Research
--  Katherine Johnson | Research

IN also works with subqueries: WHERE department IN (SELECT name FROM departments WHERE active). This is where the next trap lives.


The NOT IN + NULL Ambush

NOT IN is where three-valued logic quietly eats your query. If any value in the list (or subquery) is NULL, NOT IN returns no rows at all, even if there are obviously matching rows in the table.

-- This works as expected: returns non-engineering employees
SELECT name, department FROM employees
WHERE department NOT IN ('Engineering');
-- Output:
--  name              | department
-- -------------------+-------------
--  Margaret Hamilton | Research
--  Katherine Johnson | Research
-- (Alan Turing with dept=NULL is still missing -- three-valued logic.)

-- Now watch what happens when the LIST itself contains a NULL:
SELECT name, department FROM employees
WHERE department NOT IN ('Engineering', NULL);
-- Output: (empty)
--
-- Why? NOT IN expands to:
--   department <> 'Engineering' AND department <> NULL
-- The second comparison is ALWAYS UNKNOWN (nothing equals NULL),
-- so the whole AND is at best UNKNOWN, never TRUE.
-- Zero rows satisfy an always-UNKNOWN predicate.

-- Safe alternative: NOT EXISTS, which handles NULL correctly
SELECT e.name, e.department FROM employees e
WHERE NOT EXISTS (
    SELECT 1 FROM (VALUES ('Engineering')) AS excluded(dept)
    WHERE excluded.dept = e.department
);

The rule: if you write NOT IN (subquery), you must either prove the subquery can never return NULL (column is NOT NULL in the schema) or rewrite as NOT EXISTS. This bug has taken down production queries for years -- one NULL sneaks into a reference table and a nightly report silently returns zero rows.


BETWEEN -- Inclusive on Both Ends

col BETWEEN a AND b is equivalent to col >= a AND col <= b. Both endpoints are inclusive. People who assume it is half-open (like Python's range) write off-by-one bugs constantly.

-- Salaries between 90k and 130k, inclusive
SELECT name, salary FROM employees
WHERE salary BETWEEN 90000 AND 130000;
-- Output:
--  name              | salary
-- -------------------+--------
--  Grace Hopper      |  95000
--  Margaret Hamilton | 130000   <- endpoint included
--  Alan Turing       | 110000

-- Equivalent expansion:
SELECT name, salary FROM employees
WHERE salary >= 90000 AND salary <= 130000;

-- Date ranges: BE CAREFUL with timestamps near midnight
SELECT name, hired_on FROM employees
WHERE hired_on BETWEEN '2020-01-01' AND '2021-12-31';
-- Output:
--  name              | hired_on
-- -------------------+------------
--  Margaret Hamilton | 2020-01-15
--  Alan Turing       | 2020-07-07
--  Grace Hopper      | 2021-06-01

Warning for TIMESTAMP columns: BETWEEN '2023-01-01' AND '2023-01-31' misses any row from January 31st after 00:00:00. Use hired_on >= '2023-01-01' AND hired_on < '2023-02-01' (half-open) for timestamp ranges. BETWEEN is safer for pure DATE columns.


IS NULL / IS NOT NULL -- The Only Way to Test for NULL

NULL is not a value -- it is a marker meaning "no value," and no comparison operator returns TRUE when either side is NULL. To test whether a column is NULL, you must use the IS NULL or IS NOT NULL predicate, which returns true boolean TRUE/FALSE (never UNKNOWN).

-- WRONG: returns zero rows, not "employees without a department"
SELECT name FROM employees WHERE department = NULL;
-- Output: (empty)
-- Reason: NULL = NULL evaluates to UNKNOWN, which WHERE drops.

-- RIGHT: find employees with no department
SELECT name FROM employees WHERE department IS NULL;
-- Output:
--  name
-- -------------
--  Alan Turing

-- The negation: employees who HAVE a department
SELECT name, department FROM employees WHERE department IS NOT NULL;
-- Output:
--  name              | department
-- -------------------+-------------
--  Ada Lovelace      | Engineering
--  Linus Torvalds    | Engineering
--  Grace Hopper      | Engineering
--  Margaret Hamilton | Research
--  Katherine Johnson | Research
--  Dennis Ritchie    | Engineering

-- Combine with other predicates to handle NULLs explicitly
SELECT name, department FROM employees
WHERE department <> 'Engineering' OR department IS NULL;
-- Output:
--  name              | department
-- -------------------+-------------
--  Margaret Hamilton | Research
--  Katherine Johnson | Research
--  Alan Turing       | (null)

The last query is the idiom for "everyone except engineers," correctly including the row where the department is unknown. Without the explicit OR department IS NULL, three-valued logic would have silently omitted Alan Turing.


Three-Valued Logic -- Truth Tables

SQL's logic is not two-valued (TRUE/FALSE) but three-valued (TRUE/FALSE/UNKNOWN). UNKNOWN is what every comparison with NULL produces, and it propagates through AND, OR, and NOT according to these tables. Memorize them -- interviewers love to ask about them, and they explain every surprising WHERE-clause result.

+---------------------------------------------------------------+
|           AND truth table                                    |
+---------------------------------------------------------------+
|                                                                |
|     AND    | TRUE     | FALSE    | UNKNOWN                     |
|   ---------+----------+----------+----------                   |
|    TRUE    | TRUE     | FALSE    | UNKNOWN                     |
|    FALSE   | FALSE    | FALSE    | FALSE                       |
|    UNKNOWN | UNKNOWN  | FALSE    | UNKNOWN                     |
|                                                                |
|   Rule: FALSE always wins in AND (short-circuit to FALSE).     |
|                                                                |
+---------------------------------------------------------------+

+---------------------------------------------------------------+
|           OR truth table                                     |
+---------------------------------------------------------------+
|                                                                |
|     OR     | TRUE     | FALSE    | UNKNOWN                     |
|   ---------+----------+----------+----------                   |
|    TRUE    | TRUE     | TRUE     | TRUE                        |
|    FALSE   | TRUE     | FALSE    | UNKNOWN                     |
|    UNKNOWN | TRUE     | UNKNOWN  | UNKNOWN                     |
|                                                                |
|   Rule: TRUE always wins in OR (short-circuit to TRUE).        |
|                                                                |
+---------------------------------------------------------------+

+---------------------------------------------------------------+
|           NOT truth table                                    |
+---------------------------------------------------------------+
|                                                                |
|     NOT     | Result                                           |
|   ----------+--------                                          |
|    TRUE     | FALSE                                            |
|    FALSE    | TRUE                                             |
|    UNKNOWN  | UNKNOWN                                          |
|                                                                |
|   Rule: NOT of UNKNOWN is still UNKNOWN -- this is why         |
|   "NOT (col = 5)" does NOT include rows where col IS NULL.     |
|                                                                |
+---------------------------------------------------------------+

The two load-bearing insights: (1) FALSE AND anything is always FALSE, so a missing operand on one side can still short-circuit the whole expression; and (2) NOT UNKNOWN is still UNKNOWN, which is why negating a predicate over a nullable column rarely does what a naive reader expects.


WHERE vs ON in Joins -- A Quick Note

A common interview follow-up: what is the difference between a predicate in the WHERE clause and one in the ON clause of an outer join? For inner joins they are equivalent, but for LEFT JOIN / RIGHT JOIN they behave very differently. A predicate in ON is applied before the outer join fills in NULLs for the non-matching side, so non-matches are preserved. A predicate in WHERE is applied after the outer join, so filtering on the nullable side of a LEFT JOIN in WHERE silently converts it into an inner join. The short version: "filters on the outer side belong in ON; filters on the driving side belong in WHERE." This is covered in depth in Chapter 6 (Joins), but it is worth knowing now -- interviewers love to combine it with a three-valued-logic question.


Common Mistakes

1. Writing col = NULL instead of col IS NULL. This is the single most common SQL bug. col = NULL always evaluates to UNKNOWN, so the WHERE clause returns zero rows regardless of whether the column contains NULL. The only way to test for the absence of a value is IS NULL (or IS NOT NULL for the negation). Some dialects offer IS DISTINCT FROM / IS NOT DISTINCT FROM as a NULL-safe equality operator, but the classic form is IS NULL. If you ever see WHERE x = NULL in a code review, it is almost certainly a bug.

2. Forgetting that AND binds tighter than OR. WHERE a = 1 OR a = 2 AND b = 3 parses as a = 1 OR (a = 2 AND b = 3), not (a = 1 OR a = 2) AND b = 3. The buggy version returns every row where a = 1, regardless of b. Always wrap OR groups in parentheses when mixing connectives -- even if precedence technically gives the right answer, explicit grouping survives edits and code reviews. Treat parentheses in WHERE clauses like semicolons in JavaScript: always write them.

3. NOT IN (subquery) when the subquery can return NULL. NOT IN expands to a chain of <> comparisons joined by AND. If any value in the list is NULL, one of those comparisons becomes UNKNOWN, the whole expression is at best UNKNOWN, and zero rows match. The fix is either to guarantee the subquery's column is NOT NULL in the schema, filter NULLs out of the subquery with a WHERE ... IS NOT NULL, or rewrite the predicate using NOT EXISTS, which does not suffer from this problem because it works row-by-row rather than expanding to AND-ed comparisons.

4. Assuming BETWEEN is half-open (exclusive on the upper end). BETWEEN a AND b is inclusive on both ends -- it is col >= a AND col <= b. Developers coming from Python (range(a, b) is half-open) or C (for (i = a; i < b; i++)) routinely assume the upper bound is excluded, and write BETWEEN 0 AND 100 when they mean "0 to 99." The fix is to know the rule. For timestamp ranges, prefer the explicit half-open form ts >= '2023-01-01' AND ts < '2023-02-01' so you do not miss rows from the last day of the range that happen to be after 00:00:00.

5. Case sensitivity in string comparisons. In PostgreSQL, TEXT and VARCHAR comparisons are case-sensitive by default, so WHERE department = 'engineering' will not match 'Engineering'. Developers who test with capitalized data and deploy against lowercased production data get burned routinely. The fixes: normalize at write time (store everything lowercase, uppercase at display), normalize at read time (WHERE LOWER(department) = 'engineering' -- but beware, this defeats most indexes), or use case-insensitive types (CITEXT in PostgreSQL, or COLLATE clauses). MySQL's default collation is case-insensitive, which surprises developers coming the other way. Know your engine.


Interview Questions

1. "Why does SELECT * FROM t WHERE x <> 5 not return rows where x IS NULL, even though NULL is clearly not equal to 5?"

Because NULL in SQL does not mean "a specific value we don't know yet" -- it means "no value," and any comparison operator with a NULL operand returns UNKNOWN, not TRUE or FALSE. When the database evaluates x <> 5 on a row where x IS NULL, it computes NULL <> 5, which is UNKNOWN. The WHERE clause keeps only rows where the predicate is TRUE, so it drops both FALSE rows and UNKNOWN rows. This is three-valued logic in action. The fix is to write WHERE x <> 5 OR x IS NULL if you want to include rows where the column is unknown, or WHERE x IS DISTINCT FROM 5 in PostgreSQL, which treats NULL and 5 as distinct and returns true-boolean TRUE/FALSE without the UNKNOWN middle state.

2. "What is wrong with WHERE dept = 'Eng' OR dept = 'Res' AND salary > 100000? How would you fix it?"

Operator precedence. AND binds tighter than OR, so this parses as WHERE dept = 'Eng' OR (dept = 'Res' AND salary > 100000). That returns every engineering employee regardless of salary, plus research employees earning over 100k. If the intent was "engineering or research employees earning over 100k," the correct form is WHERE (dept = 'Eng' OR dept = 'Res') AND salary > 100000, or more idiomatically WHERE dept IN ('Eng', 'Res') AND salary > 100000. The root cause of this bug is missing parentheses; the mitigation is a team rule that any WHERE clause mixing AND and OR must explicitly parenthesize the OR group, regardless of whether precedence would give the same answer.

3. "Explain the NOT IN + NULL ambush. Why does WHERE dept NOT IN (SELECT dept FROM inactive_departments) sometimes return zero rows?"

NOT IN is logically equivalent to a chain of <> comparisons joined by AND. If the subquery returns, say, ('Finance', 'HR', NULL), the predicate expands to dept <> 'Finance' AND dept <> 'HR' AND dept <> NULL. The third comparison is always UNKNOWN, because nothing equals or does-not-equal NULL under three-valued logic. Since X AND UNKNOWN is at best UNKNOWN (never TRUE), no row can ever satisfy the predicate and the query returns zero rows. This is a silent, data-dependent bug -- the query works fine until someone inserts a NULL into the reference table, at which point a nightly report silently empties out. The robust fix is to use NOT EXISTS, which walks the subquery row by row and handles NULL correctly. Alternatively, filter NULLs out of the subquery explicitly, or enforce NOT NULL at the schema level so the bug cannot occur.

4. "What is the difference between BETWEEN a AND b and a pair of explicit comparisons, and when would you prefer one over the other?"

col BETWEEN a AND b is exactly equivalent to col >= a AND col <= b -- inclusive on both ends, with the same execution plan on every mainstream database. There is no performance difference. The readability argument favors BETWEEN for pure ranges on DATE or numeric columns because it reads naturally ("salary between 90k and 130k"). The argument against BETWEEN is that it tempts people to use it on TIMESTAMP columns, where BETWEEN '2023-01-01' AND '2023-01-31' silently drops any row from January 31st after midnight -- half a day of data gone. For timestamps, I always write the explicit half-open form ts >= '2023-01-01' AND ts < '2023-02-01', because "start of month inclusive, start of next month exclusive" is the canonical way to describe a calendar month and it handles all 28/29/30/31-day months uniformly.

5. "Walk me through three-valued logic. Why does SQL have UNKNOWN as a third truth value, and how does it affect the WHERE clause?"

SQL columns can be NULL, which represents the absence of a value rather than a specific value. To make comparisons with NULL well-defined, SQL extends boolean logic with a third truth value, UNKNOWN. Any comparison involving a NULL operand returns UNKNOWN: NULL = 5 is UNKNOWN, NULL <> 5 is UNKNOWN, NULL = NULL is UNKNOWN. The connectives AND, OR, NOT are extended accordingly -- TRUE AND UNKNOWN is UNKNOWN, TRUE OR UNKNOWN is TRUE (because TRUE short-circuits), FALSE AND UNKNOWN is FALSE (because FALSE short-circuits), and NOT UNKNOWN is UNKNOWN. The WHERE clause filters on TRUE only, so both FALSE rows and UNKNOWN rows are dropped. This asymmetry -- WHERE treating UNKNOWN like FALSE even though they are logically distinct -- is the source of almost every NULL-related SQL bug. The practical consequence is that negating a predicate over a nullable column (NOT (x = 5), x <> 5, x NOT IN (...)) does not include NULL rows; you must explicitly add OR x IS NULL if you want them. The alternative to three-valued logic would be to forbid NULL entirely (as Codd's original relational model preferred), but every major SQL engine ships three-valued logic, so you need to master it.


Quick Reference -- WHERE Clause Cheat Sheet

+---------------------------------------------------------------+
|           COMPARISON OPERATORS                                |
+---------------------------------------------------------------+
|                                                                |
|   =       equal                                                |
|   <>      not equal (SQL standard)                             |
|   !=      not equal (synonym, widely supported)                |
|   <       less than                                            |
|   >       greater than                                         |
|   <=      less than or equal                                   |
|   >=      greater than or equal                                |
|                                                                |
|   ANY comparison with NULL -> UNKNOWN -> dropped by WHERE      |
|                                                                |
+---------------------------------------------------------------+

+---------------------------------------------------------------+
|           BOOLEAN CONNECTIVES                                 |
+---------------------------------------------------------------+
|                                                                |
|   NOT   (highest precedence)                                   |
|   AND                                                          |
|   OR    (lowest precedence)                                    |
|                                                                |
|   Rule: always parenthesize OR groups when mixing with AND.    |
|                                                                |
+---------------------------------------------------------------+

+---------------------------------------------------------------+
|           SET AND RANGE PREDICATES                            |
+---------------------------------------------------------------+
|                                                                |
|   col IN (v1, v2, v3)          -- sugar for chained OR         |
|   col NOT IN (v1, v2, v3)      -- beware NULLs in the list     |
|   col BETWEEN a AND b          -- inclusive BOTH ends          |
|   col NOT BETWEEN a AND b      -- NOT (col BETWEEN a AND b)    |
|                                                                |
+---------------------------------------------------------------+

+---------------------------------------------------------------+
|           NULL PREDICATES                                     |
+---------------------------------------------------------------+
|                                                                |
|   col IS NULL                  -- true boolean TRUE/FALSE     |
|   col IS NOT NULL              -- true boolean TRUE/FALSE     |
|   col IS DISTINCT FROM x       -- NULL-safe inequality (PG)   |
|   col IS NOT DISTINCT FROM x   -- NULL-safe equality   (PG)   |
|                                                                |
|   NEVER use col = NULL -- always UNKNOWN, always dropped.      |
|                                                                |
+---------------------------------------------------------------+

+---------------------------------------------------------------+
|           THREE-VALUED LOGIC -- CONDENSED                     |
+---------------------------------------------------------------+
|                                                                |
|   AND | T F U        OR  | T F U        NOT | result          |
|   ----+------        ----+------        ----+-------          |
|    T  | T F U         T  | T T T         T  | F               |
|    F  | F F F         F  | T F U         F  | T               |
|    U  | U F U         U  | T U U         U  | U               |
|                                                                |
|   FALSE dominates AND.  TRUE dominates OR.  NOT U = U.         |
|                                                                |
+---------------------------------------------------------------+
ConcernWrong WayRight Way
Testing for NULLcol = NULLcol IS NULL
Mixing AND/ORa OR b AND c(a OR b) AND c
Many equality checksLong OR chaincol IN (v1, v2, v3)
NOT IN (subquery)Hope the subquery has no NULLsNOT EXISTS (...) or filter IS NOT NULL
Date range on TIMESTAMPBETWEEN '2023-01-01' AND '...31'ts >= '2023-01-01' AND ts < '2023-02-01'
"Everything except X"col <> 'X'col <> 'X' OR col IS NULL
Case-insensitive matchcol = 'Eng' (wrong data)LOWER(col) = 'eng' or CITEXT
Filter on outer join sidePredicate in WHEREPredicate in ON (keeps non-matches)

Prev: Lesson 3.4 -- Delete Data Next: Lesson 4.2 -- Pattern Matching LIKE


This is Lesson 4.1 of the Database Interview Prep Course -- 12 chapters, 58 lessons.

On this page