LinkedIn Hook

"Your application made 50,000 round trips to the database to settle one batch of invoices. The same job, rewritten as a stored procedure, finished in 800 ms with a single call."

Stored procedures and functions are the most divisive feature in the SQL world. One camp swears by them — pushing logic into the database means fewer network round trips, transactional safety, and a single place to enforce business rules. The other camp warns that they trap business logic inside a vendor-specific language with no version control, no unit tests, and no observability. Both camps are right, and both are wrong, depending on the workload.

The confusion gets worse because the words "function" and "procedure" mean different things in different databases. In Postgres before version 11, there were only functions, and people wrote functions that did transactional work even though that was technically the wrong tool. In Postgres 11+, real CREATE PROCEDURE arrived with the ability to COMMIT and ROLLBACK mid-body — something a function still cannot do. MySQL has had both since 5.0, but with completely different syntax and a much weaker procedural language.

Then there is PL/pgSQL itself: a block-structured language with DECLARE, BEGIN, EXCEPTION, and END, loops, cursors, dynamic SQL, and exception handling. It looks like Ada by way of Oracle PL/SQL, and the first time you write one you will feel like you have time-traveled to 1995. But for the right job — bulk data movement, complex triggers, set-based ETL — nothing beats it.

In Lesson 12.2, I break down stored procedures and functions in PostgreSQL: CREATE FUNCTION vs CREATE PROCEDURE, parameter modes, RETURNS TABLE, transaction control, trigger functions, and the honest tradeoffs versus pushing logic to the application layer.

Read the full lesson -> [link]

#SQL #PostgreSQL #PLpgSQL #StoredProcedures #Database #BackendDevelopment #InterviewPrep

Stored Procedures and Functions thumbnail

What You'll Learn

The difference between a function and a procedure in PostgreSQL 11+ and why both exist
PL/pgSQL block structure: DECLARE, BEGIN, EXCEPTION, END
Parameter modes (IN, OUT, INOUT, VARIADIC) and how they shape the call signature
RETURNS scalar vs RETURNS TABLE vs RETURNS SETOF and when each fits
Why functions cannot COMMIT but procedures can — and what that unlocks
How trigger functions hook into BEFORE/AFTER INSERT/UPDATE/DELETE events
Honest pros and cons of database-side logic versus application-layer logic
How PostgreSQL stored routines differ from MySQL's

The Restaurant Kitchen Analogy — Cooking at the Table vs in the Kitchen

Imagine a restaurant where the customer orders a steak. There are two ways to prepare it. In the first, the waiter brings raw ingredients to the table, the customer cuts the meat, the waiter walks back for salt, walks back for the pan, walks back for the oil, then carries each step out to the kitchen for cooking. Every back-and-forth is a network trip. The customer waits, the waiter is exhausted, and the kitchen is barely involved.

In the second, the customer says "I want the steak medium rare with herb butter." The order goes to the kitchen on a single ticket. The chef — who already has every ingredient, every tool, and every pan within arm's reach — assembles the dish in seconds and sends it back finished. One trip out, one plate back. The customer never sees the work.

That is the difference between application-side logic and stored procedures. Your application is the customer at the table; the database is the kitchen. Every SQL query is a trip the waiter has to make. If your business logic needs to read 50,000 rows, transform them, and write them back, doing it row-by-row from the application means 50,000+ trips — each with network latency, query parsing, and result marshaling. Doing it inside a stored procedure means one call, and the database does all the work where the data already lives.

+---------------------------------------------------------------+
|           APPLICATION-SIDE LOOP (slow)                        |
+---------------------------------------------------------------+
|                                                                |
|  for invoice in invoices:                                      |
|      row  = db.query("SELECT ... WHERE id = ?", invoice.id)    |
|      calc = compute_total(row)                                 |
|      db.query("UPDATE ... SET total = ? WHERE id = ?", calc)   |
|                                                                |
|  Cost: 2 * N round trips (N = 50,000)                          |
|        ~ 50,000 * 2 ms latency = ~100 seconds                  |
|        Plus parse + plan + marshal per call                    |
|                                                                |
+---------------------------------------------------------------+

+---------------------------------------------------------------+
|           STORED PROCEDURE (fast)                             |
+---------------------------------------------------------------+
|                                                                |
|  CALL settle_invoices(p_batch_id => 42);                       |
|                                                                |
|  Cost: 1 round trip                                            |
|        Set-based UPDATE inside the procedure                   |
|        ~ 800 ms total for the same 50,000 rows                 |
|        Data never crosses the network                          |
|                                                                |
+---------------------------------------------------------------+

Napkin AI Visual Prompt: "Dark gradient (#0a0f1f -> #111a2e). Split comparison: LEFT side labeled 'App Loop' shows a stick-figure waiter sprinting between a customer table and a kitchen 50,000 times, with red latency arrows. RIGHT side labeled 'Stored Procedure' shows a single sky blue (#4fc3f7) ticket flying once into the kitchen, with a finished plate flying back. White monospace labels. Rose (#ff5c8a) timing badges: '100 s' on left, '800 ms' on right."

PL/pgSQL — The Procedural Language Inside Postgres

PL/pgSQL is the default procedural language for PostgreSQL. It is enabled out of the box (you do not need CREATE EXTENSION), it is compiled to an internal tree on first use of each session, and it lives entirely inside the database. You write it, the database stores it, and clients call it by name.

The bones of every PL/pgSQL routine are the same: an optional DECLARE for local variables, a mandatory BEGIN ... END block for the body, and an optional EXCEPTION clause that catches errors raised inside the block.

-- Skeleton of every PL/pgSQL block
DO $$
DECLARE
    -- Local variables and their types
    v_count   integer := 0;
    v_message text;
BEGIN
    -- Statements: SQL, control flow, assignments
    SELECT count(*) INTO v_count FROM users;
    v_message := 'There are ' || v_count || ' users';
    RAISE NOTICE '%', v_message;

EXCEPTION
    -- Optional error handler — catches any error raised inside BEGIN
    WHEN others THEN
        RAISE NOTICE 'Something went wrong: %', SQLERRM;
END;
$$ LANGUAGE plpgsql;

Sample output:

NOTICE:  There are 1284 users
DO

DO runs an anonymous block — useful for one-off scripts. The same body wrapped in CREATE FUNCTION or CREATE PROCEDURE becomes a stored, named, callable routine.

CREATE FUNCTION — Returning Values, Used in SELECT

A function in PostgreSQL is a routine that returns a value (scalar, row, set of rows, or table) and can be called inside a SELECT, WHERE, FROM, or any other expression context. Functions cannot manage transactions — they always run inside the caller's transaction. This is the most important distinction from procedures.

Example 1 — Scalar Function

-- A scalar function: takes a price and a tax rate, returns the gross price.
-- IMMUTABLE marks it as a pure function (same input -> same output, no side
-- effects). This lets the planner cache and inline calls in expressions.
CREATE OR REPLACE FUNCTION gross_price(
    p_net  numeric,
    p_rate numeric DEFAULT 0.20
)
RETURNS numeric
LANGUAGE plpgsql
IMMUTABLE
AS $$
BEGIN
    -- Round to 2 decimals so cents do not leak through
    RETURN round(p_net * (1 + p_rate), 2);
END;
$$;

-- Use it in a SELECT like any built-in function
SELECT
    product_id,
    name,
    net_price,
    gross_price(net_price, 0.21) AS gross_eur
FROM products
ORDER BY product_id
LIMIT 3;

Sample output:

 product_id |   name    | net_price | gross_eur
------------+-----------+-----------+-----------
          1 | Notebook  |     12.50 |     15.13
          2 | Pen       |      1.99 |      2.41
          3 | Stapler   |      8.75 |     10.59
(3 rows)

The function body is a single expression, but the value of the function is everywhere: filters, ORDER BY clauses, computed columns, and join conditions. Because it is IMMUTABLE, Postgres can use it in expression indexes and inline it during planning.

Example 2 — RETURNS TABLE

A function that returns a table acts like a parameterized view. It is the cleanest way to package a complex, reusable query.

-- Find the top N customers by total spend in a date range.
-- RETURNS TABLE(...) declares the output columns and types.
CREATE OR REPLACE FUNCTION top_customers(
    p_from   date,
    p_to     date,
    p_limit  integer DEFAULT 10
)
RETURNS TABLE (
    customer_id   bigint,
    customer_name text,
    order_count   integer,
    total_spend   numeric
)
LANGUAGE plpgsql
STABLE
AS $$
BEGIN
    -- RETURN QUERY streams the rows of an arbitrary SELECT as the
    -- function's result set. No intermediate buffering needed.
    RETURN QUERY
    SELECT
        c.id            AS customer_id,
        c.full_name     AS customer_name,
        count(o.id)::int AS order_count,
        sum(o.total)    AS total_spend
    FROM customers c
    JOIN orders    o ON o.customer_id = c.id
    WHERE o.placed_at >= p_from
      AND o.placed_at <  p_to
    GROUP BY c.id, c.full_name
    ORDER BY total_spend DESC
    LIMIT p_limit;
END;
$$;

-- Call it from a regular SELECT
SELECT *
FROM top_customers('2026-01-01', '2026-04-01', 5);

Sample output:

 customer_id | customer_name  | order_count | total_spend
-------------+----------------+-------------+-------------
        1042 | Aisha Khan     |          27 |     8421.50
         318 | Marco Rossi    |          19 |     6710.00
         907 | Linh Tran      |          22 |     6204.75
         255 | Diego Alvarez  |          14 |     5980.20
         611 | Sara Lindqvist |          18 |     5511.40
(5 rows)

STABLE tells the planner that within a single statement the function will return the same result for the same input — so it can be evaluated once per scan instead of per row. Use STABLE for read-only functions, IMMUTABLE for pure expressions, and VOLATILE (the default) for anything that writes or depends on time/random/sequence values.

Parameter Modes

Functions support four parameter modes:

+---------------------------------------------------------------+
|           PL/pgSQL PARAMETER MODES                            |
+---------------------------------------------------------------+
|                                                                |
|  IN        -> input only (default), read inside body           |
|  OUT       -> output only, becomes part of the return record   |
|  INOUT     -> input AND output                                 |
|  VARIADIC  -> last param, accepts a variable-length array      |
|                                                                |
|  Multiple OUT parameters are equivalent to RETURNS TABLE,      |
|  except the function is called with no return-shape clause.    |
|                                                                |
+---------------------------------------------------------------+

-- Multiple OUT parameters: a function that returns several values
-- without needing a composite type or RETURNS TABLE.
CREATE OR REPLACE FUNCTION order_summary(
    IN  p_order_id bigint,
    OUT total      numeric,
    OUT item_count integer,
    OUT status     text
)
LANGUAGE plpgsql
STABLE
AS $$
BEGIN
    -- Single SELECT populates all OUT params in one shot
    SELECT o.total, count(li.*)::int, o.status
      INTO total, item_count, status
      FROM orders o
      LEFT JOIN line_items li ON li.order_id = o.id
     WHERE o.id = p_order_id
     GROUP BY o.total, o.status;
END;
$$;

SELECT * FROM order_summary(8810);

Sample output:

 total  | item_count |  status
--------+------------+----------
 142.99 |          4 | shipped
(1 row)

CREATE PROCEDURE — Transaction Control, Called with CALL

PostgreSQL 11 introduced real procedures. The defining difference from a function is simple: a procedure can run COMMIT and ROLLBACK inside its body. A function cannot — it is glued to the transaction of whoever called it.

This unlocks a class of jobs that were previously impossible to express purely inside the database: long-running batch jobs that need to commit progress periodically so they do not hold a single giant transaction open for hours, locking rows and bloating the WAL.

-- A batch settlement procedure.
-- It walks invoices in chunks of 1,000, settling each chunk in its own
-- transaction. If chunk #57 fails, chunks 1..56 are already committed.
CREATE OR REPLACE PROCEDURE settle_invoices(p_batch_id bigint)
LANGUAGE plpgsql
AS $$
DECLARE
    v_chunk_size constant integer := 1000;
    v_processed  integer := 0;
    v_chunk      integer;
BEGIN
    LOOP
        -- Settle up to 1,000 invoices in this iteration
        WITH next_chunk AS (
            SELECT id
              FROM invoices
             WHERE batch_id = p_batch_id
               AND settled_at IS NULL
             ORDER BY id
             LIMIT v_chunk_size
             FOR UPDATE SKIP LOCKED
        )
        UPDATE invoices i
           SET settled_at = now(),
               total      = gross_price(i.net_total, i.tax_rate)
          FROM next_chunk nc
         WHERE i.id = nc.id;

        GET DIAGNOSTICS v_chunk = ROW_COUNT;
        EXIT WHEN v_chunk = 0;

        v_processed := v_processed + v_chunk;

        -- Commit the chunk so locks are released and the WAL flushes.
        -- This is illegal inside a function — only procedures can do it.
        COMMIT;

        RAISE NOTICE 'Settled chunk: % (running total: %)',
            v_chunk, v_processed;
    END LOOP;

    RAISE NOTICE 'Done. Total settled: %', v_processed;
END;
$$;

-- Procedures are invoked with CALL, not SELECT
CALL settle_invoices(42);

Sample output:

NOTICE:  Settled chunk: 1000 (running total: 1000)
NOTICE:  Settled chunk: 1000 (running total: 2000)
NOTICE:  Settled chunk: 1000 (running total: 3000)
NOTICE:  Settled chunk: 487  (running total: 3487)
NOTICE:  Done. Total settled: 3487
CALL

A few subtleties about transaction control inside procedures:

COMMIT and ROLLBACK are only legal at the top level of the procedure body. You cannot commit inside a BEGIN ... EXCEPTION ... END block, because exception handlers implicitly create a savepoint.
A procedure called from inside another procedure (or from a function) inherits the outer transaction and cannot commit on its own.
Procedures called from a normal CALL outside any transaction get to drive their own commits. Inside a BEGIN ... CALL ... COMMIT; block, the procedure cannot commit.

Trigger Functions — Hooking Into Row Changes

A trigger function is a normal PL/pgSQL function with RETURNS trigger and special variables (NEW, OLD, TG_OP, TG_TABLE_NAME) that PostgreSQL populates for each row event. Triggers attach a function to a table on INSERT, UPDATE, or DELETE, either BEFORE or AFTER the row is written.

-- Audit table that records every change to the products table
CREATE TABLE product_audit (
    id            bigserial PRIMARY KEY,
    product_id    bigint    NOT NULL,
    operation     text      NOT NULL,
    changed_by    text      NOT NULL,
    changed_at    timestamptz NOT NULL DEFAULT now(),
    old_row       jsonb,
    new_row       jsonb
);

-- The trigger function: runs once per row, sees NEW and OLD
CREATE OR REPLACE FUNCTION audit_product_changes()
RETURNS trigger
LANGUAGE plpgsql
AS $$
BEGIN
    -- TG_OP is one of 'INSERT', 'UPDATE', 'DELETE'
    INSERT INTO product_audit(product_id, operation, changed_by, old_row, new_row)
    VALUES (
        COALESCE(NEW.id, OLD.id),
        TG_OP,
        current_user,
        CASE WHEN TG_OP IN ('UPDATE', 'DELETE') THEN to_jsonb(OLD) END,
        CASE WHEN TG_OP IN ('INSERT', 'UPDATE') THEN to_jsonb(NEW) END
    );

    -- AFTER triggers can return NULL (the value is ignored).
    -- BEFORE triggers must return NEW (or a modified copy) to allow
    -- the write, or NULL to silently skip it.
    RETURN NULL;
END;
$$;

-- Wire the function up to the table
CREATE TRIGGER trg_audit_products
AFTER INSERT OR UPDATE OR DELETE ON products
FOR EACH ROW
EXECUTE FUNCTION audit_product_changes();

-- Try it
UPDATE products SET net_price = 13.49 WHERE id = 1;

SELECT operation, changed_by, old_row->>'net_price' AS old_price,
       new_row->>'net_price' AS new_price
  FROM product_audit
 WHERE product_id = 1
 ORDER BY id DESC
 LIMIT 1;

Sample output:

 operation | changed_by | old_price | new_price
-----------+------------+-----------+-----------
 UPDATE    | app_user   | 12.50     | 13.49
(1 row)

A few rules about trigger functions worth memorizing:

BEFORE triggers can mutate NEW to rewrite the row before it is stored. Useful for normalizing emails, computing derived columns, or enforcing invariants.
AFTER triggers see the final row but cannot change it. Use them for auditing, cascading writes, and notifications.
FOR EACH ROW fires once per affected row; FOR EACH STATEMENT fires once per SQL statement and does not see NEW/OLD (use the transition tables NEW TABLE AS new_rows instead).
Triggers run inside the same transaction as the statement that fired them. If the trigger raises an error, the whole statement rolls back.

Functions vs Procedures — The Defining Differences

+---------------------------------------------------------------+
|           FUNCTION vs PROCEDURE (PostgreSQL 11+)              |
+---------------------------------------------------------------+
|                                                                |
|  FUNCTION                                                      |
|   - Returns a value (scalar, table, set, void)                 |
|   - Called inside SELECT / WHERE / FROM expressions            |
|   - CANNOT run COMMIT / ROLLBACK                               |
|   - Volatility hints: IMMUTABLE / STABLE / VOLATILE            |
|   - Can be inlined by the planner (when SQL language)          |
|                                                                |
|  PROCEDURE                                                     |
|   - Returns nothing (use OUT params for outputs)               |
|   - Called with CALL, NOT inside SELECT                        |
|   - CAN run COMMIT and ROLLBACK in its body                    |
|   - No volatility classification                               |
|   - Designed for batch jobs and long-running ETL               |
|                                                                |
|  RULE: if you need transaction control, use a procedure.       |
|        Otherwise prefer a function — it composes in queries.   |
|                                                                |
+---------------------------------------------------------------+

The Honest Tradeoff — Database Logic vs Application Logic

This is the question every backend team eventually argues about. There is no universal answer, but the tradeoffs are well understood.

Pros of stored procedures and functions:

Network round trips collapse. A batch job that needs to read, transform, and write thousands of rows runs once instead of N times.
Set-based logic is faster. SQL is built for set operations; doing them inside the database avoids marshaling rows out to your application and back.
Single source of truth. Business rules implemented in the database apply to every client — Python services, Node workers, ad-hoc psql sessions, BI dashboards. No client can bypass the rule by calling raw SQL.
Transactional safety. Logic that lives next to the data can guarantee atomicity without distributed transaction protocols.
Permission boundaries. You can grant EXECUTE on a procedure without granting SELECT/UPDATE on the underlying tables. The procedure becomes the only legal interface.

Cons:

Vendor lock-in. PL/pgSQL does not run on MySQL. Oracle PL/SQL does not run on Postgres. Migrating engines means rewriting every routine.
Poor tooling. Version control, code review, unit testing, debugging, and observability are all weaker than for application code. Tools exist (pgTAP for tests, plpgsql_check for linting) but the developer experience is decades behind modern languages.
Hidden behavior. A trigger silently rewriting rows on INSERT is invisible to anyone reading the application code. New engineers spend hours hunting "magic" data changes.
Deploy coupling. Schema migrations and code migrations become the same migration. Rolling back a buggy procedure requires a database deploy, not a container restart.
Scaling. Application servers are easy to scale horizontally; the database is hard. Pushing CPU-heavy logic into the database concentrates load on your most expensive, least replicable tier.

The reasonable middle ground: keep request/response business logic in the application layer where it is testable, observable, and easy to refactor. Reach for stored procedures and functions for (1) bulk data jobs where round trips dominate cost, (2) invariants that must hold no matter who writes to the table (use CHECK constraints first, triggers second), and (3) complex queries that deserve to be packaged as RETURNS TABLE functions for reuse.

Note on MySQL

MySQL has had stored procedures and functions since 5.0, but the language is far less powerful than PL/pgSQL. Key differences:

MySQL uses DELIMITER // syntax to define multi-statement routines because the default ; ends the routine prematurely.
MySQL functions cannot easily return result sets; you need a procedure with SELECT to stream rows back.
MySQL has no RETURNS TABLE — return composite results via OUT params or via a procedure that just runs a SELECT.
Triggers cannot call other triggers, and there is no INSTEAD OF trigger on views.
No transaction control inside procedures — COMMIT and ROLLBACK are blocked the same way they are blocked inside Postgres functions.

If portability matters, write business logic in the application and keep database routines to the bare minimum.

Common Mistakes

1. Putting transactional logic in a function and then wondering why COMMIT is rejected. PL/pgSQL functions run inside the caller's transaction and cannot commit. If you need to commit between iterations of a loop, you need a PROCEDURE, not a FUNCTION. The error message — "invalid transaction termination" — does not make this obvious. Convert the routine to a procedure and call it with CALL instead of SELECT.

2. Using VOLATILE (the default) for read-only functions. If you forget to mark a read-only function STABLE or IMMUTABLE, the planner re-evaluates it for every row in a scan and refuses to use expression indexes built on it. A function over a million-row table can run a million times instead of once. Always classify volatility deliberately.

3. Writing row-by-row PL/pgSQL when a single set-based SQL would do. The most common antipattern is opening a cursor, looping over rows, and calling UPDATE ... WHERE id = current_row.id inside the loop. This loses every advantage of being inside the database. Prefer a single UPDATE ... FROM or INSERT ... SELECT that processes the whole set in one statement.

4. Trigger functions with side effects on other tables, with no documentation. A trigger that quietly inserts into an audit table is fine. A trigger that calls a webhook, updates three other tables, and writes to a queue is a debugging nightmare. Document every trigger in the table's CREATE statement comment, and resist the urge to chain triggers across many tables.

5. Catching WHEN others and swallowing errors. EXCEPTION WHEN others THEN NULL; looks like defensive coding but it hides every error — including ones you would absolutely want to know about, like constraint violations and lock timeouts. Catch only the specific exception classes you can actually handle (e.g. unique_violation), and re-raise everything else with RAISE.

Interview Questions

1. "What is the difference between a function and a procedure in PostgreSQL 11+, and when would you use each?"

A function returns a value (scalar, set, or table) and is called inside an expression context — typically a SELECT, WHERE, or FROM clause. It runs entirely inside the caller's transaction and cannot issue COMMIT or ROLLBACK. A procedure, introduced in Postgres 11, is called with CALL, returns nothing (or returns through OUT parameters), and can commit and roll back inside its own body. Use a function whenever you need a reusable expression — a parameterized view via RETURNS TABLE, a computed column, a domain calculation. Use a procedure for batch jobs that need to chunk work into multiple committed transactions, like settling thousands of invoices in groups of 1,000 so that a failure in batch 57 does not roll back batches 1 through 56. Functions also support volatility hints (IMMUTABLE/STABLE/VOLATILE) that let the planner optimize their use; procedures do not, because they are not meant to participate in query plans.

2. "Walk me through the PL/pgSQL block structure and explain what DECLARE, BEGIN, EXCEPTION, and END do."

A PL/pgSQL block has up to four parts. DECLARE is optional and lists local variables with their types and optional default values. BEGIN ... END is the mandatory body that contains executable statements — SQL, control flow (IF, LOOP, FOR), assignments, and RAISE. EXCEPTION is an optional clause inside the block that catches errors raised in the body; it works like a try/catch, with one or more WHEN <condition> arms and special variables SQLSTATE and SQLERRM for the error code and message. END closes the block. Critically, a block with an EXCEPTION clause implicitly creates a savepoint at BEGIN, so a caught exception rolls back changes made inside the block but leaves the surrounding transaction alive. This is also why you cannot run COMMIT inside an EXCEPTION block — the savepoint machinery would be inconsistent.

3. "When would you use RETURNS TABLE versus multiple OUT parameters versus RETURNS SETOF some_type?"

All three return more than a single scalar, but they fit different needs. RETURNS TABLE(col1 type1, col2 type2, ...) declares an ad-hoc row shape inline and is the cleanest choice when the function is essentially a parameterized query — the caller selects from it like a view: SELECT * FROM my_func(...). Multiple OUT parameters return exactly one record (or zero) and are best when the function is called for a single key and you want named outputs without defining a composite type. RETURNS SETOF some_existing_type is the right choice when the rows you return match a table or a domain composite type that already exists — it avoids redeclaring the column list. Internally RETURNS TABLE is just sugar for OUT parameters plus SETOF, but it documents intent more clearly and is the most readable form for new code.

4. "What is the case for keeping business logic in the application layer instead of in stored procedures?"

The strongest arguments are tooling, observability, and scaling. Application code lives in version control, has unit tests, has type checkers, has profilers, has APMs, and can be deployed independently of the database. Stored procedures have weaker versions of all of these — pgTAP for tests, plpgsql_check for static analysis, manual RAISE NOTICE for tracing — and a deploy of a procedure is a database migration, not a container restart, which means rollback is harder and slower. There is also a scaling argument: application servers are cheap and horizontal, while the database is the most expensive and least replicable tier in your stack. Pushing CPU-heavy work into stored procedures concentrates load on the one machine you can least afford to overload. The honest counterpoint is that bulk data jobs, invariants that must hold for all clients, and transactional sequences that demand atomicity are still best served inside the database. The pragmatic rule is "thin database for OLTP request paths, thick database for batch jobs and integrity rules."

5. "Explain how trigger functions work, the difference between BEFORE and AFTER, and when you would choose each."

A trigger function is a regular PL/pgSQL function with RETURNS trigger and access to special variables: NEW (the proposed row for INSERT/UPDATE), OLD (the existing row for UPDATE/DELETE), TG_OP (the operation), and TG_TABLE_NAME. You attach the function to a table with CREATE TRIGGER ... BEFORE|AFTER INSERT|UPDATE|DELETE ON table FOR EACH ROW EXECUTE FUNCTION. A BEFORE trigger runs before the row is written and can mutate NEW to rewrite the row, return NEW unchanged to allow the write, or return NULL to silently skip it — making it the right place for normalization, default computation, and validation. An AFTER trigger runs after the row is written and cannot change it, but it sees the final state of the row and is the right place for auditing, cascading inserts into other tables, and pg_notify events. Both fire inside the same transaction as the triggering statement, so an exception inside the trigger rolls back the whole statement. For statement-level triggers you use FOR EACH STATEMENT and access transition tables OLD TABLE/NEW TABLE instead of OLD/NEW.

Quick Reference — Cheat Sheet

+---------------------------------------------------------------+
|           STORED ROUTINES CHEAT SHEET                         |
+---------------------------------------------------------------+
|                                                                |
|  CREATE FUNCTION                                               |
|    CREATE OR REPLACE FUNCTION name(args)                       |
|    RETURNS type | TABLE(...) | SETOF type | trigger            |
|    LANGUAGE plpgsql                                            |
|    [ IMMUTABLE | STABLE | VOLATILE ]                           |
|    AS $$ ... $$;                                               |
|    Call: SELECT name(args);                                    |
|                                                                |
|  CREATE PROCEDURE                                              |
|    CREATE OR REPLACE PROCEDURE name(args)                      |
|    LANGUAGE plpgsql                                            |
|    AS $$ ... COMMIT; ... ROLLBACK; ... $$;                     |
|    Call: CALL name(args);                                      |
|                                                                |
|  TRIGGER                                                       |
|    CREATE TRIGGER trg_name                                     |
|    BEFORE|AFTER INSERT|UPDATE|DELETE ON table                  |
|    FOR EACH ROW|STATEMENT                                      |
|    EXECUTE FUNCTION trigger_func();                            |
|                                                                |
+---------------------------------------------------------------+

Feature	FUNCTION	PROCEDURE
Returns a value	Yes (scalar / table / set / void)	No (use `OUT` params)
Called via	`SELECT` / expression	`CALL`
`COMMIT` / `ROLLBACK` inside body	No	Yes (Postgres 11+)
Volatility hints	`IMMUTABLE` / `STABLE` / `VOLATILE`	None
Usable in `WHERE` / `FROM`	Yes	No
Usable as a trigger	Yes (`RETURNS trigger`)	No
Inlinable by planner	Yes (SQL-language functions)	No
Best for	Reusable expressions, parameterized queries, triggers	Long-running batch jobs, chunked ETL
MySQL equivalent	`CREATE FUNCTION` (weaker)	`CREATE PROCEDURE` (no `COMMIT` inside)

Concern	Wrong Way	Right Way
Read-only function	Default `VOLATILE`	Mark `STABLE` or `IMMUTABLE`
Batch settlement	Loop in app, 50k round trips	`PROCEDURE` with chunked `COMMIT`
Reusable query	Copy SQL into every service	`FUNCTION ... RETURNS TABLE`
Audit trail	Manual inserts in every UPDATE	`AFTER` trigger function
Default values	Compute in app, then insert	`BEFORE` trigger mutates `NEW`
Error handling	`WHEN others THEN NULL`	Catch specific classes, re-raise rest
Transaction control	Try to `COMMIT` inside a function	Convert to `PROCEDURE`
Portability	PL/pgSQL everywhere	Keep heavy logic in app, thin in DB

Prev: Lesson 12.1 -- Views and Materialized Views Next: Lesson 12.3 -- Performance Optimization

This is Lesson 12.2 of the Database Interview Prep Course -- 12 chapters, 58 lessons.

Stored Procedures and Functions

LinkedIn Hook

What You'll Learn

The Restaurant Kitchen Analogy — Cooking at the Table vs in the Kitchen

PL/pgSQL — The Procedural Language Inside Postgres

CREATE FUNCTION — Returning Values, Used in SELECT

Example 1 — Scalar Function

Example 2 — RETURNS TABLE

Parameter Modes

CREATE PROCEDURE — Transaction Control, Called with CALL

Trigger Functions — Hooking Into Row Changes

Functions vs Procedures — The Defining Differences

The Honest Tradeoff — Database Logic vs Application Logic

Note on MySQL

Common Mistakes

Interview Questions

Quick Reference — Cheat Sheet

On this page