Stage 4 review
This page pulls Stage 4 together. The database now lives inside a running application - shared by many connections, exposed to untrusted input, cached, migrated, and operated. Work the scenarios before opening the answers, then take the quiz and keep the cheatsheet handy.
Go back to a lesson if a scenario stumps you: Access control, SQL injection, ORMs and migrations, Connections, Caching, Operations, Common data patterns.
Scenarios
Scenario 1 - Spot the injection and fix it
A search endpoint builds SQL like "SELECT * FROM products WHERE name LIKE '%" + term + "%'". Why is it dangerous, and what is the fix?
Show the answer
String concatenation lets the input break out of the string literal and inject SQL - e.g. term = "'; DROP TABLE products; --". The attacker now writes the query.
The fix is a parameterized query (prepared statement): send WHERE name LIKE ? with the value bound separately. The driver sends data and code on different channels, so input can never become SQL. For identifiers (table/column names), which cannot be parameters, use an allowlist. Add least-privilege so the app's role cannot drop tables anyway.
Scenario 2 - Size the connections
Your API runs 200 concurrent requests against a Postgres server with 8 cores. A junior sets the pool to 200 connections "to match." Why is that wrong, and what is the serverless twist?
Show the answer
A database does real work on CPU cores, not on idle connections. Hundreds of connections cause context-switching and memory overhead, making everything slower. A pool near (cores x a small factor) - often tens, not hundreds - usually beats a huge one. Requests queue briefly for a connection instead of overwhelming the server.
Serverless twist: many short-lived function instances each open their own connection and exhaust max_connections fast. Put an external pooler (PgBouncer, or a serverless driver like Neon's) between functions and the database so thousands of clients share a small pool.
Scenario 3 - Migrate without downtime
You must rename a column that a live, always-on service reads and writes. How do you do it without breaking running code?
Show the answer
Expand-contract (parallel change), never a single destructive rename:
- Expand - add the new column; backfill it; have the app write both old and new.
- Migrate reads - deploy code that reads the new column.
- Contract - once nothing reads the old column, stop writing it and drop it in a later migration.
Each step is backward-compatible, so old and new code run side by side during the rollout. Keep migrations reversible and test the rollback.
Scenario 4 - Pick the data pattern
For each need, name the pattern: (a) a feed that must page through millions of rows without skipping or repeating; (b) a "delete" that must remain auditable and recoverable; (c) a payment endpoint a client may retry on a network timeout.
Show the answer
- (a) Keyset (cursor) pagination - page by
WHERE id > last_seen ORDER BY id LIMIT n, notOFFSET. Offset re-reads and skips/duplicates rows as data shifts and gets slower deep in the list. - (b) Soft delete - set a
deleted_attimestamp and filter it out (a partial index keeps live-row queries fast). The row survives for audit and recovery. - (c) Idempotency key - the client sends a unique key; the server records it and returns the first result on retries (
INSERT ... ON CONFLICT DO NOTHINGon the key). The charge happens once.
Scenario 5 - What do you watch?
The service feels slow some afternoons. Name three things to look at first and one alert worth setting.
Show the answer
Look at: slow-query reports (pg_stat_statements ranked by total time), connection-pool saturation (are requests waiting for a connection?), and replication lag / cache hit ratio if you read from replicas or a cache. Check autovacuum / bloat if a table has degraded over time.
A good alert: p99 query latency (or API latency) crossing a threshold - it catches the tail users actually feel, unlike an average. Alert on symptoms users feel; dashboard the rest.
Comprehensive quiz
Stage 4 cumulative review
7 questions1The single most reliable defense against SQL injection is:
2Why is a smaller connection pool often faster than a very large one?
3An ORM lazily loads each order's customer in a loop over 100 orders, firing 101 queries. This is the:
4Which setting actually verifies you are talking to the real database server over TLS (not just encrypting)?
5In cache-aside, the hardest part in practice is:
6Offset pagination (LIMIT 20 OFFSET 100000) on a large, changing table is a poor choice because:
7RPO and RTO for a backup strategy mean, respectively:
Stage 4 cheatsheet
Open the Stage 4 cheatsheet
Security
- Always parameterize:
WHERE x = ?with bound values - never string-concatenate input. - Allowlist identifiers (table/column names can't be parameters).
- Least privilege: the app role can only do what the app needs (
GRANT/REVOKE, roles, RLS). - TLS:
sslmode=verify-fullto encrypt and verify identity. - Secrets in env vars / a secret manager (Vault, cloud) with rotation - never in source.
Connections
- Pool size near
(cores x small factor)- tens, not hundreds. - Serverless: front the DB with an external pooler / serverless driver.
Migrations
- Expand-contract: add + dual-write -> migrate reads -> drop old. Each step backward-compatible. Keep reversible.
Caching (cache-aside)
- Read cache; on miss read DB and populate. Write DB then invalidate.
- Hard part = invalidation; guard hot keys against stampedes.
Data patterns
- Keyset pagination:
WHERE id > :last ORDER BY id LIMIT n(not OFFSET). - Soft delete:
deleted_atcolumn + partial index; filter it out. - Idempotency: client key +
INSERT ... ON CONFLICT DO NOTHING; return first result on retry. - Audit: change-log table (trigger or app-level).
- Full-text search: Postgres
tsvector+ GIN, or SQLite FTS5; reach for a search engine at scale.
Operations
- Slow queries:
pg_stat_statementsby total time, thenEXPLAIN. - Health: autovacuum/bloat, replication lag, cache hit ratio.
- Backups: base backup + continuous WAL = PITR; set RPO/RTO; test restores.
- Alert on user-felt symptoms (p99 latency); dashboard the rest.
You can build, secure, and operate a relational database in a real app. Stage 5 - Beyond Relational asks the next question: when is a different kind of database the right tool?